Repeating other column values when using pandas.Series.explode()

Question

I have a pandas dataframe of the form

    a           b
0   [a, b]      0
1   [c, d, e]   1

I have written a function to create a list of partial lists:

def partials(l):
    result = []
    for i, elem in enumerate(l):
        result.append(l[:i+1])
    return result

which, when applied to the series df['a'] and then exploded, using df['a'].apply(partials).explode(), correctly gives:

0          [a]
0       [a, b]
1          [c]
1       [c, d]
1    [c, d, e]

However, this series is necessarily longer than the original. How can I apply this function to column a of my dataframe so that column b repeats its value for every row produced from the same original row? The desired result is:

            a     b
0          [a]    0
0       [a, b]    0
1          [c]    1
1       [c, d]    1
1    [c, d, e]    1

You could keep your original code with a slight change to the explode step. Assign first: df['a'] = df['a'].apply(partials), then explode on column "a": df.explode('a')
– Emma, Commented Jan 7, 2021 at 20:10
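
For reference, here is a minimal sketch of the approach in the comment above. It rebuilds the example frame from the question, and it uses df.assign rather than in-place assignment (my own choice, not part of the comment) so the original frame is left unchanged. The name `out` is only illustrative.

```python
import pandas as pd

def partials(l):
    # Build every prefix of the list: l[:1], l[:2], ..., l
    return [l[:i + 1] for i in range(len(l))]

# Example frame from the question
df = pd.DataFrame({"a": [["a", "b"], ["c", "d", "e"]], "b": [0, 1]})

# Replace column a with its list of partial lists, then explode that column.
# DataFrame.explode repeats the other columns (and the index) for each new row.
out = df.assign(a=df["a"].apply(partials)).explode("a")
print(out)
```

Because DataFrame.explode works on the whole frame, column b is carried along without a separate join step.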

Quang Hoang · Accepted Answer · 2021-01-07 20:05:09Z

You can explode the series and then join the remaining columns back on the index:

(df['a'].apply(partials)
   .explode().to_frame()
   .join(df.drop('a', axis=1))
)

Output:

           a  b
0        [a]  0
0     [a, b]  0
1        [c]  1
1     [c, d]  1
1  [c, d, e]  1
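
A self-contained version of the snippet above, as a sketch that assumes the example frame from the question. The names `exploded` and `joined` are only illustrative.

```python
import pandas as pd

def partials(l):
    # Build every prefix of the list: l[:1], l[:2], ..., l
    return [l[:i + 1] for i in range(len(l))]

# Example frame from the question
df = pd.DataFrame({"a": [["a", "b"], ["c", "d", "e"]], "b": [0, 1]})

# Explode the series; the original index label is repeated for each new row
exploded = df["a"].apply(partials).explode().to_frame()

# Join the other columns back on the index, so b is repeated per exploded row
joined = exploded.join(df.drop(columns="a"))
print(joined)
```

The join matches rows by index label, so this relies on df having a unique index. If the index contains duplicate labels, the join would pair every exploded row with every original row sharing that label; calling df.reset_index(drop=True) first avoids that.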
