I have a pandas dataframe of the form

    a           b
0   [a, b]      0
1   [c, d, e]   1

I have written a function to create a list of partial lists:

def partials(l):
    result = []
    for i, elem in enumerate(l):
        result.append(l[:i+1])
    return result

which, when applied to the series df['a'] and then exploded using df['a'].apply(partials).explode(), correctly gives:

0          [a]
0       [a, b]
1          [c]
1       [c, d]
1    [c, d, e]

However, this series is necessarily longer than the original. How can I apply this function in place to column a of my dataframe, so that column b repeats its value wherever the corresponding row of the original dataframe is 'exploded', like this:

            a     b
0          [a]    0
0       [a, b]    0
1          [c]    1
1       [c, d]    1
1    [c, d, e]    1

?

    You could do this with your original code and a slight modification to the explode step. Assign first: df['a'] = df['a'].apply(partials), then explode on column "a": df.explode('a'). Commented Jan 7, 2021 at 20:10
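
The suggestion in this comment can be sketched as follows. This is a minimal example, assuming the sample frame reconstructed from the question; the name `out` and the use of `assign` (to avoid overwriting `df`) are illustrative choices, not part of the original comment.

```python
import pandas as pd

def partials(l):
    # Prefixes of l: l[:1], l[:2], ..., l[:len(l)]
    return [l[:i + 1] for i in range(len(l))]

# Sample frame reconstructed from the question (assumed data)
df = pd.DataFrame({"a": [["a", "b"], ["c", "d", "e"]], "b": [0, 1]})

# Replace column "a" with its list of prefixes on a copy of the frame,
# then explode that column; the other columns are repeated per new row
out = df.assign(a=df["a"].apply(partials)).explode("a")
print(out)
```

DataFrame.explode keeps the original index labels, so rows that came from the same source row share an index value.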

1 Answer

You can explode the column on its own and then join the remaining columns back on the index:

(df['a'].apply(partials)
   .explode().to_frame()
   .join(df.drop('a', axis=1))
)

Output:

           a  b
0        [a]  0
0     [a, b]  0
1        [c]  1
1     [c, d]  1
1  [c, d, e]  1
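
For reference, a self-contained version of the join approach might look like the sketch below. It assumes the sample frame from the question; the names `exploded` and `result` are introduced here for illustration only.

```python
import pandas as pd

def partials(l):
    # Prefixes of l: l[:1], l[:2], ..., l[:len(l)]
    return [l[:i + 1] for i in range(len(l))]

# Sample frame reconstructed from the question (assumed data)
df = pd.DataFrame({"a": [["a", "b"], ["c", "d", "e"]], "b": [0, 1]})

# Explode the prefixes of column "a"; the original index labels are kept
exploded = df["a"].apply(partials).explode().to_frame()

# Join the remaining columns back by index, repeating them for each new row
result = exploded.join(df.drop("a", axis=1))
print(result)
```

The join aligns on the index, so this relies on df having a unique index; with duplicate index labels the join would pair rows that do not belong together.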