2

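I wish to explode my dataset in Python based on a specific column: each row should be repeated as many times as its stat value, and rows where stat is 0 should be kept once.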

Data

id  type    date    stat    energy
aa  ss      Q1 2022 3       10
aa  ss      Q2 2022 2       10
bb  uu      Q1 2022 1       15
bb  uu      Q2 2022 3       15
cc  ii      Q1 2022 0       0

Desired

id  type    date    stat    energy
aa  ss     Q1 2022  3       10
aa  ss     Q1 2022  3       10
aa  ss     Q1 2022  3       10
aa  ss     Q2 2022  2       10
aa  ss     Q2 2022  2       10
bb  uu     Q1 2022  1       15
bb  uu     Q2 2022  3       15
bb  uu     Q2 2022  3       15
bb  uu     Q2 2022  3       15
cc  ii     Q1 2022  0       0

Doing

df.explode(list['stat'])

Any suggestion is appreciated

3 Answers

1

Use df.index.repeat:

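import numpy as np

# Repeat each row `stat` times, but keep rows with stat == 0 once
repeats = np.where(df['stat'] == 0, 1, df['stat'])
# OR
repeats = df['stat'].clip(lower=1)

# Repeat the index labels, then reindex to materialize the repeated rows
out = df.reindex(df.index.repeat(repeats)).reset_index(drop=True)
print(out)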

# Output
   id type     date  stat  energy
0  aa   ss  Q1 2022     3      10
1  aa   ss  Q1 2022     3      10
2  aa   ss  Q1 2022     3      10
3  aa   ss  Q2 2022     2      10
4  aa   ss  Q2 2022     2      10
5  bb   uu  Q1 2022     1      15
6  bb   uu  Q2 2022     3      15
7  bb   uu  Q2 2022     3      15
8  bb   uu  Q2 2022     3      15
9  cc   ii  Q1 2022     0       0
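
To see why the repeat counts are clipped at 1, here is a minimal, self-contained sketch. It rebuilds the sample data from the question and compares repeating by the raw stat values with repeating by the clipped values. The variable names dropped and kept are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": ["aa", "aa", "bb", "bb", "cc"],
    "type": ["ss", "ss", "uu", "uu", "ii"],
    "date": ["Q1 2022", "Q2 2022", "Q1 2022", "Q2 2022", "Q1 2022"],
    "stat": [3, 2, 1, 3, 0],
    "energy": [10, 10, 15, 15, 0],
})

# Repeating by the raw stat values drops the 'cc' row (it is repeated zero times)
dropped = df.loc[df.index.repeat(df["stat"])].reset_index(drop=True)

# Clipping the repeat counts at 1 keeps every original row at least once
kept = df.loc[df.index.repeat(df["stat"].clip(lower=1))].reset_index(drop=True)

print(len(dropped), len(kept))  # 9 10
print(kept)
```

This sketch assumes the default unique index; with duplicate index labels, label-based selection would return extra rows.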

2 Comments

Sure, thank you. The original solution works; I was wondering what the edit changes? @corralien
Because if you repeat a row zero times, the row disappears, so I need to keep at least one instance (your row 'cc').
1

Another solution could be to turn each stat value into a list and explode it:

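# Replace each stat value x with a list of x copies of x (or a single [0] when x is 0)
df['stat'] = [[x]*x if x > 0 else [x] for x in df['stat']]
# explode emits one row per list element; note that stat ends up with object dtype
new = df.explode('stat')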
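
A variation on the same idea, sketched under the assumption that you would rather not overwrite stat: put the lists in a temporary helper column and explode that instead, so stat keeps its integer dtype. The column name _rep is hypothetical and chosen only for this example.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": ["aa", "aa", "bb", "bb", "cc"],
    "type": ["ss", "ss", "uu", "uu", "ii"],
    "date": ["Q1 2022", "Q2 2022", "Q1 2022", "Q2 2022", "Q1 2022"],
    "stat": [3, 2, 1, 3, 0],
    "energy": [10, 10, 15, 15, 0],
})

# One list per row with max(stat, 1) entries, stored in a helper column ("_rep" is a made-up name)
tmp = df.assign(_rep=[[x] * max(x, 1) for x in df["stat"].tolist()])

# Explode the helper column, then drop it and renumber the rows
new = tmp.explode("_rep").drop(columns="_rep").reset_index(drop=True)

print(new.dtypes)
print(new)
```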

1

A faster and neater way to do it is to use np.repeat:

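import numpy as np
import pandas as pd

m = df['stat'].ge(1)  # isolate rows to be duplicated
# Convert to a NumPy array, repeat each row `stat` times, and convert back to a DataFrame
dup = pd.DataFrame(np.repeat(df[m].values, df.loc[m, 'stat'], axis=0), columns=df.columns)
# Re-append the rows that needed no duplication (DataFrame.append was removed in pandas 2.0, so use pd.concat)
df1 = pd.concat([dup, df[~m]])
print(df1)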
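
One caveat with going through .values on a frame with mixed column types: the array has object dtype, so the rebuilt columns come back as object, and the zero-stat rows are moved to the end. A possible alternative, offered only as a sketch, is to apply np.repeat to the row positions and select with iloc, which should keep both the dtypes and the original row order. The names counts and positions are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": ["aa", "aa", "bb", "bb", "cc"],
    "type": ["ss", "ss", "uu", "uu", "ii"],
    "date": ["Q1 2022", "Q2 2022", "Q1 2022", "Q2 2022", "Q1 2022"],
    "stat": [3, 2, 1, 3, 0],
    "energy": [10, 10, 15, 15, 0],
})

# Repeat counts: stat, but at least 1 so zero-stat rows survive
counts = df["stat"].clip(lower=1).to_numpy()

# Repeat row positions instead of the values array, then select those rows by position
positions = np.repeat(np.arange(len(df)), counts)
out = df.iloc[positions].reset_index(drop=True)

print(out.dtypes)
print(out)
```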
