6

Let's say I have a data frame called df:

x count 
d 2
e 3
f 2

The count column is the counter: it holds the number of times I want each row to repeat.

How would I expand df to get the following?

x count
d 2
d 2
e 3
e 3
e 3
f 2
f 2

I've already tried numpy.repeat(df, df.iloc['count']), but it raises an error.

1
  • What is d 2 e 3 f 2? Are they columns? Commented Jul 17, 2015 at 22:03

2 Answers

10

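You can use np.repeat() on the underlying array, passing the count column as the per-row repeat counts. The attempt in the question fails because .iloc only accepts integer positions, so df.iloc['count'] raises an error; select the column with df['count'] instead.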

import pandas as pd
import numpy as np

# your data
# ========================
df

   x  count
0  d      2
1  e      3
2  f      2

# processing
# ==================================
np.repeat(df.values, df['count'].values, axis=0)


array([['d', 2],
       ['d', 2],
       ['e', 3],
       ['e', 3],
       ['e', 3],
       ['f', 2],
       ['f', 2]], dtype=object)

pd.DataFrame(np.repeat(df.values, df['count'].values, axis=0), columns=['x', 'count'])

   x count
0  d     2
1  d     2
2  e     3
3  e     3
4  e     3
5  f     2
6  f     2
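
One caveat with the approach above: df.values on a frame with mixed column types yields an object array, so the rebuilt frame's count column is no longer an integer column. The following is a minimal, self-contained sketch of the same np.repeat technique that casts the columns back afterwards. The sample data mirrors the question, and the name `expanded` is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"x": ["d", "e", "f"], "count": [2, 3, 2]})

# Repeat each row of the underlying array `count` times along axis 0.
repeated = np.repeat(df.values, df["count"].values, axis=0)

# Rebuild the frame; the array is object-typed, so cast each column
# back to the dtype it had in the original frame.
expanded = pd.DataFrame(repeated, columns=df.columns).astype(df.dtypes.to_dict())

print(expanded)
```

Reusing df.columns instead of a hard-coded list keeps the sketch valid if the frame has more than two columns.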

3

You could use .loc with Index.repeat, which keeps all columns and their dtypes:

In [295]: df.loc[df.index.repeat(df['count'])].reset_index(drop=True)
Out[295]:
   x  count
0  d      2
1  d      2
2  e      3
3  e      3
4  e      3
5  f      2
6  f      2

Or, for a two-column frame like this one, you can use pd.Series.repeat:

In [278]: df.set_index('x')['count'].repeat(df['count']).reset_index()
Out[278]:
   x  count
0  d      2
1  d      2
2  e      3
3  e      3
4  e      3
5  f      2
6  f      2
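
For reference, the .loc variant can be wrapped in a small reusable function. This is a sketch rather than a definitive implementation: `expand_rows` and its `count_col` parameter are hypothetical names, and it assumes the frame's index is unique (with duplicate labels, .loc would return extra rows).

```python
import pandas as pd


def expand_rows(df, count_col="count"):
    """Repeat each row of df as many times as its value in count_col.

    Hypothetical helper; assumes df has a unique index and that
    count_col holds non-negative integers.
    """
    # Repeat each index label according to the row's count ...
    repeated_index = df.index.repeat(df[count_col])
    # ... then select the rows by label and renumber them from 0.
    return df.loc[repeated_index].reset_index(drop=True)


# Sample data from the question.
df = pd.DataFrame({"x": ["d", "e", "f"], "count": [2, 3, 2]})
expanded = expand_rows(df)
print(expanded)
```

Because the rows are selected from the original frame rather than rebuilt from an array, each column keeps its dtype.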
