I have a dataframe like the one generated by the script below, which creates the dataframe "data".

Ideally I would like to generate a new dataframe that, for each id, contains one row for every number in the sequence 1 : value.

import pandas as pd

d = {'id': ['a', 'b', 'c'], 'value': [1, 2, 1]}
data = pd.DataFrame(data=d)
data

This means that the ideal output would be:

|------|---------|
|  id  |  value  |
|------|---------|
|   a  |  1      |
|   b  |  1      |
|   b  |  2      |
|   c  |  1      |
|------|---------|

1 Answer

Use Index.repeat to repeat each row as many times as its value column says, then overwrite value with a per-row counter from GroupBy.cumcount:

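# if the index is not the default RangeIndex, reset it first
# data = data.reset_index(drop=True)

# repeat each row `value` times (repeated rows keep their original index label)
df = data.loc[data.index.repeat(data['value'])]
# number the rows within each original index label: 1, 2, ...
df['value'] = df.groupby(level=0).cumcount() + 1
df = df.reset_index(drop=True)
print(df)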
  id  value
0  a      1
1  b      1
2  b      2
3  c      1
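
To see why this works, it may help to look at the intermediate pieces one at a time. The following is only a sketch using the same data as the question; the names repeated_index, repeated, counter and result are illustrative and not part of the original answer.

```python
import pandas as pd

data = pd.DataFrame({'id': ['a', 'b', 'c'], 'value': [1, 2, 1]})

# Each index label is repeated as many times as its 'value' says,
# so the labels become 0, 1, 1, 2.
repeated_index = data.index.repeat(data['value'])

# Selecting with the repeated labels duplicates the corresponding rows.
repeated = data.loc[repeated_index]

# Rows that came from the same original row share an index label, so
# grouping by the index (level=0) and counting numbers them 0, 1, ...
# Adding 1 turns that into the sequence 1, 2, ...
counter = repeated.groupby(level=0).cumcount() + 1

# Overwrite 'value' with the counter and restore a clean 0..n-1 index.
result = repeated.assign(value=counter).reset_index(drop=True)
print(result)
```

The counter restarts whenever the index label changes, which is why the original index has to identify each source row uniquely.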

An alternative that does the same in a single method chain with DataFrame.assign:

df = (data.loc[data.index.repeat(data['value'])]
          .assign(value=lambda x: x.groupby(level=0).cumcount() + 1)
          .reset_index(drop=True))
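
If this is needed in more than one place, the chain could be wrapped in a small function. This is a sketch under assumptions: expand_by_count and its count_col parameter are hypothetical names, and the function always resets the index first so that duplicate or non-default index labels cannot merge separate rows into one group.

```python
import pandas as pd


def expand_by_count(df, count_col='value'):
    """Hypothetical helper: one output row per integer in 1..df[count_col]."""
    # Guarantee unique 0..n-1 labels so every source row is its own group.
    base = df.reset_index(drop=True)
    return (base.loc[base.index.repeat(base[count_col])]
                .assign(**{count_col: lambda x: x.groupby(level=0).cumcount() + 1})
                .reset_index(drop=True))


# Same data as the question, but with a deliberately awkward index.
data = pd.DataFrame({'id': ['a', 'b', 'c'], 'value': [1, 2, 1]},
                    index=[5, 5, 7])
out = expand_by_count(data)
print(out)
```

Note that a row whose count is 0 is repeated zero times, so it disappears from the result.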