0

I have the following dataframe:

d = {'ID':['X1','Y1','Z3','X1','X1','Z3','L1','H5','H5','H5'],'Prob':[0,0,1,1,1,1,0,0,0,0]}
frame = pd.DataFrame(d)

I am trying to count the number of times a 0 or 1 occurs within the prob column and then create new columns indicating the individual count:


ID     zero_count    one_count
X1     1             2
Y1     1             0
Z3     0             2
L1     1             0
H5     3             0

I have tried the following but not managed it so far:


frame.groupby(['ID','prob'])['index'].count().reset_index(name='zero_count')

Any ideas would be great.

2 Answers 2

1

You need a crosstab:

out = (pd
   .crosstab(frame['ID'], frame['Prob'])
   .rename(columns={0: 'zero', 1: 'one'})
   .add_suffix('_count')
   .reset_index().rename_axis(columns=None)
)

variant with num2words for automatic conversion of numbers into words:

from num2words import num2words

out = (pd
   .crosstab(frame['ID'], frame['Prob'])
   .rename(columns=num2words)
   .add_suffix('_count')
   .reset_index().rename_axis(columns=None)
)

output:

   ID  zero_count  one_count
0  H5           3          0
1  L1           1          0
2  X1           1          2
3  Y1           1          0
4  Z3           0          2
Sign up to request clarification or add additional context in comments.

Comments

0

You could use the the function unstack to display your different groupby results in columns. Like this:

frame.groupby(['ID','Prob']).value_counts().unstack(level=1,fill_value=0)

Output:

Prob    0   1
ID      
H5      3   0
L1      1   0
X1      1   2
Y1      1   0
Z3      0   2

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.