Pandas - Count occurences in a column

Question

I have this file with 19 columns of mixed dtypes. One of the column Names contain elements which are separated by space. For example:

Col1     Col2      
adress1   x
adress2   a b
adress3   x c
adress4   a x d

What I want to do is go over Col2 and find out how many times each element occurs and put the result in a new column along with its corresponding in Col1

Note the above columns were already processed as a Dataframe. I have this which somewhat give me the results but not what I want ultimately.

new_df = pd.Dataframe(old_df.Col2.str.split(' ').tolist(), index=old_df.Col1).stack

How do I put the results in a new column (replacing Col2) and also have the remaining columnS?

Something like:

Col1     Col2      Col3
adress1   x        something
adress2   a        something1 
adress2   b        something1
adress3   x        NaN
adress3   c        NaN

Also calculate occurrence of items in Col2?

BENY · Accepted Answer · 2019-11-21 15:51:04Z

1

We can do split first then do explode

s=df.assign(Col2=df.Col2.str.split()).explode('Col2')
s=s.groupby(['Col1','Col2']).size().to_frame('count').reset_index()
Out[48]: 
      Col1 Col2  count
0  adress1    x      1
1  adress2    a      1
2  adress2    b      1
3  adress3    c      1
4  adress3    x      1
5  adress4    a      1
6  adress4    d      1
7  adress4    x      1

answered Nov 21, 2019 at 15:51

BENY

324k22 gold badges176 silver badges250 bronze badges

Sign up to request clarification or add additional context in comments.

1 Comment

Rookie Over a year ago

This is very neat. Works like a treat. Thank you.

Collectives™ on Stack Overflow

Pandas - Count occurences in a column

1 Answer 1

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related