1

Let's say I have a dataframe like this:

 v1 v2
 a  1
 a  1
 b  1
 b  2

I would like to groupby on v1 and make the count of each possible value in v2. So the result would be something like:

    v2    
     1  2
 v1
  a  2  0 
  b  1  1

I can do something like this

df.groupby("v1")\
.agg(
    {
    "v2": {
             "0": lambda x: sum(x==0),
             "1": lambda x: sum(x==1)
           }
    }
} 

But it's not really nice if the number of values is hight, or change! I've seen this post but couldn't make it working with my example.

Thanks for your help :)

2 Answers 2

1

The most efficient method is crosstab:

pd.crosstab(df['v1'], columns = df['v2'])

Result

v2  1  2
v1
a   2  0
b   1  1

Pandas crosstab documentation.

Sign up to request clarification or add additional context in comments.

Comments

1

If I'm not wrong,
You don't need agg function to obtain this result You just need to groupby v1 and v2. Unstack it.

v1 = 'a a b b'.split()
v2 = '1 1 1 2'.split()
import pandas as pd

df = pd.DataFrame({'v1': v1,
                   'v2': v2})

print(df)
g = df.groupby(['v1', 'v2'])
print(g.size().unstack())

This will return

v2    1    2
v1          
a   2.0  NaN
b   1.0  1.0

To fill the NaN.

print(g.size().unstack().fillna(0))
v2    1    2
v1          
a   2.0  0.0
b   1.0  1.0

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.