Creating aggregation with lambda in loop

Question

Let's say I have a dataframe like this:

I would like to group by v1 and count the occurrences of each possible value in v2. The result would be something like:

I can do something like this:

# Note: nested-dict renaming in .agg() was deprecated in pandas 0.20 and removed in 1.0.
df.groupby("v1")\
.agg(
    {
    "v2": {
             "0": lambda x: sum(x==0),
             "1": lambda x: sum(x==1)
           }
    }
)

But this doesn't scale well if the number of distinct values is high, or if the values change! I've seen this post but couldn't get it to work with my example.

Thanks for your help :)

willk · Accepted Answer · 2018-11-20 17:29:03Z

1

The most direct method is pd.crosstab, which counts every combination of v1 and v2:

pd.crosstab(df['v1'], columns=df['v2'])

Result

Pandas crosstab documentation.
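Since the question's dataframe is not shown, here is a minimal sketch of the crosstab call on made-up data. The frame below is an assumption for illustration only; the column names v1 and v2 come from the question.

```python
import pandas as pd

# Hypothetical sample data; the question's original frame is not shown.
df = pd.DataFrame({
    "v1": ["a", "a", "b", "b", "b"],
    "v2": [0, 1, 1, 1, 2],
})

# One row per v1 value, one column per distinct v2 value, cells hold counts.
counts = pd.crosstab(df["v1"], columns=df["v2"])
print(counts)
```

Combinations that never occur show up as 0, and new values in v2 get their own column without any change to the code.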

edited Nov 20, 2018 at 17:29

answered Nov 20, 2018 at 17:19


kaihami · Accepted Answer · 2018-11-20 17:19:34Z

1

If I'm not mistaken, you don't need agg to get this result. Just group by both v1 and v2, take the size of each group, then unstack v2 into columns.

import pandas as pd

v1 = 'a a b b'.split()
v2 = '1 1 1 2'.split()

df = pd.DataFrame({'v1': v1,
                   'v2': v2})

print(df)
# Count the rows in each (v1, v2) group, then pivot v2 into columns
g = df.groupby(['v1', 'v2'])
print(g.size().unstack())

The last print returns:

v2    1    2
v1          
a   2.0  NaN
b   1.0  1.0

To fill the NaN values with 0:

print(g.size().unstack().fillna(0))
v2    1    2
v1          
a   2.0  0.0
b   1.0  1.0
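As a possible variation on the above (a sketch, not part of the original answer), unstack accepts a fill_value argument, which fills the missing combinations in the same step and should keep the counts as integers rather than floats. The data is the same toy frame used in this answer.

```python
import pandas as pd

df = pd.DataFrame({"v1": "a a b b".split(),
                   "v2": "1 1 1 2".split()})

# fill_value=0 replaces missing (v1, v2) combinations during the unstack,
# so no NaN is introduced and the counts are not upcast to float.
counts = df.groupby(["v1", "v2"]).size().unstack(fill_value=0)
print(counts)
```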

answered Nov 20, 2018 at 17:19
