I have the following dataframe in pandas where there's a unique index (employee) for each row and also a group label type:

import pandas

df = pandas.DataFrame({"employee": ["a", "b", "c", "d"], "type": ["X", "Y", "Y", "Y"], "value": [10, 20, 30, 40]})
df = df.set_index("employee")

I want to group the employees by type and then calculate a statistic for each type. How can I do this and get a final dataframe indexed by type with one column per statistic, for example the mean of value for each type? I tried using groupby:

g = df.groupby(lambda x: df.ix[x]["type"])
result = g.mean()

This is inefficient, since it looks up df.ix[x] once for every row. Is there a better way?

Why not just use g = df.groupby("type")? Commented Aug 8, 2013 at 5:34

1 Answer

As @sza says, you can pass the column name to groupby directly:

In [11]: g = df.groupby("type")

In [12]: g.mean()
Out[12]:
      value
type
X        10
Y        30

See the groupby docs for more.
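
For the "type x statistic" table the question asks about, one possible approach is to select the column and pass a list of statistics to agg. This is only a sketch: the names means and stats, and the choice of mean, min and max, are illustrative assumptions rather than anything from the question.

```python
import pandas as pd

# Same data as in the question, indexed by employee.
df = pd.DataFrame(
    {
        "employee": ["a", "b", "c", "d"],
        "type": ["X", "Y", "Y", "Y"],
        "value": [10, 20, 30, 40],
    }
).set_index("employee")

# Group on the "type" column by name; no per-row index lookup is involved.
means = df.groupby("type")["value"].mean()

# Several statistics at once: rows are types, columns are statistics.
# The particular statistics here are chosen only for illustration.
stats = df.groupby("type")["value"].agg(["mean", "min", "max"])

print(means)
print(stats)
```

Grouping by column name lets pandas compute the group keys from the whole column in one pass, instead of calling a Python function for each index label.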
