1

We can use .idxmax to get the maximum value of a dataframe­(df). My problem is that I have a df with several columns (more than 10), one of a column has identifiers of same value. I need to extract the identifiers with the maximum value:

>df

id  value
a   0
b   1
b   1
c   0
c   2
c   1

Now, this is what I'd want:

>df

id  value
a   0
b   1
c   2

I am trying to get it by using df.groupy(['id']), but it is a bit tricky:

df.groupby(["id"]).ix[df['value'].idxmax()]

Of course, that doesn't work. I fear that I am not on the right path, so I thought I'd ask you guys! Thanks!

1 Answer 1

5

Close! Groupby the id, then use the value column; return the max for each group.

In [14]: df.groupby('id')['value'].max()
Out[14]: 
id
a     0
b     1
c     2
Name: value, dtype: int64

Op wants to provide these locations back to the frame, just create a transform and assign.

In [17]: df['max'] = df.groupby('id')['value'].transform(lambda x: x.max())

In [18]: df
Out[18]: 
  id  value  max
0  a      0    0
1  b      1    1
2  b      1    1
3  c      0    2
4  c      2    2
5  c      1    2
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.