Find maximum value of a column and return the corresponding row values using Pandas

Question

Country       Place  Value
US       NewYork     562
US       Michigan    854
US       Illinois    356
UK       London      778
UK       Manchester  512
Spain    Madrid      509
India    Mumbai      196
US       Kansas      894
UK       Liverpool   796
Spain    Barcelona   792

Using Pandas I am trying to find the Country and Place with the maximum value.

This returns the maximum value:

data.groupby(['Country','Place'])['Value'].max()

But how do I get the corresponding Country and Place name?

Does this answer your question? Find row where values for column is maximal in a pandas DataFrame
– Gonçalo Peres, Commented May 20, 2021 at 11:59

quantif · Accepted Answer · 2018-05-11 21:01:06Z

246

Assuming df has a unique index, this gives the row with the maximum value:

In [34]: df.loc[df['Value'].idxmax()]
Out[34]: 
Country        US
Place      Kansas
Value         894
Name: 7

Note that idxmax returns index labels. If the DataFrame has duplicates in the index, the label may not uniquely identify the row, and df.loc may return more than one row.

Therefore, if df does not have a unique index, you must make the index unique before proceeding as above. Depending on the DataFrame, sometimes you can use stack or set_index to make the index unique. Or, you can simply reset the index (so the rows become renumbered, starting at 0):

df = df.reset_index()
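A minimal sketch of the duplicate-label case described above. The three-row frame and its "a"/"b" index labels are made up for illustration, and drop=True is used so the old labels are discarded rather than kept as a column:

```python
import pandas as pd

# Hypothetical frame whose index labels repeat ("a" appears twice).
df = pd.DataFrame(
    {"Place": ["Kansas", "Michigan", "London"], "Value": [894, 854, 778]},
    index=["a", "a", "b"],
)

label = df["Value"].idxmax()   # label of the first maximum: "a"
ambiguous = df.loc[label]      # a DataFrame holding both rows labelled "a"

# Renumber the rows 0..n-1; drop=True discards the old labels.
unique = df.reset_index(drop=True)
row = unique.loc[unique["Value"].idxmax()]  # now a single row (a Series)
```

With the renumbered index, the label returned by idxmax identifies exactly one row.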

edited May 11, 2018 at 21:01

quantif

answered Apr 1, 2013 at 10:58

unutbu

2 Comments

starriet 차주녕 Over a year ago

Note this returns the first max row if there are multiple max values.

CodeCabbie Over a year ago

For future and others - note the words "Note that idxmax returns index labels."

eyllanesc · Accepted Answer · 2018-05-05 17:58:47Z

127

df[df['Value']==df['Value'].max()]

This returns every complete row whose Value equals the maximum.
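To illustrate how this differs from idxmax when the maximum occurs more than once, here is a small sketch. The three-row frame with a tie at 894 is hypothetical:

```python
import pandas as pd

# Hypothetical data with a tie: two rows share the maximum of 894.
df = pd.DataFrame({
    "Place": ["Kansas", "Michigan", "Liverpool"],
    "Value": [894, 854, 894],
})

# Boolean mask: keeps every row that ties for the maximum.
all_max = df[df["Value"] == df["Value"].max()]

# idxmax: keeps only the first row that reaches the maximum.
first_max = df.loc[df["Value"].idxmax()]
```

Which one is appropriate depends on whether ties should be reported or collapsed to one row.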

edited May 5, 2018 at 17:58

eyllanesc

answered Apr 30, 2018 at 17:07

Gaurav

2 Comments

penta Over a year ago

Explanation: the inner expression compares every value in the column with .max() and produces a boolean mask over the whole DataFrame. Indexing the DataFrame with that mask returns the complete rows where the comparison is True.

starriet 차주녕 Over a year ago

Worth noting that this returns all rows if there are multiple same max values.

lenz · Accepted Answer · 2019-01-29 21:32:45Z

21

I think the easiest way to return the row with the maximum value is by getting its position. In pandas 1.0 and later, argmax() returns the integer position of the row with the largest value (older versions returned the index label instead).

index = df.Value.argmax()

The position can then be used with iloc to get the features for that particular row:

df.iloc[index, 0:2]
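The distinction between positions and labels matters when the index is not 0..n-1. A short sketch, assuming a recent pandas where Series.argmax is positional; the index labels 10, 20, 30 are invented for the example:

```python
import pandas as pd

# Three rows from the question, with made-up non-default index labels.
df = pd.DataFrame(
    {
        "Country": ["US", "US", "UK"],
        "Place": ["Michigan", "Kansas", "London"],
        "Value": [854, 894, 778],
    },
    index=[10, 20, 30],
)

pos = df["Value"].argmax()     # integer position of the maximum
label = df["Value"].idxmax()   # index label of the maximum

by_pos = df.iloc[pos, 0:2]                        # position goes with iloc
by_label = df.loc[label, ["Country", "Place"]]    # label goes with loc
```

Both selections pick the same row here; mixing them up (a position passed to loc, or a label passed to iloc) would not.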

edited Jan 29, 2019 at 21:32

lenz

answered May 9, 2018 at 10:48

sharad kakran

HYRY · Accepted Answer · 2013-04-01 11:04:40Z

13

The country and place form the index of the resulting Series. If you don't want them as the index, set as_index=False:

df.groupby(['Country','Place'], as_index=False)['Value'].max()

Edit:

It seems that you want the place with the maximum value for every country. The following code does that (iloc replaces irow, which has been removed from pandas):

df.groupby('Country').apply(lambda g: g.iloc[g['Value'].argmax()])
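As an alternative sketch of the per-country maximum, the groupby can be combined with idxmax and loc, which avoids apply. This is a different technique from the one in the answer, shown on the question's data with its capitalised column names and a default 0..9 index:

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["US", "US", "US", "UK", "UK", "Spain", "India", "US", "UK", "Spain"],
    "Place": ["NewYork", "Michigan", "Illinois", "London", "Manchester",
              "Madrid", "Mumbai", "Kansas", "Liverpool", "Barcelona"],
    "Value": [562, 854, 356, 778, 512, 509, 196, 894, 796, 792],
})

# idxmax per group gives one index label per country;
# loc then pulls the full rows for those labels.
best_per_country = df.loc[df.groupby("Country")["Value"].idxmax()]
print(best_per_country)
```

This relies on the index being unique, for the same reason as any other idxmax-based lookup.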

edited Apr 1, 2013 at 11:04

answered Apr 1, 2013 at 10:50

HYRY

1 Comment

richie Over a year ago

that would only return the column names and the dtypes

Morez · Accepted Answer · 2020-09-29 14:07:26Z

10

You can use:

print(df[df['Value']==df['Value'].max()])

edited Sep 29, 2020 at 14:07

Morez

answered Feb 16, 2020 at 15:01

kelvinkahuro

Erfan · Accepted Answer · 2021-03-10 12:23:39Z

10

Using `DataFrame.nlargest`.

The dedicated method for this is nlargest, which uses algorithm.SelectNFrame in the background and is a more performant way of doing sort_values(ascending=False).head(n):

   x  y  a  b
0  1  2  a  x
1  2  4  b  x
2  3  6  c  y
3  4  1  a  z
4  5  2  b  z
5  6  3  c  z

df.nlargest(1, 'y')

   x  y  a  b
2  3  6  c  y
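Applied to the data from the question rather than the x/y/a/b frame above, a sketch might look like this. The variable names are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["US", "US", "US", "UK", "UK", "Spain", "India", "US", "UK", "Spain"],
    "Place": ["NewYork", "Michigan", "Illinois", "London", "Manchester",
              "Madrid", "Mumbai", "Kansas", "Liverpool", "Barcelona"],
    "Value": [562, 854, 356, 778, 512, 509, 196, 894, 796, 792],
})

top1 = df.nlargest(1, "Value")                  # the single row with the largest Value
top3 = df.nlargest(3, "Value")                  # the three largest, in descending order
top1_ties = df.nlargest(1, "Value", keep="all") # keep="all" also returns ties, if any
print(top1)
```

nlargest always returns a DataFrame, even for n=1, so the row comes back with its original index label.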

edited Mar 10, 2021 at 12:23

answered Mar 10, 2021 at 12:18

Erfan

waitingkuo · Accepted Answer · 2013-04-01 11:02:09Z

9

Use the index attribute of the grouped result. Note that I didn't type all the rows in the example.

In [14]: df = data.groupby(['Country','Place'])['Value'].max()

In [15]: df.index
Out[15]: 
MultiIndex
[Spain  Manchester, UK     London    , US     Mchigan   ,        NewYork   ]

In [16]: df.index[0]
Out[16]: ('Spain', 'Manchester')

In [17]: df.index[1]
Out[17]: ('UK', 'London')

You can also get the value by that index:

In [21]: for index in df.index:
   ....:     print(index, df[index])
   ....:
('Spain', 'Manchester') 512
('UK', 'London') 778
('US', 'Mchigan') 854
('US', 'NewYork') 562

Edit

Sorry for misunderstanding what you want. Try the following:

In [52]: s=data.max()

In [53]: print('%s, %s, %s' % (s['Country'], s['Place'], s['Value']))
US, NewYork, 854
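One caveat worth sketching: DataFrame.max() takes the maximum of each column independently, so the three values it returns need not come from the same row. On the full data from the question (assuming a pandas version where max() works on string columns):

```python
import pandas as pd

data = pd.DataFrame({
    "Country": ["US", "US", "US", "UK", "UK", "Spain", "India", "US", "UK", "Spain"],
    "Place": ["NewYork", "Michigan", "Illinois", "London", "Manchester",
              "Madrid", "Mumbai", "Kansas", "Liverpool", "Barcelona"],
    "Value": [562, 854, 356, 778, 512, 509, 196, 894, 796, 792],
})

s = data.max()  # column-wise: alphabetical max for strings, numeric max for Value

# Check whether any actual row pairs that Place with that Value.
matching_rows = data[(data["Place"] == s["Place"]) & (data["Value"] == s["Value"])]
print(s["Country"], s["Place"], s["Value"], "| rows matching:", len(matching_rows))
```

Here the largest Place alphabetically is NewYork while the largest Value belongs to Kansas, so no row matches the combination.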

edited Apr 1, 2013 at 11:02

answered Apr 1, 2013 at 10:44

waitingkuo

2 Comments

richie Over a year ago

correct. But I'm looking for a one line output that says, 'US, Kansas, 894'

richie Over a year ago

Thanks. This would solve the problem for the current dataset where there is just 1 column with values. When there are more columns with values @unutbu's solution would work better. Thanks anyway.

Connor · Accepted Answer · 2020-03-25 08:36:21Z

9

In order to print the Country and Place with maximum value, use the following line of code.

print(df[['Country', 'Place']][df.Value == df.Value.max()])
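The same selection can also be written as a single loc call with a row mask and a column list, which avoids chaining two indexing operations. A sketch on the question's data, including one way to build a one-line 'Country, Place, Value' string:

```python
import pandas as pd

df = pd.DataFrame({
    "Country": ["US", "US", "US", "UK", "UK", "Spain", "India", "US", "UK", "Spain"],
    "Place": ["NewYork", "Michigan", "Illinois", "London", "Manchester",
              "Madrid", "Mumbai", "Kansas", "Liverpool", "Barcelona"],
    "Value": [562, 854, 356, 778, 512, 509, 196, 894, 796, 792],
})

# Rows where Value is maximal, restricted to the two name columns.
mask = df["Value"] == df["Value"].max()
names = df.loc[mask, ["Country", "Place"]]

# A single formatted line for the first maximal row.
row = df.loc[df["Value"].idxmax()]
line = f"{row['Country']}, {row['Place']}, {row['Value']}"
print(line)
```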

edited Mar 25, 2020 at 8:36

Connor

answered Feb 20, 2018 at 6:53

Arpit Sharma

groenhen · Accepted Answer · 2020-03-23 07:47:25Z

3

Import pandas; df is the DataFrame you create.

Use the command:

df1=df[['Country','Place']][df.Value == df['Value'].max()]

df1 then holds the country and place whose value is the maximum.

edited Mar 23, 2020 at 7:47

groenhen

answered Mar 23, 2020 at 7:22

raksha

Marcin Lentner · Accepted Answer · 2019-01-14 21:12:12Z

2

My solution for finding the rows that hold the maximum of each column:

df.loc[df.idxmax()]

And for the minimum:

df.loc[df.idxmin()]
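A small sketch of this per-column lookup using loc, since ix was removed in pandas 1.0. The two-column numeric frame is hypothetical; on a frame with string columns such as Country and Place, idxmax would first need to be restricted to the numeric columns:

```python
import pandas as pd

# Hypothetical all-numeric frame.
df = pd.DataFrame({"a": [1, 5, 3], "b": [9, 2, 4]})

# idxmax/idxmin return one index label per column;
# loc then returns one full row per column.
rows_max = df.loc[df.idxmax()]
rows_min = df.loc[df.idxmin()]
```

The result has one row per column of df, in column order, so the same row can appear more than once.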

answered Jan 14, 2019 at 21:12

Marcin Lentner

saran3h · Accepted Answer · 2019-05-26 05:47:22Z

2

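I'd recommend using nlargest for better performance and shorter code. Note that the line below returns the most frequent value in the column together with its count; for the row holding the largest value, use df.nlargest(1, col_name).

import pandas

df[col_name].value_counts().nlargest(n=1)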

answered May 26, 2019 at 5:47

saran3h

Jefferson Sankara · Accepted Answer · 2019-11-29 04:16:48Z

0

I encountered a similar error while trying to import data using pandas. The first column of my dataset had spaces before the start of the words. I removed the spaces and it worked like a charm!

answered Nov 29, 2019 at 4:16

Jefferson Sankara
