Find all indices of maximum in Pandas DataFrame

Question

I need to find all indices where the maximum value (per row) is obtained in a Pandas DataFrame. For instance, if I have a DataFrame like this:

   cat1  cat2  cat3
0     0     2     2
1     3     0     1
2     1     1     0

then the method I am looking for would yield a result like:

[['cat2', 'cat3'],
 ['cat1'],
 ['cat1', 'cat2']]

This is a list of lists, but some other data structure is also okay.

I cannot use df.idxmax(axis=1), because it only yields the first maximum.

unutbu · Accepted Answer · 2014-02-07 20:16:25Z

4

Here is the information, in a different data structure:

In [8]: df = pd.DataFrame({'cat1':[0,3,1], 'cat2':[2,0,1], 'cat3':[2,1,0]})

In [9]: df
Out[9]: 
   cat1  cat2  cat3
0     0     2     2
1     3     0     1
2     1     1     0

[3 rows x 3 columns]

In [10]: rowmax = df.max(axis=1)

The max values are indicated by True values:

In [82]: df.values == rowmax.values[:,None]
Out[82]: 
array([[False,  True,  True],
       [ True, False, False],
       [ True,  True, False]], dtype=bool)

np.where returns the positions where the boolean array above is True.

In [84]: np.where(df.values == rowmax.values[:,None])
Out[84]: (array([0, 0, 1, 2, 2]), array([1, 2, 0, 0, 1]))

The first array indicates index values for axis=0, the second array for axis=1. There are 5 values in each array since there are five locations that are True.
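The two steps above can be put together in one self-contained script. This is only a sketch: it assumes a recent pandas, where indexing a Series with [:, None] may be rejected, so the row maxima are converted to a NumPy array with to_numpy() first. The variable names mask, rows, cols and pairs are chosen here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'cat1': [0, 3, 1], 'cat2': [2, 0, 1], 'cat3': [2, 1, 0]})

# Row-wise maxima as a column vector of shape (3, 1), so that the
# comparison below broadcasts across the columns of each row.
rowmax = df.max(axis=1).to_numpy()[:, None]

# Boolean array: True wherever a cell equals the maximum of its row.
mask = df.to_numpy() == rowmax

# Row and column positions of every True cell, in row-major order.
rows, cols = np.where(mask)

# Translate positions back into (row label, column label) pairs.
pairs = list(zip(df.index[rows], df.columns[cols]))
print(pairs)
```

Each pair names one location of a row maximum, so a row with a tie appears more than once.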

You could use itertools.groupby to build the list of lists you posted, though perhaps you don't need this given the data structures above:

In [46]: import itertools as IT

In [47]: import operator

In [48]: idx = np.where(df.values == rowmax.values[:,None])

In [49]: groups = IT.groupby(zip(*idx), key=operator.itemgetter(0))

In [50]: [[df.columns[j] for i, j in grp] for k, grp in groups]
Out[50]: [['cat2', 'cat3'], ['cat1'], ['cat1', 'cat2']]
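For reference, the groupby step can be wrapped into a single function. This is a sketch rather than a definitive implementation: the name max_columns_per_row is made up for illustration, and it works on the NumPy array from to_numpy() instead of indexing the Series directly.

```python
import itertools
import operator

import numpy as np
import pandas as pd


def max_columns_per_row(df):
    """Return, for each row, the list of column labels holding the row maximum."""
    values = df.to_numpy()
    rows, cols = np.where(values == values.max(axis=1)[:, None])
    # np.where yields positions in row-major order, so equal row numbers
    # are adjacent, which is what itertools.groupby relies on.
    groups = itertools.groupby(zip(rows, cols), key=operator.itemgetter(0))
    return [[df.columns[j] for _, j in grp] for _, grp in groups]


df = pd.DataFrame({'cat1': [0, 3, 1], 'cat2': [2, 0, 1], 'cat3': [2, 1, 0]})
print(max_columns_per_row(df))
```

One caveat: a row whose maximum is NaN produces no True cell, so it would be missing from the result instead of appearing as an empty list.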

edited Feb 7, 2014 at 20:16

answered Feb 7, 2014 at 12:49

unutbu


6 Comments

Jeff Over a year ago

df==DataFrame(np.tile(rowmax,len(df)).reshape(df.shape).T,index=df.index,columns=df.columns) will get your boolean frame (kind of like a broadcasted comparison operator); faster, but prob not more clear than the apply

unutbu Over a year ago

@Jeff: Good idea. df.values == rowmax.values[:,None] is about 10x faster still.

RafG Over a year ago

Thanks a lot! Unless I'm mistaken, your last line should read [[df.columns[j] for i, j in grp] for k, grp in groups] no?

Jeff Over a year ago

@unutbu actually I think there is an open issue to make broadcastable comparisons (e.g. like div, mul and such). Can you link this to it (or create an issue if there isn't one)? Thanks

Jeff Over a year ago

Great, thanks! Interested in doing a PR? (It's not that involved!)
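As a sketch of the broadcast comparison discussed in these comments: in pandas versions that provide DataFrame.eq with an axis argument, the boolean frame can be built without tiling rowmax by hand. The variable name mask is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'cat1': [0, 3, 1], 'cat2': [2, 0, 1], 'cat3': [2, 1, 0]})

# Compare every column against the row maxima. axis=0 aligns the
# Series of maxima with the row index instead of the column labels.
mask = df.eq(df.max(axis=1), axis=0)
print(mask)
```

The result is a DataFrame of booleans that keeps the original index and columns, unlike the plain NumPy comparison.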


Zero · Answer · 2017-10-15 (edited by Anton vBR, 2018-05-28 21:43:43Z)

1

You could do

In [2560]: cols = df.columns.values

In [2561]: vals = df.values

In [2562]: [cols[v].tolist() for v in vals == vals.max(1)[:, None]]
Out[2562]: [['cat2', 'cat3'], 
            ['cat1'], 
            ['cat1', 'cat2']]

Update

Here is a full example:

import pandas as pd
import numpy as np

np.random.seed(400)

df = pd.DataFrame({
    'a': np.random.randint(0,3,size=10), 
    'b': np.random.randint(0,3,size=10), 
    'c': np.random.randint(0,5,size=10),
})

print(df)

out = [df.columns[i].tolist() for i in df.values == df.max(axis=1).values[:,None]]

for i in out:
    print(i)

print(df) shows the randomly generated frame (its output is not reproduced here). The loop over out prints:

['c']
['c']
['a', 'b', 'c']
['c']
['a', 'b']
['a', 'b', 'c']
['c']
['a', 'c']
['c']
['c']
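If the row labels should stay attached to the result, the same list comprehension can be wrapped in a Series. This is a minimal sketch assuming the small frame from the question with a made-up string index ('x', 'y', 'z'); the names mask and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'cat1': [0, 3, 1], 'cat2': [2, 0, 1], 'cat3': [2, 1, 0]},
                  index=['x', 'y', 'z'])  # hypothetical row labels

values = df.to_numpy()
# keepdims=True keeps the maxima as a (3, 1) column for broadcasting.
mask = values == values.max(axis=1, keepdims=True)

# One list of column labels per row, keyed by the original row label.
result = pd.Series([df.columns[row].tolist() for row in mask], index=df.index)
print(result)
```

Individual rows can then be looked up by label, for example result['z'].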

edited May 28, 2018 at 21:43

Anton vBR


answered Oct 15, 2017 at 16:09

Zero


1 Comment

Anton vBR Over a year ago

Another question (dupe) led me here. A shorter version: [df.columns[i].tolist() for i in df.values == df.max(axis=1).values[:,None]]
