4

The problem is simple and so must be solution but I am not able to find it.

I want to find which row and column in Pandas DataFrame has minimum value and how much is it.

I have tried following code (in addition to various combinations):

df = pd.DataFrame(data=[[4,5,6],[2,1,3],[7,0,5],[2,5,3]], 
                 index = ['R1','R2','R3','R4'], 
                 columns=['C1','C2','C3'])

print(df)

print(df.loc[df.idxmin(axis=0), df.idxmin(axis=1)])

The dataframe (df) being searched is:

    C1  C2  C3
R1   4   5   6
R2   2   1   3
R3   7   0   5
R4   2   5   3

Output for the loc command:

    C1  C2  C2  C1
R2   2   1   1   2
R3   7   0   0   7
R2   2   1   1   2

What I need is:

    C2
R3   0

How can I get this simple result?

2
  • Performance is important? Commented Nov 14, 2018 at 6:35
  • Working with some missing values is most important. Then display and then performance. Commented Nov 14, 2018 at 6:46

3 Answers 3

6

Use:

a, b = df.stack().idxmin()
print(df.loc[[a], [b]])
    C2
R3   0

Another @John Zwinck solution working with missing values - use numpy.nanargmin:

df = pd.DataFrame(data=[[4,5,6],[2,np.nan,3],[7,0,5],[2,5,3]], 
    index = ['R1','R2','R3','R4'], 
    columns=['C1','C2','C3'])

print(df)
    C1   C2  C3
R1   4  5.0   6
R2   2  NaN   3
R3   7  0.0   5
R4   2  5.0   3

#https://stackoverflow.com/a/3230123
ri, ci = np.unravel_index(np.nanargmin(df.values), df.shape)
print(df.iloc[[ri], [ci]])
     C2
R3  0.0
Sign up to request clarification or add additional context in comments.

3 Comments

Great. Forgot to add in question: there are some np.nan values in real df. Will this code work there as well?
@rnso - sure, pandas function working with nans nice.
@rnso - changed solution for working with missing values.
1

I'd get the index this way:

np.unravel_index(np.argmin(df.values), df.shape)

This is much faster than df.stack().idxmin().

It gives you a tuple such as (2, 1) in your example. Pass that to df.iloc[] to get the value.

2 Comments

It works but it is not giving row and column names in output. Also will it work if there are some np.nan values in df?
@rnso: if you want to ignore NANs, simply use nanargmin instead of argmin. If you want the row and column names, you can use df.columns[x] and df.index[y] or df.iloc[[x], [y]] as in jezrael's answer.
1

Or min+min+dropna+T+dropna+T:

>>> df[df==df.min(axis=1).min()].dropna(how='all').T.dropna().T
     C2
R3  0.0
>>> 

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.