2

I have a dataframe:

import pandas as pd

df = pd.DataFrame({'A': [0, 0, 1], 'B': [1, 0, 0]}, index=['x', 'y', 'z'])

   A  B
x  0  1
y  0  0
z  1  0

For each row, I want the names of all the columns that hold the lowest value in that row, something like:

x  A
y  A
y  B
z  B

# or 

x  [A]
y  [A, B]
z  [B]

I know idxmin() gives the first instance of the lowest value:

df.idxmin(axis=1)

x    A
y    A
z    B

But what is an efficient way to get all of them?

This question gives all of the rows with the minimum value in a specific column, but that's not quite what I want.

Edit: Here's a better toy df to play with for getting the column names with the minimum value in each row:

df2 = pd.DataFrame({'A': [1, 0, 6], 'B': [3, 0, 2]}, index=['x', 'y', 'z'])

   A  B
x  1  3
y  0  0
z  6  2
3
  • Upvote because you found df.idxmin Commented Feb 8, 2022 at 21:46
  • Should the minimum be per group or overall? Commented Feb 8, 2022 at 21:50
  • @mozway the minimum per row Commented Feb 8, 2022 at 22:14

3 Answers

2

You can use groupby+transform('min'):

s = df.stack()
s[s.eq(s.groupby(level=0).transform('min'))]

Output:

x  A    0
y  A    0
   B    0
z  B    0

Alternative format:

s = df.stack()
(s[s.eq(s.groupby(level=0).transform('min'))]
  .reset_index()
  .groupby('level_0')['level_1'].apply(list)
 )

Output:

level_0
x       [A]
y    [A, B]
z       [B]
Name: level_1, dtype: object
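To see why the grouping level matters, here is a minimal sketch that runs the same stack + groupby idea on the question's second toy frame, df2. The names per_row and per_col are illustrative and not part of the answer above.

```python
import pandas as pd

df2 = pd.DataFrame({'A': [1, 0, 6], 'B': [3, 0, 2]}, index=['x', 'y', 'z'])

# Stack into a Series with a (row label, column label) MultiIndex
s = df2.stack()

# level=0 groups by the original row labels: minimum per row
per_row = s[s.eq(s.groupby(level=0).transform('min'))]

# level=-1 groups by the original column labels: minimum per column
per_col = s[s.eq(s.groupby(level=-1).transform('min'))]

print(per_row)
print(per_col)
```

On df2 the two results differ: per_row keeps (x, A), (y, A), (y, B) and (z, B), while per_col keeps only the two entries in row y, because 0 is the minimum of both columns.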
3 Comments

Thanks for this. what does level=-1 do in groupby? (It also seems like levels 1 and 0 both give the same result, for this toy example at least.)
This means the last (rightmost) level of the MultiIndex ;)
@hmg I made a mistake, it should be level 0 to get the min per row. Level -1 would get the min per column (coincidentally identical here)
2

Convert the DataFrame into a boolean mask that marks every minimum value, then collect the column names that are True in each row into a list:

s = df == df.min()

df['column_min'] = s.agg(lambda s: s.index[s].values, axis=1)

   A  B column_min
x  0  1        [A]
y  0  0     [A, B]
z  1  0        [B]
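Note that df.min() with no arguments takes the minimum of each column, which happens to give the same mask on the first toy frame. A hedged sketch of the per-row variant, checked against the question's df2, might look like this. The helper true_columns is a hypothetical name introduced only for illustration.

```python
import pandas as pd

df2 = pd.DataFrame({'A': [1, 0, 6], 'B': [3, 0, 2]}, index=['x', 'y', 'z'])


def true_columns(mask):
    # For each row of a boolean DataFrame, list the column labels that are True
    return pd.Series(
        [mask.columns[row].tolist() for row in mask.to_numpy()],
        index=mask.index,
    )


# Column-wise minimum: each column is compared with its own minimum
by_column = true_columns(df2 == df2.min())

# Row-wise minimum: axis=0 in eq() aligns the row minima on the index
by_row = true_columns(df2.eq(df2.min(axis=1), axis=0))

print(by_column)
print(by_row)
```

With df2, the column-wise mask leaves rows x and z with empty lists, whereas the row-wise mask gives [A], [A, B] and [B].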

1 Comment

OP clarified the requirements, you need to use a groupby to get the min per row ;)
1

This is a one-liner, similar to @mozway's second solution, but it uses a boolean mask like @wwnde's:

min_cols = df.eq(df.min(axis=1), axis=0).stack().groupby(level=0).apply(lambda x: x.index.get_level_values(1)[x].tolist())

Output:

x       [A]
y    [A, B]
z       [B]
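For readers who want to inspect the intermediate results, the one-liner can be unpacked roughly as follows. This is a sketch on the question's df2; the intermediate variable names are illustrative, and the boolean group is converted with to_numpy() before indexing, which is a small deviation from the one-liner.

```python
import pandas as pd

df2 = pd.DataFrame({'A': [1, 0, 6], 'B': [3, 0, 2]}, index=['x', 'y', 'z'])

# 1. Minimum of each row
row_min = df2.min(axis=1)

# 2. Boolean mask: True where a cell equals its row's minimum
mask = df2.eq(row_min, axis=0)

# 3. Stack into a boolean Series indexed by (row label, column label)
stacked = mask.stack()

# 4. Per row, keep the column labels whose mask value is True
min_cols = stacked.groupby(level=0).apply(
    lambda g: g.index.get_level_values(1)[g.to_numpy()].tolist()
)

print(min_cols)
```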
