
I'm trying to isolate and print the maximum value in a pandas DataFrame in Python.

# Data frame:

df
>>     0   A   B   C
    0  0   0   0   0
    A  0  -3  -3   5
    B  0  -3  -6   2
    C  0   5   0  -3
    D  0   5   2  -3
    E  0   0  10   5
    F  0  -3   5  15

I have managed to isolate the value with the following code:

x = df.max(axis=0)
maxValue = max(x)

maxValue
>> 15

But how can I access this element? Is there a way to iterate through the elements of the data frame, such as

for element in df:
    if element == maxValue:
        m = element

Or something along those lines? I need to find the largest element, in this case 15, and retrieve its position, i.e. (C,F) in this example. I then need to store this and find the next largest element surrounding the first, along with its position.

# desired output
[(C,F), (B,E), (A,D)]

I hope this makes sense! Any advice on how I could implement this would be much appreciated! :)

3 Answers

You can use:

# replace 'df.iloc[:, 1:]' with 'df' if the first column isn't 0
out = [*df.iloc[:, 1:][::-1].idxmax().items()]
# [('A', 'D'), ('B', 'E'), ('C', 'F')]
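
For readers who want to see what the one-liner does, here is a minimal step-by-step sketch. It rebuilds the sample frame from the question (assuming the row and column labels are strings and the values are integers), and the names `numeric`, `positions`, and `out_sorted` are illustrative only. Reversing the rows matters because `idxmax` returns the first matching label on ties, so column A resolves to D rather than C. The last step, which orders the pairs by column maximum to match the desired output, is an addition and not part of the answer above.

```python
import pandas as pd

# Sample frame rebuilt from the question (labels assumed to be strings).
df = pd.DataFrame(
    {
        "0": [0, 0, 0, 0, 0, 0, 0],
        "A": [0, -3, -3, 5, 5, 0, -3],
        "B": [0, -3, -6, 0, 2, 10, 5],
        "C": [0, 5, 2, -3, -3, 5, 15],
    },
    index=["0", "A", "B", "C", "D", "E", "F"],
)

numeric = df.iloc[:, 1:]            # drop the all-zero first column
positions = numeric[::-1].idxmax()  # reversed rows, so ties resolve to the later row
out = list(positions.items())       # (column, row) pairs in column order
print(out)

# Optional extra step: order the pairs by column maximum, largest first.
order = numeric.max().sort_values(ascending=False).index
out_sorted = [(col, positions[col]) for col in order]
print(out_sorted)
```

On the sample data, `out_sorted` gives the pairs in the order requested in the question: C first, then B, then A.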

5 Comments

I tried this and got a TypeError: reduction operation 'argmax' not allowed for this dtype.
@CharlieVagg can you try [*df.iloc[:,1:][::-1].astype(float).idxmax().items()] ?
I got this as an output: [<itertools.izip object at 0x7f5ef0b76c80>]
@CharlieVagg It works fine for me, so it may be a version issue. Try list(df.iloc[:,1:][::-1].astype(float).idxmax().items()). It does work on the sample data as long as all columns are numeric.
Ah, I was using Python 2 instead of 3. It works now. Thank you so much!

IIUC, you can use stack + sort_values:

df.stack().sort_values().groupby(level=1).tail(1).index.tolist()
Out[229]: [('A', '0'), ('D', 'A'), ('E', 'B'), ('F', 'C')]
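
The output above lists (row, column) pairs in ascending order and includes the all-zero first column. As a hedged variation on the same stack-and-sort idea, the sketch below drops that column, uses a stable sort so that ties are resolved predictably, and flips the pairs to (column, row) with the largest value first. The frame is rebuilt from the question with string labels assumed, and the names `stacked`, `top`, and `result` are illustrative.

```python
import pandas as pd

# Sample frame rebuilt from the question (labels assumed to be strings).
df = pd.DataFrame(
    {
        "0": [0, 0, 0, 0, 0, 0, 0],
        "A": [0, -3, -3, 5, 5, 0, -3],
        "B": [0, -3, -6, 0, 2, 10, 5],
        "C": [0, 5, 2, -3, -3, 5, 15],
    },
    index=["0", "A", "B", "C", "D", "E", "F"],
)

# Long format: a Series indexed by (row label, column label).
stacked = df.iloc[:, 1:].stack()

# Stable ascending sort, so equal values keep their original relative order.
ordered = stacked.sort_values(kind="mergesort")

# Last (largest) entry per column; level=1 is the column label.
top = ordered.groupby(level=1).tail(1)

# Reverse to descending order and swap each pair to (column, row).
result = [(col, row) for row, col in top.index[::-1]]
print(result)
```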


As I understand the question, the maximal values also need to be sorted. If you need to omit the first column, select the rest with DataFrame.iloc. Then use DataFrame.agg to get the position of each column's maximum (idxmax) together with the maximum itself (max), sort the columns by the maximums with DataFrame.sort_values, select the idxmax row as a Series, and finally convert it to a list of tuples:

L = (list(df.iloc[:, 1:]
            .agg(['idxmax','max'])
            .sort_values('max', axis=1, ascending=False)
            .loc['idxmax'].items()))
print (L)
[('C', 'F'), ('B', 'E'), ('A', 'C')]

To include all columns, remove the iloc call:

L = (list(df.agg(['idxmax','max'])
            .sort_values('max', axis=1, ascending=False)
            .loc['idxmax'].items()))
print (L)
[('C', 'F'), ('B', 'E'), ('A', 'C'), ('0', '0')]

2 Comments

I get a KeyError: 'idxmax'. Do you know why this might be?
@CharlieVagg - Yes, the data are not numeric. Try: L = list(df.select_dtypes(np.number).agg(['idxmax','max']).sort_values('max', axis=1, ascending=False).loc['idxmax'].items()) and then print(L)
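
If the values were read in as strings, which is one possible reading of "not numeric" here and only an assumption, converting them before aggregating is another option. A minimal sketch, using a hypothetical `raw` frame that holds the sample values as strings, and computing the maximums and their positions separately instead of through agg:

```python
import pandas as pd

# Hypothetical input: the sample values stored as strings (an assumption).
raw = pd.DataFrame(
    {
        "A": ["0", "-3", "-3", "5", "5", "0", "-3"],
        "B": ["0", "-3", "-6", "0", "2", "10", "5"],
        "C": ["0", "5", "2", "-3", "-3", "5", "15"],
    },
    index=["0", "A", "B", "C", "D", "E", "F"],
)

# Convert every column to a numeric dtype; raises if a value cannot be parsed.
numeric = raw.apply(pd.to_numeric)

maxima = numeric.max().sort_values(ascending=False)  # column maxima, largest first
positions = numeric.idxmax()                         # first row label of each maximum

L = [(col, positions[col]) for col in maxima.index]
print(L)
```

Because idxmax returns the first matching row, column A resolves to C here, as in the answer's output.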
