0

This is my first time asking a question. I have a dataframe that looks like below:

import pandas as pd
data = [['AK', 'Co',2957],
    ['AK', 'Ot', 15],
    ['AK','Petr', 86848],
    ['AL', 'Co',167],
    ['AL', 'Ot', 10592],
    ['AL',  'Petr',1667]]
my_df = pd.DataFrame(data, columns = ['State', 'Energy', 'Elec']) 
print(my_df)

I need to find the maximum and minimum values of the third column based on the first two columns. I did browse through a few stackoverflow questions but couldn't find the right way to solve this. My output should look like below:

data = [['AK','Ot', 15],
['AK','Petr',86848],
['AL','Co',167],
['AL','Ot', 10592]]

my_df = pd.DataFrame(data, columns = ['State', 'Energy', 'Elec']) 
print(my_df)

Note: Please let me know where I am lagging before leaving a negative marking on the question

This link helped me: Python pandas dataframe: find max for each unique values of an another column

9
  • Why is this missing: ['AL', 'Petr',1667]? Commented Dec 3, 2020 at 17:12
  • @DaniMesejo, thank you for teh reply. I am only trying to find the min value and max value of third column based on first and second column. For 'state' column 'AL', the min value is 167 and max value is 10592, which need to be displayed. Commented Dec 3, 2020 at 17:13
  • 3
    df.groupby(['State','Energy'])['Elec'].agg(['min','max'])? Commented Dec 3, 2020 at 17:14
  • Does this answer your question? Pandas dataframe max and min value Commented Dec 3, 2020 at 17:14
  • 1
    This link helped me: stackoverflow.com/questions/40568438/… Commented Dec 3, 2020 at 17:41

1 Answer 1

1

try idxmin and idxmax with .loc filter.

    new_df = my_df.loc[
        my_df.groupby(["State"])
        .agg(ElecMin=("Elec", "idxmin"), ElecMax=("Elec", "idxmax"))
        .stack()
    ]
)

print(new_df)

  State Energy   Elec
0    AK     Ot     15
1    AK   Petr  86848
2    AL     Co    167
3    AL     Ot  10592
Sign up to request clarification or add additional context in comments.

2 Comments

you can alternatively do my_df.loc[my_df.groupby("State")['Elec'].agg(['idxmin', 'idxmax']).stack()] ;)
@ShubhamSharma thanks :) yeah that makes more sense and looks cleaner but i want to spread more awareness of the .agg with NamedTuples they are pretty awesome so i try to use them more in examples

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.