After concatenating four MultiIndexed tables of yearly kg/ha data, I end up with a DataFrame of 22617 rows and 144 columns. I want the maximum for each index/year combination, which should give a DataFrame with 36 columns. Here is a sample of the data, showing two columns from each of two of the initial DataFrames:

                               Y1980      Y1981      Y1980      Y1981
FID_CATCHM CCA_2  GRIDCODE                     
0          1059.0 2         21.70426  22.058224   21.70426  22.058224 
                  3         21.70426  22.058224    0.00000   0.000000
                  4          0.00000   0.000000   21.70426  22.058224
1          1059.0 2          0.00000   0.000000   21.70426  22.058224
                  4         21.70426  22.058224   21.70426  22.058224
2          1001.0 2         20.71299  21.058432   20.71299  21.058432
                  3          0.00000   0.000000   20.71299  21.058432
           1054.0 2         20.25414  20.283833   20.25414  20.283833
                  4          0.00000   0.000000   20.25414  20.283833
           1059.0 2         21.70426  22.058224   21.70426  22.058224
                  3         21.70426  22.058224   21.70426  22.058224
                  4         21.70426  22.058224   21.70426  22.058224
3          1059.0 1         21.70426  22.058224    0.00000   0.000000
                  2         21.70426  22.058224   21.70426  22.058224
                  3         21.70426  22.058224   21.70426  22.058224
                  4         21.70426  22.058224   21.70426  22.058224
4          1058.0 1          0.00000   0.000000   23.79386  24.201496
                  2         23.79386  24.201496   23.79386  24.201496
                  3          0.00000   0.000000    0.00000   0.000000
                  4         23.79386  24.201496   23.79386  24.201496
                     

I tried to use a mask:

df_max = (df
           .groupby(['FID_CATCHM',
               'CCA_2', 'GRIDCODE'])
           .max())
df_mask = df_max.max(axis=1).to_frame('maximum')

but the output is identical to the concatenated DataFrame. How can this be done? Any help is appreciated.

1 Answer

I think you need the max per column label first and then, if necessary, per MultiIndex level:

df = df.max(level=0, axis=1).max(level=[0,1,2], axis=0)
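
Note that the `level` argument of `DataFrame.max` was deprecated in pandas 1.3 and removed in 2.0, so the one-liner above only runs on older versions. Below is a minimal sketch of an equivalent using `groupby`, assuming a recent pandas. The small frame is rebuilt by hand from a few rows of the question's sample, and the names `per_year` and `result` are only illustrative:

```python
import pandas as pd

# A few rows copied from the question's sample; the duplicate column
# labels mimic what concatenating the yearly tables side by side produces.
idx = pd.MultiIndex.from_tuples(
    [(0, 1059.0, 2), (0, 1059.0, 3), (0, 1059.0, 4), (1, 1059.0, 2)],
    names=["FID_CATCHM", "CCA_2", "GRIDCODE"],
)
df = pd.DataFrame(
    [
        [21.70426, 22.058224, 21.70426, 22.058224],
        [21.70426, 22.058224, 0.0, 0.0],
        [0.0, 0.0, 21.70426, 22.058224],
        [0.0, 0.0, 21.70426, 22.058224],
    ],
    index=idx,
    columns=["Y1980", "Y1981", "Y1980", "Y1981"],
)

# Step 1: collapse columns that share a year label.
# Transposing turns the column labels into an index that groupby can use.
per_year = df.T.groupby(level=0).max().T

# Step 2: collapse rows that share the same index triple.
# This changes nothing when the row index is already unique.
result = per_year.groupby(level=[0, 1, 2]).max()

print(result)
```

Applied to the full 144-column frame, step 1 should leave one column per distinct year label, which would be the 36 columns the question asks for if the four tables cover the same 36 years.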