0
Date        Count_Doc   Sum_Words   S&P 500     Russel 2000  Nasdaq     
2017-02-16  0.069946    3.839240    -0.568454   -0.514334   -0.592410
2017-04-12  1.655428    3.667811    -0.891697   -1.450381   -1.047976
2017-04-19  2.371889    2.110689    -0.284174   0.401092    0.427705
2017-04-20  3.261538    2.995514    1.846039    1.360092    1.660339
2017-05-02  0.738549    2.197852    0.081593    -0.849580   -0.231491

I want to leave "Count_Doc" and "Sum_Words" columns unchanged, but I am trying to sort the rest columns to be sorted by its values like below: (The order is not sorted, just ramdomly suffled )

Date        Count_Doc   Sum_Words   1st         2nd         3rd
2017-02-16  0.069946    3.839240    S&P 500     Nasdaq      Russel 2000
2017-04-12  1.655428    3.667811    Nasdaq      S&P 500     Russel 2000
2017-04-19  2.371889    2.110689    Nasdaq      S&P 500     Russel 2000
2017-04-20  3.261538    2.995514    Russel 2000 Nasdaq      S&P 500 
2017-05-02  0.738549    2.197852    Russel 2000 S&P 500     Nasdaq  

Is there any way to return the name of column as DataFrame values like that?

Thank you!

2 Answers 2

1

Use this:

df = df.set_index(['Date','Count_Doc','Sum_Words'])
df_out = pd.DataFrame(df.columns[df.values.argsort(1)[::-1]].values, 
                       df.index, 
                       columns=['1st','2nd','3rd']).reset_index()
df_out

Output:

         Date  Count_Doc  Sum_Words          1st          2nd          3rd
0  2017-02-16   0.069946   3.839240  Russel 2000       Nasdaq      S&P 500
1  2017-04-12   1.655428   3.667811  Russel 2000       Nasdaq      S&P 500
2  2017-04-19   2.371889   2.110689      S&P 500  Russel 2000       Nasdaq
3  2017-04-20   3.261538   2.995514  Russel 2000       Nasdaq      S&P 500
4  2017-05-02   0.738549   2.197852       Nasdaq      S&P 500  Russel 2000
Sign up to request clarification or add additional context in comments.

Comments

1

you can add 3 additional columns to your dataframe by ranking from the 3 indexes for each row.

df[['1st', '2nd', '3rd']] = df.iloc[:, [3,4,5]].apply(lambda x: pd.Series(x.sort_values(ascending=False).index), axis=1)
outputs:

         Date  Count_Doc  Sum_Words     ...               1st          2nd          3rd
0  2017-02-16   0.069946   3.839240     ...       Russel 2000      S&P 500       Nasdaq
1  2017-04-12   1.655428   3.667811     ...           S&P 500       Nasdaq  Russel 2000
2  2017-04-19   2.371889   2.110689     ...            Nasdaq  Russel 2000      S&P 500
3  2017-04-20   3.261538   2.995514     ...           S&P 500       Nasdaq  Russel 2000
4  2017-05-02   0.738549   2.197852     ...           S&P 500       Nasdaq  Russel 2000

here, i'm picking only the 3 columns whose names we want to sort, then applying a function row-wise which takes a series, sorts it, gets its indexes (i.e. names) and returns the indexes as a new series.

this is then assigned to the new columns ['1st', '2nd', '3rd'].

note, the sort order I used is descending, whereas in your example output you've just shown a random order.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.