I have the following dataframe:

df
A    B    C    D
1    2    NA   3
2    3    NA   1
3    NA   1    2

A, B, C, and D are answers to a question. Respondents ranked their top three answers from 1 to 3, so no row can contain the same value twice and the unranked answer is NA. I am trying to build new columns that summarize each respondent's top 3, something like this:

1st    2nd    3rd
A      B      D
D      A      B
C      D      A

This format will make it easier for me to draw conclusions such as "these are the answers most often ranked 3rd".

I didn't find any way to do this. Could you help me, please? Thank you very much!

2 Answers

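One way is to use argsort and index the column labels with the result. NaN sorts last, so dropping the last position of each row leaves the columns ranked 1, 2, 3 (this assumes exactly one NaN per row):

# Convert the columns to a NumPy array first: recent pandas versions
# no longer allow indexing an Index with a 2-D array.
pd.DataFrame(df.columns.to_numpy()[df.values.argsort()[:, :-1]],
             columns=['1st', '2nd', '3rd'])

  1st 2nd 3rd
0   A   B   D
1   D   A   B
2   C   D   A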
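
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the dataframe from the question and assumes every row has exactly one missing value. The names `order` and `top3` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question; NaN marks the answer that was not ranked.
df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [2, 3, np.nan],
    "C": [np.nan, np.nan, 1],
    "D": [3, 1, 2],
})

# Per row, column positions in ascending rank order; NaN is placed last.
order = df.to_numpy().argsort(axis=1)

# Drop the last position (the NaN column) and map positions to labels.
top3 = pd.DataFrame(
    df.columns.to_numpy()[order[:, :-1]],
    columns=["1st", "2nd", "3rd"],
    index=df.index,
)
print(top3)
```

If a row could have more than one NaN, the trailing positions would contain unranked columns, so the slice would need adjusting.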

Another way is to use stack()/pivot():

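# dropna() keeps this working on pandas versions where stack() retains NaN;
# pivot() takes keyword arguments in recent pandas versions.
(df.stack().dropna().astype(int)
   .reset_index(name='val')
   .pivot(index='level_0', columns='val', values='level_1')
)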

Output:

val      1  2  3
level_0         
0        A  B  D
1        D  A  B
2        C  D  A

2 Comments

What if I have other columns in my dataframe that I want to leave unchanged?
You can either set those columns as index, or just run the script on the answer columns (e.g. df[['A','B','C','D']].stack()...)
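
A rough sketch of the second suggestion, running the reshape on the answer columns only and joining the result back. The `respondent` column and the names `answer_cols`, `ranked`, and `result` are hypothetical, made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "respondent": ["r1", "r2", "r3"],  # hypothetical extra column to keep
    "A": [1, 2, 3],
    "B": [2, 3, np.nan],
    "C": [np.nan, np.nan, 1],
    "D": [3, 1, 2],
})
answer_cols = ["A", "B", "C", "D"]

ranked = (
    df[answer_cols]
    .stack()
    .dropna()                      # safe whether or not stack() already dropped NaN
    .astype(int)
    .rename_axis(["row", "answer"])
    .reset_index(name="rank")
    .pivot(index="row", columns="rank", values="answer")
    .rename(columns={1: "1st", 2: "2nd", 3: "3rd"})
)

# Join on the row index so the original columns stay untouched.
result = df.join(ranked)
print(result)
```

Joining on the index assumes the original index is unique. If it is not, calling `reset_index()` on the dataframe first would be one way around that.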
