2

I have two Dataframes, G1 and G2:

G1:

studentId midterm homework final
101       45       69       89  
102       20       22       11
103       11       11       11 

G2:

studentId midterm homework final
101       55       69       89  
102       20       65       11
103       11       11       11

Trying to get the output as below:

student           Old  New
101     Midterm    45   55
102     Homework   22   65   
4
  • Am not able to add below code in main question, sorry for that: code i tried: code: `GBool = (G1!= G2).stack() Gdiff = pd.concat([G1.stack()[GBool],G2.stack() [GBool]], axis=1) Gdiff.columns=["Old", "New"] print(Gdiff) but am not able to get the student id in output Commented Dec 14, 2019 at 7:00
  • @ jezrael, i referred the link you suggested, but not solved my problem Commented Dec 14, 2019 at 7:02
  • What is error here? Commented Dec 14, 2019 at 7:04
  • Am able to fetch the differences, but i want to print the first column(studentId in this case) it is giving error: Tried with below code: new_Gdiff = pd.concat([G1db.iloc[0:,0],diff[],axis=1) getting error as : AttributeError: 'list' object has no attribute 'columns'. Need help on this. Commented Dec 14, 2019 at 7:07

2 Answers 2

2

Create index by columns studentId for both DataFrames, also is possible add parameter keys to concat for new columns names:

G1 = G1.set_index('studentId')
G2 = G2.set_index('studentId')
GBool = (G1!= G2).stack() 
Gdiff = pd.concat([G1.stack()[GBool],G2.stack() [GBool]], axis=1, keys=('Old','New'))
print(Gdiff) 
                    Old  New
studentId                   
101       midterm    45   55
102       homework   22   65

If is necessary convert MultiIndex to columns:

Gdiff = (pd.concat([G1.stack()[GBool],
                   G2.stack()[GBool]], 
                   axis=1, 
                   keys=('Old','New')).rename_axis(('studentId','type')).reset_index())
print(Gdiff) 
   studentId      type  Old  New
0        101   midterm   45   55
1        102  homework   22   65
Sign up to request clarification or add additional context in comments.

Comments

2

Here is an alternate way using merge and pd.wide_to_long:

m=df1.merge(df2,on='studentId',suffixes=('_old','_new'))
n=(pd.wide_to_long(m,df1.columns[1:],'studentId','type',sep='_',suffix='\w+')
                                                         .unstack().stack(0))
n.query("new!=old")

           type    new old
studentId           
101       midterm   55  45
102       homework  65  22

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.