Getting error while adding column of a dataframe in pandas

Question

I have two Dataframes, G1 and G2:

G1:

studentId midterm homework final
101       45       69       89  
102       20       22       11
103       11       11       11

G2:

studentId midterm homework final
101       55       69       89  
102       20       65       11
103       11       11       11

I am trying to get the output below, listing only the values that differ between the two:

student           Old  New
101     Midterm    45   55
102     Homework   22   65

I was not able to add the code below to the main question, sorry for that. The code I tried: `GBool = (G1 != G2).stack(); Gdiff = pd.concat([G1.stack()[GBool], G2.stack()[GBool]], axis=1); Gdiff.columns = ["Old", "New"]; print(Gdiff)`, but I am not able to get the student id in the output.
– Geetha Anand, Commented Dec 14, 2019 at 7:00
@jezrael, I looked at the link you suggested, but it did not solve my problem.
– Geetha Anand, Commented Dec 14, 2019 at 7:02
I am able to fetch the differences, but when I try to print the first column as well (studentId in this case) I get an error. I tried the code below: `new_Gdiff = pd.concat([G1db.iloc[0:,0],diff[],axis=1)` and got: AttributeError: 'list' object has no attribute 'columns'. I need help with this.
– Geetha Anand, Commented Dec 14, 2019 at 7:07

jezrael · Accepted Answer · 2019-12-14 07:13:23Z

2

Set studentId as the index of both DataFrames first. You can also pass the keys parameter to concat to name the new columns:

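# Use studentId as the row index so it is kept as a label after stack()
G1 = G1.set_index('studentId')
G2 = G2.set_index('studentId')
# Boolean mask of the cells that differ, in stacked (long) form
GBool = (G1 != G2).stack()
Gdiff = pd.concat([G1.stack()[GBool], G2.stack()[GBool]], axis=1, keys=('Old', 'New'))
print(Gdiff)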
                    Old  New
studentId                   
101       midterm    45   55
102       homework   22   65
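For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is my own reconstruction from the tables in the question, and the names `lost`, `kept`, `A`, `B` are just illustrative. It also shows why the attempt in the question loses the ids: without `set_index`, the row labels after `stack()` are the default positions 0, 1, 2 rather than the student ids.

```python
import pandas as pd

# Sample data reconstructed from the tables in the question
G1 = pd.DataFrame({
    "studentId": [101, 102, 103],
    "midterm": [45, 20, 11],
    "homework": [69, 22, 11],
    "final": [89, 11, 11],
})
G2 = pd.DataFrame({
    "studentId": [101, 102, 103],
    "midterm": [55, 20, 11],
    "homework": [69, 65, 11],
    "final": [89, 11, 11],
})

# Without set_index: studentId is stacked like any other column and the
# first index level is the default RangeIndex, so the ids do not appear.
mask_raw = (G1 != G2).stack()
lost = pd.concat([G1.stack()[mask_raw], G2.stack()[mask_raw]],
                 axis=1, keys=("Old", "New"))

# With set_index: the first index level is studentId.
A = G1.set_index("studentId")
B = G2.set_index("studentId")
mask = (A != B).stack()
kept = pd.concat([A.stack()[mask], B.stack()[mask]],
                 axis=1, keys=("Old", "New"))

print(lost.index.tolist())
print(kept)
```

Newer pandas versions may emit a FutureWarning about the `stack` implementation here; for data without missing values like this, the result should be the same.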

If necessary, convert the MultiIndex to columns:

Gdiff = (pd.concat([G1.stack()[GBool],
                   G2.stack()[GBool]], 
                   axis=1, 
                   keys=('Old','New')).rename_axis(('studentId','type')).reset_index())
print(Gdiff) 
   studentId      type  Old  New
0        101   midterm   45   55
1        102  homework   22   65

edited Dec 14, 2019 at 7:13

answered Dec 14, 2019 at 7:08

jezrael

anky · Accepted Answer · 2019-12-14 07:39:30Z

2

Here is an alternate way using merge and pd.wide_to_long:

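# Here df1 and df2 stand for the question's G1 and G2, with studentId as a regular column
m = df1.merge(df2, on='studentId', suffixes=('_old', '_new'))
n = (pd.wide_to_long(m, df1.columns[1:], 'studentId', 'type', sep='_', suffix=r'\w+')
       .unstack().stack(0))
n.query("new != old")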

           type    new old
studentId           
101       midterm   55  45
102       homework  65  22
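The same reshape-then-compare idea can also be sketched with `melt` instead of `pd.wide_to_long`. This is a swapped-in variant, not the code from this answer. The sample frames are reconstructed from the question, and the names `old_long`, `new_long`, `both`, `changed` are illustrative only.

```python
import pandas as pd

# Sample data reconstructed from the tables in the question
G1 = pd.DataFrame({
    "studentId": [101, 102, 103],
    "midterm": [45, 20, 11],
    "homework": [69, 22, 11],
    "final": [89, 11, 11],
})
G2 = pd.DataFrame({
    "studentId": [101, 102, 103],
    "midterm": [55, 20, 11],
    "homework": [69, 65, 11],
    "final": [89, 11, 11],
})

# Reshape each frame to long form: one row per (studentId, type)
old_long = G1.melt(id_vars="studentId", var_name="type", value_name="Old")
new_long = G2.melt(id_vars="studentId", var_name="type", value_name="New")

# Align old and new values on the (studentId, type) pair
both = old_long.merge(new_long, on=["studentId", "type"])

# Keep only the rows where the value changed
changed = both[both["Old"] != both["New"]].reset_index(drop=True)
print(changed)
```

This keeps studentId as an ordinary column throughout, so no `reset_index` on a MultiIndex is needed at the end.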

answered Dec 14, 2019 at 7:39

anky
