Combine 2 rows into one dataframe [duplicate]

Question

I am trying to combine two rows of a DataFrame into one:

              ID           Value1         Value2    
0             ID_1           NaN            2           
1             ID_2           NaN            7    
2             ID_1           5             NaN   
3             ID_2           8             NaN

The result should be the following

              ID           Value1         Value2    
1             ID_1           5              2     
2             ID_2           8              7

Is this possible with a DataFrame method?

Nk03 · Accepted Answer · 2021-05-10 13:45:35Z

1

Via stack/unstack:

df = df.set_index('ID').stack().unstack().reset_index()

Output:

     ID  Value1  Value2
0  ID_1     5.0     2.0
1  ID_2     8.0     7.0
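For anyone who wants to reproduce this, here is a minimal self-contained sketch. The sample frame is reconstructed from the question, and the explicit dropna() is my addition (an assumption, not part of the answer above): depending on the pandas version, stack() may or may not discard the NaN cells on its own, so dropping them explicitly keeps the reshape unambiguous.

```python
import numpy as np
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame({
    "ID": ["ID_1", "ID_2", "ID_1", "ID_2"],
    "Value1": [np.nan, np.nan, 5, 8],
    "Value2": [2, 7, np.nan, np.nan],
})

# stack()   -> long format, one row per (ID, column) cell
# dropna()  -> discard the empty cells explicitly (added here for safety)
# unstack() -> pivot the column labels back out, one row per ID
combined = df.set_index("ID").stack().dropna().unstack().reset_index()
print(combined)
```

This only works when each (ID, column) pair has at most one non-null value.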

answered May 10, 2021 at 13:45


1 Comment

tferrari Over a year ago

It's not working. I get this error: ValueError: Index contains duplicate entries, cannot reshape
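That error typically shows up when the same ID has more than one non-null value in the same column, so unstack() cannot decide which one to keep. Here is a rough sketch with hypothetical data (not from the question) that reproduces the failure and then falls back to groupby().first(), which keeps the first non-null value per column in each group. Whether "first" is the right choice depends on your data.

```python
import numpy as np
import pandas as pd

# Hypothetical data: ID_1 has two non-null entries in Value1.
df = pd.DataFrame({
    "ID": ["ID_1", "ID_1", "ID_1", "ID_2", "ID_2"],
    "Value1": [5, 6, np.nan, 8, np.nan],
    "Value2": [np.nan, np.nan, 2, np.nan, 7],
})

try:
    df.set_index("ID").stack().dropna().unstack()
    reshape_failed = False
except ValueError as exc:
    # Duplicate (ID, column) pairs cannot be pivoted back to columns.
    reshape_failed = True
    print(exc)

# first() skips NaN and returns the first remaining value per column.
combined = df.groupby("ID", as_index=False).first()
print(combined)
```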

Anurag Dabas · Accepted Answer · 2021-05-10 14:15:17Z

1

Use the set_index() and apply() methods together with sorted(), so that the non-null values in each column move to the top:

newdf = df.set_index('ID').apply(lambda x: sorted(x, key=pd.isnull))

Finally, use boolean masking and the isna() method to drop the rows that are entirely NaN:

newdf = newdf[~newdf.isna().all(axis=1)]

Now if you print newdf, you will get the desired output:

       Value1   Value2
ID      
ID_1    5.0     2.0
ID_2    8.0     7.0

If needed, use the reset_index() method:

newdf = newdf.reset_index()

Output of above code:

    ID      Value1  Value2
0   ID_1    5.0     2.0
1   ID_2    8.0     7.0
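One caveat worth checking against your own data: sorting each column independently lines values up by position, not by ID. The sketch below uses hypothetical data in which the second pair of rows appears in a different ID order, and the per-column sort is spelled out as a dict comprehension in place of apply() purely so the behaviour is easy to follow.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the Value1 rows come in the order ID_2, ID_1.
df = pd.DataFrame({
    "ID": ["ID_1", "ID_2", "ID_2", "ID_1"],
    "Value1": [np.nan, np.nan, 8, 5],
    "Value2": [2, 7, np.nan, np.nan],
})
indexed = df.set_index("ID")

# Sort each column on its own so non-null values come first,
# keeping the original index labels in their original order.
newdf = pd.DataFrame(
    {col: sorted(values, key=pd.isna) for col, values in indexed.items()},
    index=indexed.index,
)
newdf = newdf[~newdf.isna().all(axis=1)]
print(newdf)
```

Here ID_1 ends up paired with the Value1 that belonged to ID_2, so the approach is only safe when the IDs repeat in the same order in every block of rows.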

edited May 10, 2021 at 14:15

answered May 10, 2021 at 14:02


2 Comments

Nk03 Over a year ago

If Value1 and Value2 contain unequal numbers of NaNs, won't it drop everything after sorting?

Anurag Dabas Over a year ago

you are right....updated answer....btw thanks for noticing this @Nk03 :)

Celius Stingher · Accepted Answer · 2021-05-10 17:27:29Z

0

For this particular case you can also use groupby():

df = df.groupby('ID')[['Value1', 'Value2']].sum().reset_index()
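A note on why this is "for this particular case": sum() treats NaN as zero, so it matches the expected output only when each ID has at most one real value per column, and an ID with no value at all in a column comes back as 0 instead of NaN. A small sketch, with a hypothetical ID_3 row added to the question's data, shows the difference that min_count=1 makes:

```python
import numpy as np
import pandas as pd

# Question's data plus a hypothetical ID_3 that has no Value1 at all.
df = pd.DataFrame({
    "ID": ["ID_1", "ID_2", "ID_1", "ID_2", "ID_3"],
    "Value1": [np.nan, np.nan, 5, 8, np.nan],
    "Value2": [2, 7, np.nan, np.nan, 4],
})
grouped = df.groupby("ID")[["Value1", "Value2"]]

# Default: an all-NaN group sums to 0.0.
default_sum = grouped.sum().reset_index()

# min_count=1: require at least one non-null value, otherwise return NaN.
guarded_sum = grouped.sum(min_count=1).reset_index()

print(default_sum)
print(guarded_sum)
```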

edited May 10, 2021 at 17:27

answered May 10, 2021 at 14:18

