
I'm currently writing a program, and have two dataframes, indexed by strings, with the following format:

         col1   col2   col3   col4
row1     65      24     47     35
row2     33      48     25     89
row3     65      34     67     34
row4     24      12     52     17

and

         col5   col6
row1     81     58
row2     25     36
row3     67     70
row4     52     82

and would like to merge/join/concatenate the frames into something which looks like this:

         col1   col2   col3   col4   col5   col6
row1     65      24     47     35    81     58
row2     33      48     25     89    25     36
row3     65      34     67     34    67     70
row4     24      12     52     17    52     82

I read through the Pandas documentation on merging, concatenating, and joining, but every method I tried left me with duplicate row indices. The operations usually produced something like this:

         col1   col2   col3   col4  col5   col6
row1     65      24     47     35
row2     33      48     25     89
row3     65      34     67     34
row4     24      12     52     17
row1                                 81     58
row2                                 25     36
row3                                 67     70
row4                                 52     82

However, this is not the format I want my data in. What would be the most efficient way to perform a merge, so that values with identical indices are merged together? Note that the dataframes may be of different dimensions as well in some cases.

  • Show us what you tried. This is a simple horizontal merge: pd.concat([df1, df2], axis=1). Commented Sep 18, 2017 at 1:22
  • I've tried using pd.concat; I get the following error: ValueError: Shape of passed values is (8, 35), indices imply (8, 34) Commented Sep 18, 2017 at 5:32

1 Answer

pd.concat along axis=1 (columns)

pd.concat([df1, df2], axis=1)

      col1  col2  col3  col4  col5  col6
row1    65    24    47    35    81    58
row2    33    48    25    89    25    36
row3    65    34    67    34    67    70
row4    24    12    52    17    52    82
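For reference, here is a minimal self-contained sketch of the call above. The frames are rebuilt from the sample data in the question. The `DataFrame.join` line is an extra shown only for comparison, not part of the original answer.

```python
import pandas as pd

rows = ["row1", "row2", "row3", "row4"]

# Sample frames copied from the question
df1 = pd.DataFrame(
    {
        "col1": [65, 33, 65, 24],
        "col2": [24, 48, 34, 12],
        "col3": [47, 25, 67, 52],
        "col4": [35, 89, 34, 17],
    },
    index=rows,
)
df2 = pd.DataFrame(
    {"col5": [81, 25, 67, 52], "col6": [58, 36, 70, 82]},
    index=rows,
)

# Place the frames side by side, matching rows by index label
merged = pd.concat([df1, df2], axis=1)

# Comparison only: join also matches on the index
joined = df1.join(df2, how="outer")

print(merged)
```

Both calls match rows by label rather than by position, so the row order of the two inputs does not need to agree.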

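If the problem is with your labels, you can add the parameter ignore_index=True. Note that with axis=1 this discards the column names (they become 0, 1, 2, ...) and leaves the row index as it is:

df = pd.concat([df1, df2], axis=1, ignore_index=True)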
df
       0   1   2   3   4   5
row1  65  24  47  35  81  58
row2  33  48  25  89  25  36
row3  65  34  67  34  67  70
row4  24  12  52  17  52  82

DataFrame.align

Another option, if all the columns are numeric and the two frames share no column names:

df3, df4 = df1.align(df2)
df3.fillna(0) + df4.fillna(0)

      col1  col2  col3  col4  col5  col6
row1  65.0  24.0  47.0  35.0  81.0  58.0
row2  33.0  48.0  25.0  89.0  25.0  36.0
row3  65.0  34.0  67.0  34.0  67.0  70.0
row4  24.0  12.0  52.0  17.0  52.0  82.0
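Since the question mentions frames of different dimensions, here is a small sketch of how the align approach behaves in that case. The two frames are hypothetical, cut down so that their row labels only partly overlap.

```python
import pandas as pd

# Hypothetical frames with partly overlapping row labels
df1 = pd.DataFrame({"col1": [65, 33, 65]}, index=["row1", "row2", "row3"])
df2 = pd.DataFrame({"col5": [81, 52]}, index=["row1", "row4"])

# align defaults to an outer join on both axes, so both results
# get the union of the row labels and the union of the column labels
a, b = df1.align(df2)

# Cells that exist in neither input are NaN after align;
# fill them with 0 so the addition keeps the known values
combined = a.fillna(0) + b.fillna(0)
print(combined)
```

One caveat: cells that were missing from both inputs end up as 0.0, which cannot be told apart from a real zero. `pd.concat([df1, df2], axis=1)` leaves those cells as NaN instead.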

14 Comments

concat should work, but I think I'm encountering an error because I'm trying to merge two dataframes of unequal shape/size. Using pd.concat gives the following error: ValueError: Shape of passed values is (8, 35), indices imply (8, 34). Maybe the problem is with my index? That would be weird; the dataframes are indexed by strings.
@ice_cream Try this: df1, df2 = df1.align(df2); df = df1 + df2;
@ice_cream Wait, I need to give you a modified version of that line.
@ice_cream Are all the columns numeric? If yes, you can do this: df3, df4 = df1.align(df2); df = df3.fillna(0) + df4.fillna(0)
the columns are indexed by strings as well
