Merging dataframes based on index

Question

How can I merge two dataframes, df1 and df2, to get a df3 that contains only the rows of df1 and df2 that share the same index (and the same values in the columns)?

import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A2', 'A3', 'A7'],
                    'B': ['B0', 'B2', 'B3', 'B7'],
                    'C': ['C0', 'C2', 'C3', 'C7'],
                    'D': ['D0', 'D2', 'D3', 'D7']},
                     index=[0, 2, 3, 7])

test 1

df2 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A7'],
                    'B': ['B0', 'B1', 'B2', 'B7'],
                    'C': ['C0', 'C1', 'C2', 'C7'],
                    'D': ['D0', 'D1', 'D2', 'D7']},
                     index=[0, 1, 2, 7])

test 2

df2 = pd.DataFrame({'A': ['A1'],
                    'B': ['B1'],
                    'C': ['C1'],
                    'D': ['D1']},
                     index=[1])

Expected output test 1

Out[13]: 
    A   B   C   D
0  A0  B0  C0  D0
2  A2  B2  C2  D2
7  A7  B7  C7  D7

Expected output test 2

Empty DataFrame
Columns: [A, B, C, D]
Index: []

EdChum · Accepted Answer · 2017-08-24 10:31:29Z

2

Just merge:

In[111]:
df1.merge(df2)

Out[111]: 
    A   B   C   D
0  A0  B0  C0  D0

By default, merge joins on every column the two frames have in common and performs an inner join, so only rows whose values agree in all of those columns are returned.
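As a minimal sketch of those defaults, the snippet below uses the test 1 frames from the question (so three rows match rather than one). The names `default` and `explicit` are introduced here only for comparison; the explicit call spells out what the bare `merge` is assumed to do when no `on` or `how` is given.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A2', 'A3', 'A7'],
                    'B': ['B0', 'B2', 'B3', 'B7'],
                    'C': ['C0', 'C2', 'C3', 'C7'],
                    'D': ['D0', 'D2', 'D3', 'D7']},
                   index=[0, 2, 3, 7])
df2 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A7'],
                    'B': ['B0', 'B1', 'B2', 'B7'],
                    'C': ['C0', 'C1', 'C2', 'C7'],
                    'D': ['D0', 'D1', 'D2', 'D7']},
                   index=[0, 1, 2, 7])

# No arguments: join on all shared column names, inner join.
default = df1.merge(df2)

# The same call with the defaults written out.
explicit = df1.merge(df2, how='inner', on=['A', 'B', 'C', 'D'])

print(default)
print(default.equals(explicit))
```

Note that the result carries a fresh 0..n-1 index; the original row labels of df1 and df2 play no part in a column-based merge.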

Looking at the index matching requirement, I'd filter the df prior to the merge:

In[131]:
filtered = df1.loc[df2.index].dropna()
filtered

Out[131]: 
    A   B   C   D
1  A1  B1  C1  D1

and then merge

In[132]:
filtered.merge(df2)
Out[132]: 
    A   B   C   D
0  A0  B0  C0  D0

If the indices do not match at all, say the first row of df2 is 1 instead of 2:

In[133]:
filtered = df1.loc[df2.index].dropna()
filtered
Out[133]: 
    A   B   C   D
1  A1  B1  C1  D1

Then merge returns an empty DataFrame, because the values in the row selected by index don't agree:

In[134]:
filtered.merge(df2)

Out[134]: 
Empty DataFrame
Columns: [A, B, C, D]
Index: []
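In recent pandas versions, `df1.loc[df2.index]` raises a KeyError when df2 contains labels that df1 lacks, instead of filling them with NaN. A hedged alternative, sketched here with the question's test 2 data, is to select only the labels both frames share via `Index.intersection`; the variable name `common` is introduced for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A2', 'A3', 'A7'],
                    'B': ['B0', 'B2', 'B3', 'B7'],
                    'C': ['C0', 'C2', 'C3', 'C7'],
                    'D': ['D0', 'D2', 'D3', 'D7']},
                   index=[0, 2, 3, 7])
df2 = pd.DataFrame({'A': ['A1'],
                    'B': ['B1'],
                    'C': ['C1'],
                    'D': ['D1']},
                   index=[1])

# Labels present in both frames (empty here: df1 has no row labelled 1).
common = df1.index.intersection(df2.index)

# Selecting by the shared labels never asks .loc for a missing label.
filtered = df1.loc[common]

# Merging the (empty) filtered frame with df2 yields an empty result.
result = filtered.merge(df2)
print(result)
```

This assumes both indices are unique; with duplicate labels the selection can return more rows than expected.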

UPDATE

On your new dataset, merge resets the index, which is its default behaviour:

In[152]:
filtered.merge(df2)

Out[152]: 
    A   B   C   D
0  A0  B0  C0  D0
1  A2  B2  C2  D2
2  A7  B7  C7  D7
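One way to keep the original labels through a merge, offered as a sketch rather than as the answer's own method, is to turn the index into an ordinary column first so that it becomes part of the join key. This assumes the index is unnamed (so `reset_index` creates a column called `index`) and that neither frame already has a column with that name.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A2', 'A3', 'A7'],
                    'B': ['B0', 'B2', 'B3', 'B7'],
                    'C': ['C0', 'C2', 'C3', 'C7'],
                    'D': ['D0', 'D2', 'D3', 'D7']},
                   index=[0, 2, 3, 7])
df2 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A7'],
                    'B': ['B0', 'B1', 'B2', 'B7'],
                    'C': ['C0', 'C1', 'C2', 'C7'],
                    'D': ['D0', 'D1', 'D2', 'D7']},
                   index=[0, 1, 2, 7])

merged = (df1.reset_index()            # index becomes a column named 'index'
             .merge(df2.reset_index()) # inner join on 'index', A, B, C and D
             .set_index('index')       # restore the labels as the index
             .rename_axis(None))       # drop the leftover index name

print(merged)
```

Because the label is now one of the join columns, a row is kept only if both its index and all its values agree in the two frames.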

To retain the indices, build a boolean mask with the equality operator and then call dropna: cells where the values don't agree become NaN, so any row containing a NaN is dropped. This should handle all cases:

In[153]:
filtered[filtered == df2.loc[filtered.index]].dropna()

Out[153]: 
    A   B   C   D
0  A0  B0  C0  D0
2  A2  B2  C2  D2
7  A7  B7  C7  D7
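The mask idea can be wrapped in a small helper that also copes with indices that do not overlap. This is a sketch under stated assumptions: `matching_rows` is a hypothetical name, both indices are assumed unique, and `right` is assumed to contain every column of `left`. It uses a row-wise `all` instead of `dropna`, which is a swapped-in variation on the mask above.

```python
import pandas as pd


def matching_rows(left, right):
    """Rows of `left` whose label exists in `right` with identical values."""
    # Restrict both frames to the shared labels, in the same order.
    common = left.index.intersection(right.index)
    a = left.loc[common]
    b = right.loc[common, left.columns]
    # True for rows where every column agrees; astype(bool) keeps the
    # mask boolean even when there are no rows at all.
    mask = a.eq(b).all(axis=1).astype(bool)
    return a.loc[mask]


df1 = pd.DataFrame({'A': ['A0', 'A2', 'A3', 'A7'],
                    'B': ['B0', 'B2', 'B3', 'B7'],
                    'C': ['C0', 'C2', 'C3', 'C7'],
                    'D': ['D0', 'D2', 'D3', 'D7']},
                   index=[0, 2, 3, 7])
df2_test1 = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A7'],
                          'B': ['B0', 'B1', 'B2', 'B7'],
                          'C': ['C0', 'C1', 'C2', 'C7'],
                          'D': ['D0', 'D1', 'D2', 'D7']},
                         index=[0, 1, 2, 7])
df2_test2 = pd.DataFrame({'A': ['A1'], 'B': ['B1'],
                          'C': ['C1'], 'D': ['D1']},
                         index=[1])

res1 = matching_rows(df1, df2_test1)
res2 = matching_rows(df1, df2_test2)
print(res1)
print(res2)
```

As with the dropna variant, a row holding NaN in the same cell of both frames is treated as a mismatch, because NaN does not compare equal to itself.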

edited Aug 24, 2017 at 10:31

answered Aug 24, 2017 at 8:50

EdChum


13 Comments

gabboshow Over a year ago

I've edited the question with a different test case... not sure the solution that you gave works for this case..

cs95 Over a year ago

It gives the right rows but doesn't preserve the indices, it seems.

EdChum Over a year ago

@cᴏʟᴅsᴘᴇᴇᴅ indices are ignored when merging on columns, semantically I think merge is the correct approach, I'll update to show how to get this to work as posted

gabboshow Over a year ago

@EdChum the problem with this solution is that if there are no common indices the merge will give a warning... any idea how to solve it? I added the test case in the question

gabboshow Over a year ago

actually it gives an error KeyError: "None of [Int64Index([1], dtype='int64')] are in the [index]"


P.Tillmann · Accepted Answer · 2017-08-24 08:51:32Z

1

If you are sure that the values are the same you can do:

df1.loc[df1.index.to_series().isin(df2.index)]

There's no need to do a merge.
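The "if you are sure" condition matters, as the following sketch tries to show. It uses `Index.isin` directly, which should be equivalent to the `to_series()` form above, and a hypothetical `df2_changed` frame (not from the question) in which row 2 differs from df1 in column D.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A2', 'A3', 'A7'],
                    'B': ['B0', 'B2', 'B3', 'B7'],
                    'C': ['C0', 'C2', 'C3', 'C7'],
                    'D': ['D0', 'D2', 'D3', 'D7']},
                   index=[0, 2, 3, 7])

# Hypothetical variant of test 1: row 2 has 'DX' instead of 'D2'.
df2_changed = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A7'],
                            'B': ['B0', 'B1', 'B2', 'B7'],
                            'C': ['C0', 'C1', 'C2', 'C7'],
                            'D': ['D0', 'D1', 'DX', 'D7']},
                           index=[0, 1, 2, 7])

# Keep the rows of df1 whose label also appears in df2_changed.
by_index = df1.loc[df1.index.isin(df2_changed.index)]
print(by_index)
```

Row 2 is still returned even though its D value differs between the frames, because only the labels are compared.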

answered Aug 24, 2017 at 8:51

P.Tillmann


2 Comments

EdChum Over a year ago

Semantically this is just matching on indices and not column or column values which is not what the OP's question is about

P.Tillmann Over a year ago

Well, that's exactly what he asked. He doesn't want to merge data from two dataframes, he just wants to filter based on the index.
