Using Merge on a column and Index in Pandas

Question

I have two separate dataframes that share a project number. In type_df, the project number is the index. In time_df, the project number is a column. I would like to count the number of rows in time_df that have a Project Type of 2. I am trying to do this with pandas.merge(). It works great when both keys are columns, but not when one of them is an index. I'm not sure how to reference the index, or whether merge is even the right way to do this.

import pandas as pd
type_df = pd.DataFrame(data = [['Type 1'], ['Type 2']], 
                       columns=['Project Type'], 
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data = [['Project1', 13], ['Project1', 12], 
                               ['Project2', 41]], 
                       columns=['Project', 'Time'])
merged = pd.merge(time_df,type_df, on=[index,'Project'])
print(merged[merged['Project Type'] == 'Type 2']['Project Type'].count())

Error:

NameError: name 'index' is not defined

Desired Output:

maxymoo · Accepted Answer · 2018-04-18 07:34:39Z

112

To merge on an index, pass left_index=True or right_index=True for the frame whose index is the key, and name the key column of the other frame with right_on or left_on. In your case it should look something like this:

merged = pd.merge(type_df, time_df, left_index=True, right_on='Project')
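As a minimal sketch of how this plays out with the frames from the question (the name type2_count is only illustrative), the merge below matches the index of type_df against the Project column of time_df and then counts the Type 2 rows:

```python
import pandas as pd

# Frames copied from the question.
type_df = pd.DataFrame(data=[['Type 1'], ['Type 2']],
                       columns=['Project Type'],
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data=[['Project1', 13], ['Project1', 12],
                             ['Project2', 41]],
                       columns=['Project', 'Time'])

# Left key: the index of type_df. Right key: the 'Project' column of time_df.
merged = pd.merge(type_df, time_df, left_index=True, right_on='Project')

# Count the merged rows whose project is of type 'Type 2'.
type2_count = int((merged['Project Type'] == 'Type 2').sum())
print(type2_count)  # 2
```

Summing the boolean mask counts the matching rows directly, without selecting a column first.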

edited Apr 18, 2018 at 7:34

answered Jul 21, 2015 at 1:43

maxymoo

jezrael · Answer · 2021-01-05 09:05:45Z

17

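Another solution is to use DataFrame.join. Its on parameter names a column in the calling frame, which is matched against the index of the frame passed in, so time_df must be the caller:

df3 = time_df.join(type_df, on='Project')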
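A hedged sketch with the frames from the question: in DataFrame.join, on names a column of the calling frame, and that column is matched against the index of the frame passed as the argument. Here time_df is therefore the caller and type_df the argument. The name joined is illustrative.

```python
import pandas as pd

# Frames copied from the question.
type_df = pd.DataFrame(data=[['Type 1'], ['Type 2']],
                       columns=['Project Type'],
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data=[['Project1', 13], ['Project1', 12],
                             ['Project2', 41]],
                       columns=['Project', 'Time'])

# 'Project' is a column of time_df (the caller); it is matched against
# the index of type_df. join performs a left join by default.
joined = time_df.join(type_df, on='Project')
print(joined)

# Count the rows whose project is of type 'Type 2'.
type2_count = int((joined['Project Type'] == 'Type 2').sum())
print(type2_count)  # 2
```

Because join defaults to how='left', every row of time_df is kept even if its project is missing from type_df; pass how='inner' to drop such rows.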

As of pandas 0.23.0, the on, left_on, and right_on parameters of merge may refer to either column names or index level names:

import pandas as pd

left_index = pd.Index(['K0', 'K0', 'K1', 'K2'], name='key1')
left = pd.DataFrame({'A': ['A0', 'A1', 'A2', 'A3'],
                     'B': ['B0', 'B1', 'B2', 'B3'],
                     'key2': ['K0', 'K1', 'K0', 'K1']},
                    index=left_index)

right_index = pd.Index(['K0', 'K1', 'K2', 'K2'], name='key1')
right = pd.DataFrame({'C': ['C0', 'C1', 'C2', 'C3'],
                      'D': ['D0', 'D1', 'D2', 'D3'],
                      'key2': ['K0', 'K0', 'K0', 'K1']},
                     index=right_index)

print(left)
#        A   B key2
# key1
# K0    A0  B0   K0
# K0    A1  B1   K1
# K1    A2  B2   K0
# K2    A3  B3   K1

print(right)
#        C   D key2
# key1
# K0    C0  D0   K0
# K1    C1  D1   K0
# K2    C2  D2   K0
# K2    C3  D3   K1

# 'key1' is an index level and 'key2' is a column in both frames.
df = left.merge(right, on=['key1', 'key2'])
print(df)
#        A   B key2   C   D
# key1
# K0    A0  B0   K0  C0  D0
# K1    A2  B2   K0  C1  D1
# K2    A3  B3   K1  C3  D3
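The same feature can be applied to the frames from the question. This is a sketch that assumes pandas 0.23.0 or later: once the index of type_df is given the name Project (done here with rename_axis, which returns a new frame), on='Project' can resolve to the column in time_df and to the index level in type_df. The name named_type_df is illustrative.

```python
import pandas as pd

# Frames copied from the question.
type_df = pd.DataFrame(data=[['Type 1'], ['Type 2']],
                       columns=['Project Type'],
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data=[['Project1', 13], ['Project1', 12],
                             ['Project2', 41]],
                       columns=['Project', 'Time'])

# Name the index so that merge can refer to it as an index level.
named_type_df = type_df.rename_axis('Project')

# 'Project' is a column in time_df and an index level in named_type_df.
merged = pd.merge(time_df, named_type_df, on='Project')

# Count the merged rows whose project is of type 'Type 2'.
type2_count = int((merged['Project Type'] == 'Type 2').sum())
print(type2_count)
```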

edited Jan 5, 2021 at 9:05

Gonçalo Peres

answered Aug 10, 2017 at 15:49

jezrael

2 Comments

Murtaza Haji Over a year ago

Can I pass numeric index of column instead of column name? I have duplicate column names and this one fails because of that.

MrR Over a year ago

Confusing. Current version of join does not have left_on and right_on.

dermen · Answer · 2015-07-21 01:46:15Z

2

You must have the same column in each dataframe to merge on.

In this case, just make a 'Project' column for type_df, then merge on that:

type_df['Project'] = type_df.index.values
merged = pd.merge(time_df,type_df, on='Project', how='inner')
merged
#    Project  Time Project Type
#0  Project1    13       Type 2
#1  Project1    12       Type 2
#2  Project2    41       Type 1

print(merged[merged['Project Type'] == 'Type 2']['Project Type'].count())
2
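A variant of the same idea that leaves type_df unmodified, offered as a sketch: rename_axis names the index and reset_index turns it into an ordinary Project column on a new frame. The name type_with_col is illustrative.

```python
import pandas as pd

# Frames copied from the question.
type_df = pd.DataFrame(data=[['Type 1'], ['Type 2']],
                       columns=['Project Type'],
                       index=['Project2', 'Project1'])
time_df = pd.DataFrame(data=[['Project1', 13], ['Project1', 12],
                             ['Project2', 41]],
                       columns=['Project', 'Time'])

# Build a new frame with the index exposed as a 'Project' column;
# type_df itself is not changed.
type_with_col = type_df.rename_axis('Project').reset_index()

# Both frames now have a 'Project' column to merge on.
merged = pd.merge(time_df, type_with_col, on='Project', how='inner')

# Count the merged rows whose project is of type 'Type 2'.
type2_count = int((merged['Project Type'] == 'Type 2').sum())
print(type2_count)  # 2
```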

answered Jul 21, 2015 at 1:46

dermen

1 Comment

tvo 2 days ago

Merging is possible on index also, as shown in the accepted answer.
