Pandas: vertical look up with two dataframes

Question

I have a dataframe df1 of coordinate values like this:

    lat         lon         E               N
0   48.010258   -6.156909   90089.518711    -201738.245555
1   48.021648   -6.105887   93961.324059    -200676.766517
2   48.033028   -6.054801   97836.321204    -199614.270439
... ...         ...         ...             ...

and another dataframe df2 that associates a climatic value to each (lat, lon) pair:

    lat         lon        val
0   48.010258   -6.156909  17.11
1   48.021648   -6.105887  22.23
2   48.033028   -6.054801  39.86
... ...         ...        ...

I want to create a new column, df1['corr_pos'], where each row is given the index of df2 corresponding to the (lat, lon) pair in df1. It is like using VLOOKUP in Excel, but using two values to identify the correct index instead of using only one. The two values are the coordinate pair.

The output would be:

    lat         lon         E               N               corr_pos
0   48.010258   -6.156909   90089.518711    -201738.245555  0
1   48.021648   -6.105887   93961.324059    -200676.766517  3
2   48.033028   -6.054801   97836.321204    -199614.270439  8
... ...         ...         ...             ...             ...

The dataframes df1 and df2 do not have the same order. How could I implement this in pandas?

FaCoffee · Accepted Answer · 2017-02-01 12:46:17Z

I think you need merge combined with reset_index, which turns the index of df2 into a regular column that can then be renamed to corr_pos:

# df2 with a non-default index, to show that corr_pos takes df2's index labels
print(df2)
          lat       lon    val
7   48.010258 -6.156909  17.11
10  48.021648 -6.105887  22.23
12  48.033028 -6.054801  39.86
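# Move df2's index into a column, drop val, rename the column to corr_pos,
# then left-join on both coordinate columns
df = pd.merge(df1,
              df2.reset_index().drop('val', axis=1).rename(columns={'index': 'corr_pos'}),
              on=['lat', 'lon'],
              how='left')
print(df)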
         lat       lon             E              N  corr_pos
0  48.010258 -6.156909  90089.518711 -201738.245555         7
1  48.021648 -6.105887  93961.324059 -200676.766517        10
2  48.033028 -6.054801  97836.321204 -199614.270439        12
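
For anyone who wants to reproduce this end to end, here is a minimal, self-contained sketch. The sample values are copied from the question, and the index labels 7, 10, 12 for df2 follow the printout above. The shuffled row order of df2 and the names lookup and result are my own choices for illustration only.

```python
import pandas as pd

# Sample rows copied from the question
df1 = pd.DataFrame({
    "lat": [48.010258, 48.021648, 48.033028],
    "lon": [-6.156909, -6.105887, -6.054801],
    "E": [90089.518711, 93961.324059, 97836.321204],
    "N": [-201738.245555, -200676.766517, -199614.270439],
})

# Same coordinates in a different row order, with a non-default index
df2 = pd.DataFrame(
    {
        "lat": [48.033028, 48.010258, 48.021648],
        "lon": [-6.054801, -6.156909, -6.105887],
        "val": [39.86, 17.11, 22.23],
    },
    index=[12, 7, 10],
)

# reset_index() moves the unnamed index into a column called "index"
lookup = df2.reset_index().rename(columns={"index": "corr_pos"})[["lat", "lon", "corr_pos"]]

# Left join on both coordinate columns keeps every row of df1, in df1's order
result = df1.merge(lookup, on=["lat", "lon"], how="left")
print(result)
```

Note that the match is on exact equality of both lat and lon, so this only works if the coordinates in the two frames are stored as identical values.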

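If df2 has many columns, it is better to select only the columns you need (lat, lon and the index) than to drop the others one by one; any column left in would be carried into the merged result: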

df = pd.merge(df1,
              df2.reset_index()[['lat', 'lon', 'index']].rename(columns={'index': 'corr_pos'}),
              on=['lat', 'lon'],
              how='left')
print(df)
         lat       lon             E              N  corr_pos
0  48.010258 -6.156909  90089.518711 -201738.245555         7
1  48.021648 -6.105887  93961.324059 -200676.766517        10
2  48.033028 -6.054801  97836.321204 -199614.270439        12
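
One thing to keep in mind with how='left': a row of df1 whose (lat, lon) pair has no exact match in df2 gets NaN in corr_pos, which turns the whole column into floats. Below is a rough sketch of how such rows could be spotted, using merge's indicator option and the nullable Int64 dtype. The third coordinate pair (50.0, -1.0) is made up so that there is an unmatched row, and the names lookup, merged and unmatched are only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "lat": [48.010258, 48.021648, 50.0],   # last row is invented, it has no partner in df2
    "lon": [-6.156909, -6.105887, -1.0],
})
df2 = pd.DataFrame(
    {
        "lat": [48.021648, 48.010258],
        "lon": [-6.105887, -6.156909],
        "val": [22.23, 17.11],
    },
    index=[10, 7],
)

lookup = df2.reset_index()[["lat", "lon", "index"]].rename(columns={"index": "corr_pos"})

# indicator=True adds a "_merge" column saying where each row came from
merged = pd.merge(df1, lookup, on=["lat", "lon"], how="left", indicator=True)

# Rows of df1 that found no matching coordinate pair in df2
unmatched = merged[merged["_merge"] == "left_only"]
print(unmatched)

# Keep integer index labels while still allowing missing values
merged["corr_pos"] = merged["corr_pos"].astype("Int64")
print(merged)
```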

edited Feb 1, 2017 at 12:46

FaCoffee

answered Feb 1, 2017 at 12:06

jezrael

6 Comments

FaCoffee Over a year ago

What if I want to add the corr_pos column to df1, instead of creating a new dataframe? Of course I don't want to lose the correspondences.

jezrael Over a year ago

Hmm, the problem is that you need to join on two columns, so a map solution cannot be used. I don't know whether it is possible without merge, maybe with join, but all of these methods return a new dataframe.

FaCoffee Over a year ago

But I can create a new df and then use map to add the new column to df1.

jezrael Over a year ago

Yes, but map needs a single column whose values are mapped to other values. If you check this solution, there is only one column in the on parameter, so map can be used. But if there is more than one join column, on=['val1','val2',...], then a map solution is impossible.

FaCoffee Over a year ago

On second thoughts this doesn't work. It creates a corr_pos_x and corr_pos_y that I do not want.
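
A note on the last two points, as a sketch only, since the comments do not show the code that was run. pandas appends the _x and _y suffixes when both frames contain a non-key column with the same name, so one plausible cause (an assumption on my part) is that df1 already had a corr_pos column from an earlier attempt. Dropping that column before merging avoids the suffixes, and the merged column can then be written back into the existing df1. The stale corr_pos values of -1 and the names lookup and merged are hypothetical.

```python
import pandas as pd

df1 = pd.DataFrame({
    "lat": [48.010258, 48.021648, 48.033028],
    "lon": [-6.156909, -6.105887, -6.054801],
    "corr_pos": [-1, -1, -1],  # hypothetical leftover column from an earlier attempt
})
df2 = pd.DataFrame(
    {
        "lat": [48.033028, 48.010258, 48.021648],
        "lon": [-6.054801, -6.156909, -6.105887],
        "val": [39.86, 17.11, 22.23],
    },
    index=[12, 7, 10],
)

lookup = df2.reset_index()[["lat", "lon", "index"]].rename(columns={"index": "corr_pos"})

# Drop any existing corr_pos first so merge has no name clash to resolve.
# validate="many_to_one" raises if a (lat, lon) pair occurs more than once in
# df2, which guarantees the result has exactly one row per row of df1.
merged = pd.merge(
    df1.drop(columns="corr_pos", errors="ignore"),
    lookup,
    on=["lat", "lon"],
    how="left",
    validate="many_to_one",
)

# A left merge keeps df1's row order, so the values can be assigned by position
df1["corr_pos"] = merged["corr_pos"].to_numpy()
print(df1)
```

The merge itself still builds a temporary dataframe; only the final assignment modifies df1.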
