Update dataframe column based on another dataframe column without for loop

Question

I have two dataframes df1 and df2.

df1:

id val
1  25
2  40
3  78

df2:

id val
2  8
1  5

I want to compute df1['val'] = df1['val'] / df2['val'] for rows with a matching id. df2 only covers a subset of the ids in df1, and rows of df1 that have no match in df2 should keep their value unchanged. Right now I do this by iterating over the rows of df2:

# iterrows() yields (index, row) pairs; only the row is needed here
for _, row in df2.iterrows():
    df1.loc[df1['id'] == row['id'], 'val'] /= row['val']

df1 after the loop:

id val
1  5
2  5
3  78

How can I achieve the same result without a for loop, to improve speed?

jezrael · Accepted Answer · 2021-09-14 10:35:18Z


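Use Series.map to look up the divisor for each id, then Series.div with fill_value=1 so that ids missing from df2 are left unchanged: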

df1['val'] = df1['val'].div(df1['id'].map(df2.set_index('id')['val']), fill_value=1)
print(df1)
   id   val
0   1   5.0
1   2   5.0
2   3  78.0
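To see why fill_value=1 keeps unmatched rows intact, it can help to split the one-liner into steps. This is only a sketch built on the sample data from the question; the names divisors_by_id and divisors are introduced here for illustration, and it assumes the id values in df2 are unique (the lookup Series needs a unique index).

```python
import pandas as pd

# Sample data from the question
df1 = pd.DataFrame({"id": [1, 2, 3], "val": [25, 40, 78]})
df2 = pd.DataFrame({"id": [2, 1], "val": [8, 5]})

# Lookup Series: index is id, values are the divisors from df2
# (assumption: ids in df2 are unique)
divisors_by_id = df2.set_index("id")["val"]

# Map every id in df1 to its divisor; ids missing from df2 become NaN
divisors = df1["id"].map(divisors_by_id)

# fill_value=1 substitutes 1 for the NaN divisor before dividing,
# so the row with id 3 keeps its original value
df1["val"] = df1["val"].div(divisors, fill_value=1)
print(df1)
```

The mapped Series shares df1's index, so the division aligns row by row regardless of the order of rows in df2.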

Alternative solution using merge with a left join:

df1['val'] = df1['val'].div(df1.merge(df2, on='id', how='left')['val_y'], fill_value=1)
print(df1)
   id   val
0   1   5.0
1   2   5.0
2   3  78.0
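One thing to keep in mind with the merge version: merging on a column returns a frame with a fresh 0..n-1 index, so the division lines up with df1 only when df1 also has the default index. If df1 carries some other index, a possible variant is to divide by the raw values instead. This is a sketch, not part of the original answer; the index labels 10, 20, 30 and the '_div' suffix are made up for illustration, and it assumes ids in df2 are unique so the left join keeps one row per row of df1.

```python
import pandas as pd

# Same sample data, but df1 has a non-default index (hypothetical labels)
df1 = pd.DataFrame({"id": [1, 2, 3], "val": [25, 40, 78]}, index=[10, 20, 30])
df2 = pd.DataFrame({"id": [2, 1], "val": [8, 5]})

# Left join keeps the row order of df1; unmatched ids get NaN in val_div
merged = df1.merge(df2, on="id", how="left", suffixes=("", "_div"))

# Replace missing divisors with 1 and drop the index so the division
# is positional rather than label-aligned
divisors = merged["val_div"].fillna(1).to_numpy()

df1["val"] = df1["val"] / divisors
print(df1)
```

With the default index from the question both forms give the same result; the positional form just avoids depending on it.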

edited Sep 14, 2021 at 10:35

answered Sep 14, 2021 at 10:25

