0

I have a master dataframe with two sets of values:

import pandas as pd

df1 = pd.DataFrame({'id1': [1, 1, 2, 2],
                    'dir1': [True, False, True, False],
                    'value1': [55, 40, 84, 31],
                    'id2': [3, 3, 4, 4],
                    'dir2': [True, False, False, True],
                    'value2': [60, 30, 7, 15]})

   id1   dir1  value1  id2   dir2  value2
0    1   True      55    3   True      60
1    1  False      40    3  False      30
2    2   True      84    4  False       7
3    2  False      31    4   True      15

I then have an update dataframe that looks like this:

df2 = pd.DataFrame({'id': [1, 2, 3, 4],
                    'value': [21, 22, 23, 24]})

   id  value
0   1     21
1   2     22
2   3     23
3   4     24

I want to update df1 with the new values from df2, matched on id, but only where dirX is True. The data should then look like this (changed values are marked with *):

   id1   dir1  value1  id2   dir2  value2
0    1   True     *21    3   True     *23
1    1  False      40    3  False      30
2    2   True     *22    4  False       7
3    2  False      31    4   True     *24

Any idea if something like this is even possible? I tried looking at .update, but I could not get it to work. I'm fairly new to Python and coding at 23:00, so maybe I'm just not as sharp as I need to be.

2
  • Hi, why not loop through df1 and change the values where dirX is True? Commented Jan 20, 2021 at 20:57
  • I love loops, but what I've learned about pandas is, loops are your VERY last resort. Always way slow and way inefficient. Commented Jan 21, 2021 at 5:16

2 Answers

1

I agree with Thales' answer. First, you merge df2 with df1 based on id1:

df = df1.merge(df2, left_on='id1', right_on='id')

Then, you replace value1 based on dir1 with value:

df['value1'] = np.where(df['dir1'], df['value'], df['value1'])

Then, you drop the extra columns:

df = df.drop(columns=['id', 'value'])

Then, you merge df2 with df1 based on id2:

df = df.merge(df2, left_on='id2', right_on='id')

Apply the same replacement to value2:

df['value2'] = np.where(df['dir2'], df['value'], df['value2'])

Then, drop the extra columns:

df = df.drop(columns=['id', 'value'])

The resulting dataframe will look like:

   id1   dir1  value1  id2   dir2  value2
0    1   True      21    3   True      23
1    1  False      40    3  False      30
2    2   True      22    4  False       7
3    2  False      31    4   True      24
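
The steps above repeat for each id/dir/value triple, so they can be folded into a small helper. This is only a sketch under a few assumptions: the helper name update_where_true is made up for illustration, the columns are assumed to follow the idN/dirN/valueN naming from the question, and how='left' plus validate='many_to_one' are additions that are not in the answer (they keep every row of df1 and raise an error if df2 contains duplicate ids).

```python
import numpy as np
import pandas as pd


def update_where_true(df, updates, suffix):
    """Overwrite value<suffix> with updates['value'] where dir<suffix> is True.

    Hypothetical helper: assumes columns named id<suffix>, dir<suffix>,
    value<suffix> in df and 'id', 'value' in updates.
    """
    merged = df.merge(updates, how='left',
                      left_on=f'id{suffix}', right_on='id',
                      validate='many_to_one')  # raises if updates has duplicate ids
    # Take the new value only where the flag is set and the id was found.
    use_new = merged[f'dir{suffix}'] & merged['value'].notna()
    merged[f'value{suffix}'] = np.where(use_new, merged['value'],
                                        merged[f'value{suffix}'])
    # Remove the helper columns brought in by the merge.
    return merged.drop(columns=['id', 'value'])


df1 = pd.DataFrame({'id1': [1, 1, 2, 2],
                    'dir1': [True, False, True, False],
                    'value1': [55, 40, 84, 31],
                    'id2': [3, 3, 4, 4],
                    'dir2': [True, False, False, True],
                    'value2': [60, 30, 7, 15]})
df2 = pd.DataFrame({'id': [1, 2, 3, 4],
                    'value': [21, 22, 23, 24]})

result = df1
for suffix in ('1', '2'):
    result = update_where_true(result, df2, suffix)

print(result)
```

Note that if some id had no match in df2, the merged value column would contain NaN and the updated column would likely come back as floats rather than integers.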

1 Comment

Like a charm! Thank you so much!
0

Try using the np.where function from NumPy.

Maybe something like this:

df1['value1'] = np.where(df1['dir1'], df2['value'], df1['value1'])

As written, df2['value'] is matched to df1 by row position rather than by id, so you'll need some adjustments or a merge first, but I think this will help you find a solution.
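
One possible adjustment, sketched here as an assumption rather than as part of the original answer, is to look up each id in df2 with Series.map before calling np.where, so the new values line up with df1 by id instead of by row position. The names lookup and aligned are made up for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'id1': [1, 1, 2, 2],
                    'dir1': [True, False, True, False],
                    'value1': [55, 40, 84, 31],
                    'id2': [3, 3, 4, 4],
                    'dir2': [True, False, False, True],
                    'value2': [60, 30, 7, 15]})
df2 = pd.DataFrame({'id': [1, 2, 3, 4],
                    'value': [21, 22, 23, 24]})

# Series indexed by id, so it can be used as a lookup table.
lookup = df2.set_index('id')['value']

for n in ('1', '2'):
    # New value for every row of df1, aligned by id (NaN if the id is missing).
    aligned = df1[f'id{n}'].map(lookup)
    # Keep the old value unless the matching dir flag is True.
    df1[f'value{n}'] = np.where(df1[f'dir{n}'], aligned, df1[f'value{n}'])

print(df1)
```

This version assumes every id in df1 also appears in df2. If one is missing and its dir flag is True, the sketch would write NaN into that cell, so a check such as aligned.notna() would be needed in that case.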
