0

Consider I have a data frame :

>>> data
   c0  c1  c2   _c1   _c2
0   0   1   2  18.0  19.0
1   3   4   5   NaN   NaN
2   6   7   8  20.0  21.0
3   9  10  11   NaN   NaN
4  12  13  14   NaN   NaN
5  15  16  17   NaN   NaN

I want to update the values in the c1 and c2 columns with the values in the _c1 and _c2 columns whenever those latter values are not NaN. Why won't the following work, and what is the correct way to do this?

>>> data.loc[~(data._c1.isna()),['c1','c2']]=data.loc[~(data._c1.isna()),['_c1','_c2']]
>>> data
   c0    c1    c2   _c1   _c2
0   0   NaN   NaN  18.0  19.0
1   3   4.0   5.0   NaN   NaN
2   6   NaN   NaN  20.0  21.0
3   9  10.0  11.0   NaN   NaN
4  12  13.0  14.0   NaN   NaN
5  15  16.0  17.0   NaN   NaN

For completeness's sake I want the result to look like

>>> data.loc[~(data._c1.isna()),['c1','c2']]=data.loc[~(data._c1.isna()),['_c1','_c2']]
>>> data
   c0    c1    c2   _c1   _c2
0   0  18.0  19.0. 18.0  19.0
1   3   4.0   5.0   NaN   NaN
2   6  20.0  21.0  20.0  21.0
3   9  10.0  11.0   NaN   NaN
4  12  13.0  14.0   NaN   NaN
5  15  16.0  17.0   NaN   NaN
2
  • 1
    Prefer use data._c1.notna() rather than ~(data._c1.isna()) for readability. Commented Nov 19, 2021 at 22:26
  • Thanks for pointing that out @Corralien I'll make sure to change that in my code base. Commented Nov 20, 2021 at 20:15

2 Answers 2

1

I recommend update after rename

df.update(df[['_c1','_c2']].rename(columns={'_c1':'c1','_c2':'c2'}))
df
Out[266]: 
   c0    c1    c2   _c1   _c2
0   0  18.0  19.0  18.0  19.0
1   3   4.0   5.0   NaN   NaN
2   6  20.0  21.0  20.0  21.0
3   9  10.0  11.0   NaN   NaN
4  12  13.0  14.0   NaN   NaN
5  15  16.0  17.0   NaN   NaN
Sign up to request clarification or add additional context in comments.

1 Comment

Thanks I didn't even know that df.update was a thing. I'll use that a lot more now I think. Do you by any chance know WHY the above doesn't work like I thought it would?
1

You can use np.where:

df[['c1', 'c2']] = np.where(df[['_c1', '_c2']].notna(),
                            df[['_c1', '_c2']],
                            df[['c1', 'c2']])
print(df)

# Output:
   c0    c1    c2   _c1   _c2
0   0  18.0  19.0  18.0  19.0
1   3   4.0   5.0   NaN   NaN
2   6  20.0  21.0  20.0  21.0
3   9  10.0  11.0   NaN   NaN
4  12  13.0  14.0   NaN   NaN
5  15  16.0  17.0   NaN   NaN

Update

Do you know by any chance WHY the above doesn't work the way I thought it would?

Your column names from left and right side of your expression are different so Pandas can't use this values even if the shape is the same.

# Left side of your expression
>>> data.loc[~(data._c1.isna()),['c1','c2']]
     c1    c2  # <- note the column names
0  18.0  19.0
2  20.0  21.0

# Right side of your expression
>>> data.loc[~(data._c1.isna()),['_c1','_c2']]
    _c1   _c2  # <- Your column names are difference from left side
0  18.0  19.0
2  20.0  21.0

How to solve it? Simply use .values on the right side. As your right side is not row/column indexed, Pandas use the shape to set the values.

data.loc[~(data._c1.isna()),['c1','c2']] = \
    data.loc[~(data._c1.isna()),['_c1','_c2']].values
print(data)

# Output:
   c0    c1    c2   _c1   _c2
0   0  18.0  19.0  18.0  19.0
1   3   4.0   5.0   NaN   NaN
2   6  20.0  21.0  20.0  21.0
3   9  10.0  11.0   NaN   NaN
4  12  13.0  14.0   NaN   NaN
5  15  16.0  17.0   NaN   NaN

8 Comments

Thanks so much! This is also a good and efficient answer but BENY got their answer in a little bit quicker than you. I'm using this elsewhere in my codebase because it's fast and effective and I'm kind of embarrassed I didn't think of it since I do know how to use np.where. Do you know by any chance WHY the above doesn't work the way I thought it would?
No word on WHY the above doesn't work though? :D
You updated your comment, no? What's wrong. Do you have an error?
No I just don't understand why my original solution doesn't work. Your solution is definitely correct, I just don't see why my original solution isn't.
Do you miss my updated answer?
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.