I have a dataframe as below:

import numpy as np
import pandas as pd

df = pd.DataFrame({'ORDER': ["A", "A", "A", "B", "B", "B"], 'New1': [2, 1, 3, 4, np.nan, np.nan], 'New2': [np.nan, np.nan, np.nan, np.nan, 5, np.nan]})
df

    ORDER   New1    New2
0   A       2.0     NaN
1   A       1.0     NaN
2   A       3.0     NaN
3   B       4.0     NaN
4   B       NaN     5.0
5   B       NaN     NaN

I want to create a column "New" by merging the columns New1 and New2 so that, if one of the columns is NaN and the other has a value, the value is kept. For example, New for the first row (index 0) will be 2.

My expected output

    ORDER   New 
0   A       2.0 
1   A       1.0 
2   A       3.0 
3   B       4.0 
4   B       5.0 
5   B       NaN

2 Answers

df["New"] = df.loc[:, ["New1", "New2"]].sum(axis=1).replace(0.0, np.nan)
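
One caveat with the sum-and-replace approach: a row whose only real value is a genuine 0 would also be turned into NaN, and a row with values in both columns would get their sum. Below is a minimal sketch of a variant that avoids the first problem, assuming the two columns never both hold a value in the same row. It relies on the `min_count` parameter of `DataFrame.sum`, so no replacement of zeros is needed.

```python
import numpy as np
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "ORDER": ["A", "A", "A", "B", "B", "B"],
    "New1": [2, 1, 3, 4, np.nan, np.nan],
    "New2": [np.nan, np.nan, np.nan, np.nan, 5, np.nan],
})

# min_count=1 requires at least one non-NaN value per row;
# rows where both columns are NaN give NaN instead of 0.
df["New"] = df[["New1", "New2"]].sum(axis=1, min_count=1)
print(df[["ORDER", "New"]])
```

If both columns can be filled in the same row, summing is probably not what you want, and a fill-based method is the safer choice.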

(Note: At the end of this answer is the one-line solution.)

The Series method .combine_first() does what you want:

resulting_column = df.New1.combine_first(df.New2)
resulting_column
0    2.0
1    1.0
2    3.0
3    4.0
4    5.0
5    NaN
Name: New1, dtype: float64

Then rename this series to New (its name, shown in the last line of the output, is still New1) and join it with df[["ORDER"]]:

resulting_column.name = "New"
df_result = df[["ORDER"]].join(resulting_column)
df_result
  ORDER  New
0     A  2.0
1     A  1.0
2     A  3.0
3     B  4.0
4     B  5.0
5     B  NaN

One-line solution:

df_result = df[["ORDER"]].join(df.New1.combine_first(df.New2).rename("New"))
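
For completeness, here is a self-contained sketch of the one-liner with the imports and the question's data included. The second part uses two small made-up Series (`a` and `b`, hypothetical and not from the question) to illustrate how `combine_first` behaves: the calling Series takes priority, and a genuine 0 is kept because only missing values are filled.

```python
import numpy as np
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "ORDER": ["A", "A", "A", "B", "B", "B"],
    "New1": [2, 1, 3, 4, np.nan, np.nan],
    "New2": [np.nan, np.nan, np.nan, np.nan, 5, np.nan],
})

# Fill the gaps in New1 with values from New2, rename, and attach to ORDER.
df_result = df[["ORDER"]].join(
    df["New1"].combine_first(df["New2"]).rename("New")
)
print(df_result)

# Hypothetical extra case (not in the question): overlapping values and a zero.
a = pd.Series([1.0, np.nan, 0.0])
b = pd.Series([9.0, 7.0, 8.0])
combined = a.combine_first(b)  # a wins wherever it is not NaN
print(combined.tolist())  # [1.0, 7.0, 0.0]
```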
