fill new column of pandas DataFrame based on if-else of other columns

Question

I have a situation where I want to create a new column in a Pandas DataFrame and populate it according to conditions involving 2 other columns. In this example:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.array([['value1','value2'],['value',np.nan],[np.nan,np.nan]]), columns=['col1','col2'])

I would like to create a new column, 'new col', which contains 1) the value in 'col2' if it is not NaN; else 2) the value in 'col1' if it is not NaN; else 3) NaN.

I am trying this function with .apply(), but it does not return the desired result:

def singleval(row):
    if row['col2'] != np.nan:
        val = row['col2']
    elif row['col1'] != np.nan:
        val = row['col1']
    else:
        val = np.nan
    return val

df['new col'] = df.apply(singleval, axis=1)

I want the values in 'new col' to be ['value2', 'value', NaN].

Erfan · Accepted Answer · 2019-05-13 23:40:12Z

2

Method 1 `fillna`

In this case, we can simply use fillna on col2 with values from col1:

df['new col'] = df['col2'].fillna(df['col1'])

     col1    col2 new col
0  value1  value2  value2
1   value     NaN   value
2     NaN     NaN     NaN

Method 2 `np.select`

If you have multiple conditions, use np.select: you pass it a list of conditions and a matching list of choices, and each row gets the choice that belongs to its first true condition:

conditions = [
    df['col2'].notnull(),
    df['col1'].notnull(),
]

choices=[df['col2'], df['col1']]

df['new col'] = np.select(conditions, choices, default=np.nan)

     col1    col2 new col
0  value1  value2  value2
1   value     NaN   value
2     NaN     NaN     NaN
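
As a rough sketch of how the same pattern extends to more than two columns: the conditions are checked in list order, so the order of the lists sets the priority. The third column 'col3' and the sample values here are hypothetical, added only for illustration.

```python
import numpy as np
import pandas as pd

# 'col3' and these sample values are hypothetical, used only for this illustration.
df = pd.DataFrame({'col1': ['a1', 'b1', 'c1', np.nan],
                   'col2': ['a2', 'b2', np.nan, np.nan],
                   'col3': ['a3', np.nan, np.nan, np.nan]})

priority = ['col3', 'col2', 'col1']              # highest priority first
conditions = [df[c].notna() for c in priority]   # one boolean mask per column
choices = [df[c] for c in priority]              # the values to take for each mask

# For each row, np.select takes the choice that belongs to the first True condition;
# rows where no condition is True get the default.
df['new col'] = np.select(conditions, choices, default=np.nan)

print(df['new col'].tolist())   # ['a3', 'b2', 'c1', nan]
```

Reordering `priority` is enough to change which column wins when several are filled.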

Note

Your DataFrame wasn't built correctly: np.array() converts every element to a string, so the NaN values end up as the string 'nan'. Use this one instead to test:

df = pd.DataFrame({'col1':['value1', 'value', np.nan],
                   'col2':['value2', np.nan, np.nan]})
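
A small check, as a sketch, of how the two constructions differ. The names `arr`, `df_bad` and `df_good` are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Going through np.array forces one common dtype for all elements (here: string),
# so the missing values are stored as the text 'nan'.
arr = np.array([['value1', 'value2'], ['value', np.nan], [np.nan, np.nan]])
df_bad = pd.DataFrame(arr, columns=['col1', 'col2'])

# Building from a dict of lists keeps the missing values as real NaN.
df_good = pd.DataFrame({'col1': ['value1', 'value', np.nan],
                        'col2': ['value2', np.nan, np.nan]})

print(arr.dtype.kind)                  # U  (a unicode string array)
print(repr(df_bad.loc[1, 'col2']))     # 'nan'  (a string, not a missing value)
print(df_bad['col2'].isna().sum())     # 0
print(df_good['col2'].isna().sum())    # 2
```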

Edit: why was the function not working?

np.nan == np.nan returns False (so np.nan != np.nan, and any other value != np.nan, is always True),
while np.nan is np.nan returns True.

See this question for the explanation of this.
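
The comparison rules can be checked directly. This is only a sketch; the variable `other` is introduced here to show that a NaN created elsewhere is a different object.

```python
import math
import numpy as np
import pandas as pd

nan = np.nan

print(nan == nan)         # False: NaN never compares equal, not even to itself
print(nan != nan)         # True
print('value' != nan)     # True: so a `!= np.nan` test passes for every value
print(nan is nan)         # True: identity, the very same object

other = float('nan')      # a NaN created independently of np.nan
print(other is nan)       # False: a different object, so an identity test misses it
print(pd.isna(other), math.isnan(other))   # True True
```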

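So to fix your function you have to use is not (this relies on the missing values being the np.nan object itself; pd.isna is the more general check):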

def singleval(row):
    if row['col2'] is not np.nan:
        val = row['col2']
    elif row['col1'] is not np.nan:
        val = row['col1']
    else:
        val = np.nan
    return val

df['new col'] = df.apply(singleval, axis=1)

     col1    col2 new col
0  value1  value2  value2
1   value     NaN   value
2     NaN     NaN     NaN
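
The identity test works here only as long as the missing values are the np.nan object itself. A variant of the same function using pd.notna, sketched below, does not depend on that; it is an alternative formulation, not the code from the answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['value1', 'value', np.nan],
                   'col2': ['value2', np.nan, np.nan]})

def singleval(row):
    # pd.notna is True for any value that is not missing (NaN, None, NaT)
    if pd.notna(row['col2']):
        return row['col2']
    if pd.notna(row['col1']):
        return row['col1']
    return np.nan

df['new col'] = df.apply(singleval, axis=1)
print(df['new col'].tolist())   # ['value2', 'value', nan]
```

Row-wise apply is still slower than the vectorized fillna or np.select on large frames, so this is mainly useful when the logic does not vectorize easily.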

edited May 13, 2019 at 23:40

answered May 13, 2019 at 23:20


4 Comments

laszlopanaflex Over a year ago

Struggling to see why your df is different from my df... never mind: looks like it has to do with np.array()

Erfan Over a year ago

Not sure either, would be a good question on SO as well :). @laszlopanaflex

laszlopanaflex Over a year ago

Thank you for the 2 solutions. Is it possible to explain why my original approach didn't work? I'm not able to see where the if-elif-else approach breaks down...

Erfan Over a year ago

Added explanation about your approach @laszlopanaflex, good question btw!

Andy L. · Accepted Answer · 2019-05-14 01:26:13Z

0

Use df.ffill with axis=1, which fills each row from left to right, and then take col2:

df['new_col'] = df.ffill(axis=1)['col2']

Out[1318]:
     col1    col2 new_col
0  value1  value2  value2
1   value     NaN   value
2     NaN     NaN     NaN

answered May 14, 2019 at 1:26


Quang Hoang · Accepted Answer · 2019-05-14 01:34:07Z

0

Try this:

df['col3'] = df[['col1','col2']].stack().groupby(level=0).last()

output:

     col1    col2    col3
0  value1  value2  value2
1   value     NaN   value
2     NaN     NaN     NaN
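
The one-liner above can be unpacked into its steps. This is a sketch with intermediate names (`stacked`, `last_valid`) made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': ['value1', 'value', np.nan],
                   'col2': ['value2', np.nan, np.nan]})

# stack() turns the two columns into one long Series indexed by (row, column).
# Depending on the pandas version, missing values are either dropped at this
# step or kept and then skipped by last() below; the end result is the same.
stacked = df[['col1', 'col2']].stack()

# Group by the original row label (index level 0) and keep the last
# non-missing value of each row, i.e. col2 if present, otherwise col1.
last_valid = stacked.groupby(level=0).last()

# Assignment aligns on the index; a row with no valid value ends up as NaN.
df['col3'] = last_valid
print(df)
```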

edited May 14, 2019 at 1:34

answered May 13, 2019 at 23:18
