I have the following DataFrame with two columns, and I would like to create a new column based on these conditions:

  • if the value of y is -1 take the value of x
  • if the value of x is -1 take the value of y
df = pd.DataFrame({'x': ['day', 'day', 'night', '-1', '-1', '-1'],
                   'y': ['-1', '-1', '-1', 'night', 'day', 'day']})
df

I have tried the following, but I get None and NaN in the output:

def func(x):
    if (x['y'] == -1):
        return x['x']
    elif (x['x'] == -1):
        return x['y']

df = df.assign(z=df.apply(func, axis=1))
df

and

conditions = [
    (df['y'] == -1),
    (df['x'] == -1),
]

choices = [df['x'],df['y']]
df['z1'] = np.select(conditions, choices, default=np.nan)
df

The expected result should look like this:

df = pd.DataFrame({'x': ['day', 'day', 'night', '-1', '-1', '-1'],
                   'y': ['-1', '-1', '-1', 'night', 'day', 'day'],
                   'z':['day','day','night','night','day','day']})
df

1 Answer

Your approach is fine except for one detail: func compares against the integer -1, but the DataFrame holds the string '-1', so neither branch ever matches and the function returns None for every row. Either compare against the string (add quotes around -1):

def func(x):
    if (x['y'] == '-1'):
        return x['x']
    elif (x['x'] == '-1'):
        return x['y']

or use integers in the dataframe:

df = pd.DataFrame({'x': ['day', 'day', 'night', -1, -1, -1],
                   'y': [-1, -1, -1, 'night', 'day', 'day']})

but not both at the same time of course.
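
The np.select attempt from the question fails for the same reason, so the same change should fix it. A minimal sketch, assuming the columns hold strings as in the original DataFrame; default=None is my own choice here, so that rows matching neither condition stay empty:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': ['day', 'day', 'night', '-1', '-1', '-1'],
                   'y': ['-1', '-1', '-1', 'night', 'day', 'day']})

# Compare against the string '-1', because that is what the columns contain.
conditions = [
    df['y'] == '-1',  # y is the placeholder: take x
    df['x'] == '-1',  # x is the placeholder: take y
]
choices = [df['x'], df['y']]

# default=None (an assumption) marks rows where neither condition holds.
df['z1'] = np.select(conditions, choices, default=None)
print(df)
```

With the sample data, z1 should match the expected z column from the question.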

If the data may contain both representations, you can also cast the columns to a single type before comparing, or test membership against both values, e.g. n in [-1, '-1'], if that suits your needs better.
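
Both options can be sketched as follows. The mixed DataFrame and the names pick and missing are hypothetical, made up only to illustrate a column that contains the integer -1 and the string '-1' side by side:

```python
import pandas as pd

# Hypothetical data mixing the integer -1 and the string '-1'.
mixed = pd.DataFrame({'x': ['day', 'day', 'night', -1, '-1', -1],
                      'y': [-1, '-1', -1, 'night', 'day', 'day']})

# Option 1: cast to str first, then compare against '-1' only.
y_missing = mixed['y'].astype(str) == '-1'

# Option 2: accept either representation with a membership test.
def pick(row, missing=(-1, '-1')):
    if row['y'] in missing:
        return row['x']
    if row['x'] in missing:
        return row['y']
    return None  # neither column holds the placeholder

mixed['z'] = mixed.apply(pick, axis=1)
print(mixed)
```

The membership test leaves the stored values untouched, whereas casting changes how they are compared, so which one fits depends on whether the rest of your code expects integers or strings.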
