1

I have this df:

import pandas as pd

df1 = pd.DataFrame({
  'Type': ['red', 'blue', 'red', 'red', 'blue'],
  'V1': ['No', 'No', 'No', 'Yes', 'No'],
  'V2': ['Yes', 'Yes', 'No', 'Yes', 'No'],
  'V3': ['Yes', 'No', 'No', 'Yes', 'No'],
  'V4': ['No', 'No', 'No', 'Yes', 'Yes']
})

And I want a dataframe that looks like this:

    Type    V1    V2    V3    V4   V3_4 
0   red     No    Yes   Yes   No   Yes
1   blue    No    Yes   No    No   No
2   red     No    No    No    No   No
3   red     Yes   Yes   Yes   Yes  Yes
4   blue    No    No    No    Yes  Yes

So basically, the new column V3_4 should be "Yes" whenever V3 or V4 is "Yes", and "No" otherwise.

It looks like I can do this either with ffill or by writing a Python function with some logic. I would be fine with either method and am wondering which is the most elegant.

0

4 Answers

6

Using np.where:

import numpy as np

df1['V3_4'] = np.where(df1.V3.eq('Yes') | df1.V4.eq('Yes'), 'Yes', 'No')

   Type   V1   V2   V3   V4 V3_4
0   red   No  Yes  Yes   No  Yes
1  blue   No  Yes   No   No   No
2   red   No   No   No   No   No
3   red  Yes  Yes  Yes  Yes  Yes
4  blue   No   No   No  Yes  Yes

Thanks to @Anton vBR, this can also be written a bit more concisely:

df1['V3_4'] = np.where(df1[['V3','V4']].eq('Yes').any(axis=1), 'Yes', 'No')
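
If more than two columns ever need to be combined, the same idea can be wrapped in a small function. This is only a sketch: `any_yes` is a hypothetical helper name, not something from pandas or NumPy, and it assumes the columns hold the strings 'Yes' and 'No' as in the question. Passing `axis=1` by keyword is the safer spelling, since recent pandas versions do not accept it positionally.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'Type': ['red', 'blue', 'red', 'red', 'blue'],
    'V1': ['No', 'No', 'No', 'Yes', 'No'],
    'V2': ['Yes', 'Yes', 'No', 'Yes', 'No'],
    'V3': ['Yes', 'No', 'No', 'Yes', 'No'],
    'V4': ['No', 'No', 'No', 'Yes', 'Yes'],
})

def any_yes(frame, columns):
    """Hypothetical helper: 'Yes' where any listed column is 'Yes', else 'No'."""
    # Compare every selected cell with 'Yes', then reduce across each row.
    mask = frame[columns].eq('Yes').any(axis=1)
    return np.where(mask, 'Yes', 'No')

df1['V3_4'] = any_yes(df1, ['V3', 'V4'])
print(df1)
```

The column list is the only thing that changes if you later want, say, a combined column over V1 through V4.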

3 Comments

great! This worked well. I can accept it in 8 minutes :)
This is elegant I think, would probably have written it as np.where((df1[['V3','V4']] == 'Yes').any(1), 'Yes', 'No')
I'll add that to the answer, I like that as well
2

Using np.where

Ex:

import pandas as pd
import numpy as np
df1 = pd.DataFrame({'Type':['red','blue','red','red','blue'], 'V1':['No','No','No','Yes','No'], 'V2':['Yes','Yes','No','Yes','No'], 'V3':['Yes','No','No','Yes','No'], 'V4':['No','No','No','Yes','Yes']})
df1["V3_4"] = np.where(df1["V3"] == "No", df1["V4"], df1["V3"])
print(df1)

Output:

   Type   V1   V2   V3   V4 V3_4
0   red   No  Yes  Yes   No  Yes
1  blue   No  Yes   No   No   No
2   red   No   No   No   No   No
3   red  Yes  Yes  Yes  Yes  Yes
4  blue   No   No   No  Yes  Yes
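
The fallback form above gives the same result as an explicit "V3 or V4" test only because each column holds exactly two labels. Here is a quick check of that, as a sketch that assumes the question's data; the variable names `fallback` and `explicit_or` are just for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({
    'Type': ['red', 'blue', 'red', 'red', 'blue'],
    'V1': ['No', 'No', 'No', 'Yes', 'No'],
    'V2': ['Yes', 'Yes', 'No', 'Yes', 'No'],
    'V3': ['Yes', 'No', 'No', 'Yes', 'No'],
    'V4': ['No', 'No', 'No', 'Yes', 'Yes'],
})

# Take V4 wherever V3 is "No", otherwise keep V3 (the form used in this answer).
fallback = np.where(df1["V3"] == "No", df1["V4"], df1["V3"])

# Explicit OR over the two columns.
explicit_or = np.where(df1["V3"].eq("Yes") | df1["V4"].eq("Yes"), "Yes", "No")

# Compare element by element as plain Python lists.
same = list(fallback) == list(explicit_or)
print(same)
```

If a column could contain a third value, such as a missing entry, the two versions would no longer agree, so the explicit comparison is probably the safer choice in that case.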


1
def build(a,b):
    if a =='Yes':
        return "Yes"
    elif b =='Yes':
        return "Yes"
    else:
        return "No"

# build() takes two values, so pass the row's V3 and V4 separately
df1['V3_4'] = df1[['V3','V4']].apply(lambda x: build(x['V3'], x['V4']), axis=1)

2 Comments

Cut out the second and third check, have your else condition return b
Sure, but try to avoid apply when you can!
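
Putting the two comments together, a minimal sketch might look like the following. It assumes the columns only ever hold 'Yes' or 'No' (which is what makes returning `b` safe), and it swaps `apply` for a plain loop over the two columns with `zip`; that is one way to avoid `apply`, not the only one.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Type': ['red', 'blue', 'red', 'red', 'blue'],
    'V1': ['No', 'No', 'No', 'Yes', 'No'],
    'V2': ['Yes', 'Yes', 'No', 'Yes', 'No'],
    'V3': ['Yes', 'No', 'No', 'Yes', 'No'],
    'V4': ['No', 'No', 'No', 'Yes', 'Yes'],
})

def build(a, b):
    # If a is "Yes" the result is "Yes"; otherwise b already holds the right label.
    return "Yes" if a == "Yes" else b

# Pair up the two columns row by row instead of calling DataFrame.apply.
df1['V3_4'] = [build(a, b) for a, b in zip(df1['V3'], df1['V4'])]
print(df1)
```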
0

It may seem trivial, but we can replace 'Yes' with True and 'No' with False, then combine the two columns with a boolean OR:

df1 = pd.DataFrame({'Type':['red','blue','red','red','blue'], 'V1':['No','No','No','Yes','No'], 'V2':['Yes','Yes','No','Yes','No'], 'V3':['Yes','No','No','Yes','No'], 'V4':['No','No','No','Yes','Yes']})

df1[['V3','V4']]=df1[['V3','V4']].replace({'Yes':True,'No':False})
x=df1.V4.astype('bool')|df1.V3.astype('bool')

df1[['V3','V4']]=df1[['V3','V4']].replace({True:'Yes',False:'No'})
df1['V3_4']=x.replace({True:'Yes',False:'No'})
df1
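
The same boolean idea can be done without overwriting V3 and V4 and converting them back. This is a sketch under the assumption that the columns contain only 'Yes' and 'No'; `is_yes` and `combined` are illustrative names.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Type': ['red', 'blue', 'red', 'red', 'blue'],
    'V1': ['No', 'No', 'No', 'Yes', 'No'],
    'V2': ['Yes', 'Yes', 'No', 'Yes', 'No'],
    'V3': ['Yes', 'No', 'No', 'Yes', 'No'],
    'V4': ['No', 'No', 'No', 'Yes', 'Yes'],
})

# Boolean view of the two columns; the originals are left untouched.
is_yes = df1[['V3', 'V4']].eq('Yes')

# Element-wise OR of the two boolean Series.
combined = is_yes['V3'] | is_yes['V4']

# Translate the booleans back into the 'Yes'/'No' labels.
df1['V3_4'] = combined.map({True: 'Yes', False: 'No'})
print(df1)
```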

