If I have a df as below:

ID | Car | Plane | Tank | Scooter | Misc | Day
4  | Yes | No    | Yes  | No      | 32   | Mon
2  | No  | No    | No   | No      | 22   | Tues
1  | Yes | No    | No   | No      | 11   | Wed

How can I create a new column that says True or False depending on whether any of the columns Car, Plane, Tank, or Scooter has a value of Yes or No? Thanks

    df['new_column'] = df[['Car','Plane','Tank','Scooter']].eq("Yes").any(axis=1)
    Commented Aug 26, 2020 at 4:27

2 Answers

You can use .iloc to select the columns you want to check, and .any(axis=1) to test whether any value in a row is 'Yes' or 'No'.

The code is as follows. I added a fourth row with 'Maybe' in every column to show a record that does not meet the 'Yes' or 'No' condition.

# create the DataFrame with a few sample values
import pandas as pd
df = pd.DataFrame({'ID':[4,2,1,3],
                   'Car':['Yes','No','Yes','Maybe'],
                   'Plane':['No','No','No','Maybe'],
                   'Tank':['Yes','No','No','Maybe'],
                   'Scooter':['No','No','No','Maybe'],
                   'Misc':[32,22,11,44],
                   'Day':['Mon','Tues','Wed','Thu']})

# print the full DataFrame to make sure the values are as expected
print(df)

# iloc selects the columns to check by position:
# from column 1 up to, but not including, the second-to-last column
print(df.iloc[:,1:-2])

# to check for 'Yes' or 'No', combine both comparisons with |; a row is True if either matches in any column
# to check only for 'Yes', drop the second comparison
# axis must be passed by keyword: pandas 2.0 and later reject .any(1)
df['Check'] = ((df.iloc[:,1:-2] == 'Yes') | (df.iloc[:,1:-2] == 'No')).any(axis=1)

# the DataFrame now has the new column with True or False
print(df)

The output is as follows:

Initial DataFrame:

   ID    Car  Plane   Tank Scooter  Misc   Day
0   4    Yes     No    Yes      No    32   Mon
1   2     No     No     No      No    22  Tues
2   1    Yes     No     No      No    11   Wed
3   3  Maybe  Maybe  Maybe   Maybe    44   Thu

Filtered Columns from the DataFrame are:

     Car  Plane   Tank Scooter
0    Yes     No    Yes      No
1     No     No     No      No
2    Yes     No     No      No
3  Maybe  Maybe  Maybe   Maybe

Final Results for you to use:

   ID    Car  Plane   Tank Scooter  Misc   Day  Check
0   4    Yes     No    Yes      No    32   Mon   True
1   2     No     No     No      No    22  Tues   True
2   1    Yes     No     No      No    11   Wed   True
3   3  Maybe  Maybe  Maybe   Maybe    44   Thu  False
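
The two comparisons joined with | can also be written with isin, which tests each value for membership in a list. This is only a sketch under the same sample data as above; the name `allowed` is introduced here for illustration and is not part of the original code.

```python
import pandas as pd

df = pd.DataFrame({'ID': [4, 2, 1, 3],
                   'Car': ['Yes', 'No', 'Yes', 'Maybe'],
                   'Plane': ['No', 'No', 'No', 'Maybe'],
                   'Tank': ['Yes', 'No', 'No', 'Maybe'],
                   'Scooter': ['No', 'No', 'No', 'Maybe'],
                   'Misc': [32, 22, 11, 44],
                   'Day': ['Mon', 'Tues', 'Wed', 'Thu']})

# illustrative name: the values that count as a match
allowed = ['Yes', 'No']

# True where a row has at least one allowed value in the selected columns
df['Check'] = df.iloc[:, 1:-2].isin(allowed).any(axis=1)
print(df)
```

Adding or removing accepted values then only means editing the list rather than chaining more comparisons.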

If your condition changes to the following:

If any value in 'Car', 'Plane', 'Tank', 'Scooter' = 'Yes', set 'Check' to True. For all other cases, set 'Check' to False.

Then, the earlier code can be simplified as follows:

df['Check'] = (df.iloc[:,1:-2] == 'Yes').any(axis=1)

The output for this will be as follows:

   ID    Car  Plane   Tank Scooter  Misc   Day  Check
0   4    Yes     No    Yes      No    32   Mon   True
1   2     No     No     No      No    22  Tues  False
2   1    Yes     No     No      No    11   Wed   True
3   3  Maybe  Maybe  Maybe   Maybe    44   Thu  False

In case your DataFrame is not structured with Car, Plane, Tank, and Scooter next to each other, you can always put these into a list and use that to filter and check.

For example, if your DataFrame is as shown below:

df = pd.DataFrame({'ID':[4,2,1,3],
                   'Car':['Yes','No','Yes','Maybe'],
                   'Plane':['No','No','No','Maybe'],
                   'Misc':[32,22,11,44],
                   'Tank':['Yes','No','No','Maybe'],
                   'Day':['Mon','Tues','Wed','Thu'],
                   'Scooter':['No','No','No','Maybe']})

Then it will look like this:

   ID    Car  Plane  Misc   Tank   Day Scooter
0   4    Yes     No    32    Yes   Mon      No
1   2     No     No    22     No  Tues      No
2   1    Yes     No    11     No   Wed      No
3   3  Maybe  Maybe    44  Maybe   Thu   Maybe

You won't be able to use .iloc[:,1:-2] here, because the columns to check are no longer adjacent. Instead, you can put the column names into a list and use that as follows.

cols = ['Car','Plane','Tank','Scooter']

print(df[cols])

df['Check'] = (df[cols] == 'Yes').any(axis=1)

This will give you the same result as the iloc option discussed earlier.

The output will be:

   ID    Car  Plane  Misc   Tank   Day Scooter  Check
0   4    Yes     No    32    Yes   Mon      No   True
1   2     No     No    22     No  Tues      No  False
2   1    Yes     No    11     No   Wed      No   True
3   3  Maybe  Maybe    44  Maybe   Thu   Maybe  False
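
As a quick check that selecting by name matches selecting by position, the sketch below builds both column layouts from the same data and compares the two results. The variable names (`data`, `adjacent`, `scattered`, `by_position`, `by_name`) are made up for this illustration.

```python
import pandas as pd

data = {'ID': [4, 2, 1, 3],
        'Car': ['Yes', 'No', 'Yes', 'Maybe'],
        'Plane': ['No', 'No', 'No', 'Maybe'],
        'Tank': ['Yes', 'No', 'No', 'Maybe'],
        'Scooter': ['No', 'No', 'No', 'Maybe'],
        'Misc': [32, 22, 11, 44],
        'Day': ['Mon', 'Tues', 'Wed', 'Thu']}

# same data in two column orders
adjacent = pd.DataFrame(data, columns=['ID', 'Car', 'Plane', 'Tank', 'Scooter', 'Misc', 'Day'])
scattered = pd.DataFrame(data, columns=['ID', 'Car', 'Plane', 'Misc', 'Tank', 'Day', 'Scooter'])

cols = ['Car', 'Plane', 'Tank', 'Scooter']

# positional slice only works when the four columns sit next to each other
by_position = (adjacent.iloc[:, 1:-2] == 'Yes').any(axis=1)
# name-based selection works for either layout
by_name = (scattered[cols] == 'Yes').any(axis=1)

print(by_position.equals(by_name))
```

Selecting by name is usually the safer default, since it keeps working if columns are later added or reordered.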

The following code gives True for every row in which any of the columns has a 'Yes' value:

df['new col'] = df[['Car', 'Plane', 'Tank', 'Scooter']].apply(lambda x: any(x == 'Yes'), axis = 1)
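
For comparison, the row-wise apply can be set next to a vectorized comparison on the same columns. This is a sketch that assumes the three-row sample from the question; `row_wise` and `vectorized` are illustrative names. On large frames the vectorized form is generally faster, because apply with axis=1 calls the Python lambda once per row.

```python
import pandas as pd

df = pd.DataFrame({'ID': [4, 2, 1],
                   'Car': ['Yes', 'No', 'Yes'],
                   'Plane': ['No', 'No', 'No'],
                   'Tank': ['Yes', 'No', 'No'],
                   'Scooter': ['No', 'No', 'No'],
                   'Misc': [32, 22, 11],
                   'Day': ['Mon', 'Tues', 'Wed']})

cols = ['Car', 'Plane', 'Tank', 'Scooter']

# one Python-level call per row
row_wise = df[cols].apply(lambda x: any(x == 'Yes'), axis=1)
# element-wise comparison, then a reduction across each row
vectorized = df[cols].eq('Yes').any(axis=1)

print(row_wise.tolist())
print(vectorized.tolist())
```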
