How to create a column in a Pandas dataframe based on a conditional substring search of one or more OTHER columns

Question

I have the following data frame:

import pandas as pd

df = pd.DataFrame({'Manufacturer':['Allen Edmonds', 'Louis Vuitton 23', 'Louis Vuitton 8', 'Gulfstream', 'Bombardier', '23 - Louis Vuitton', 'Louis Vuitton 20'],
                   'System':['None', 'None', '14 Platinum', 'Gold', 'None', 'Platinum 905', 'None']
                  })

I would like to create another column in the data frame named Pricing, which contains the value "East Coast" if the following conditions hold:

a) if a substring in the Manufacturer column matches "Louis",

AND

b) if a substring in the System column matches "Platinum"

The following code operates on a single column:

df['Pricing'] = np.where(df['Manufacturer'].str.contains('Louis'), 'East Coast', 'None')

I tried to chain this together using AND:

df['Pricing'] = np.where(df['Manufacturer'].str.contains('Louis'), 'East Coast', 'None') and np.where(df['Manufacturer'].str.contains('Platimum'), 'East Coast', 'None')

But, I get the following error:

ValueError: The truth value of an array with more than one element is ambiguous. Use `a.any()` or `a.all()`

Can anyone help with how I would implement a.any() or a.all() given the two conditions "a" and "b" above? Or, perhaps there is a more efficient way to create this column without using np.where?

Thanks in advance!

jlb_gouveia · Accepted Answer · 2020-11-15 00:21:06Z

2

Using .loc to slice the dataframe, according to your conditions:

df.loc[(df['Manufacturer'].str.contains('Louis')) & 
       (df['System'].str.contains('Platinum')),
      'Pricing'] = 'East Coast'
df

    Manufacturer        System       Pricing
0   Allen Edmonds       None         NaN
1   Louis Vuitton 23    None         NaN
2   Louis Vuitton 8 14  Platinum     East Coast
3   Gulfstream          Gold         NaN
4   Bombardier          None         NaN
5   23 - Louis Vuitton  Platinum 905 East Coast
6   Louis Vuitton 20    None         NaN

answered Nov 15, 2020 at 0:21

jlb_gouveia

6134 silver badges11 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

trigonom · Accepted Answer · 2020-11-15 00:22:11Z

1

def contain(x):
    if 'Louis' in x.Manufacturer and 'Platinum' in x.System:
        return "East Coast" 

df['pricing'] = df.apply(lambda x:contain(x),axis = 1)

answered Nov 15, 2020 at 0:22

trigonom

5284 silver badges9 bronze badges

Collectives™ on Stack Overflow

How to create a column in a Pandas dataframe based on a conditional substring search of one or more OTHER columns

2 Answers 2

Comments

Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related