Using multiple conditions in pandas DataFrame gives ValueError

Question

I have a data table like this:

       Item Colour    Item Range Item Size
789    COLOUR-BLUE    RANGE-PANT  SIZE-XXL
2507  COLOUR-BLACK   RANGE-OTHER  SIZE-XXL
2376  COLOUR-BLACK  RANGE-JACKET    SIZE-S
1378  COLOUR-WHITE   RANGE-OTHER    SIZE-L
598    COLOUR-BLUE  RANGE-JACKET    SIZE-M
1589   COLOUR-BLUE  RANGE-JACKET    SIZE-L
2580  COLOUR-BLACK   RANGE-SHIRT    SIZE-L
366    COLOUR-BLUE    RANGE-PANT  SIZE-XXL
2320  COLOUR-WHITE   RANGE-OTHER    SIZE-L
1247  COLOUR-GREEN    RANGE-PANT    SIZE-M
2224  COLOUR-BLACK  RANGE-JACKET    SIZE-L
3615  COLOUR-BLACK   RANGE-OTHER    SIZE-S
4176  COLOUR-GREEN    RANGE-PANT   SIZE-XL
1640  COLOUR-BLACK    RANGE-PANT    SIZE-S
1136  COLOUR-WHITE   RANGE-OTHER    SIZE-M
3437  COLOUR-BLACK  RANGE-JACKET    SIZE-S
4448  COLOUR-WHITE   RANGE-OTHER    SIZE-S
1188  COLOUR-WHITE   RANGE-SHIRT  SIZE-XXL
3332  COLOUR-GREEN   RANGE-OTHER    SIZE-M
1080  COLOUR-WHITE   RANGE-OTHER  SIZE-XXL

I want to select only a subset of the data using the following mask:

mask = (df['Item Colour'] == 'COLOUR-WHITE') & (df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']) & (df['Item Size'] not in ['SIZE-XXL'])

I tried df[mask] but it gives me the error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I avoid this error?

I have done this so far:

import numpy as np
import pandas as pd

df = pd.read_clipboard()
df.drop(['Item','Item.2','Size'], inplace=True,axis=1)
df.columns = ['Item Colour', 'Item Range', 'Item Size']
print(df)

mask = (df['Item Colour'] == 'COLOUR-WHITE') & (df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']) & (df['Item Size'] not in ['SIZE-XXL'])

dff = df[mask]
dff

Update: this still does not work.

mask =  (df['Item Colour'] == 'COLOUR-WHITE').all()\
      & (df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']).all()\
      & ( ~df['Item Size'].isin(['SIZE-XXL']).all())

df[mask]

You need to change your in and not in statements to pd.Series.isin().
– It_is_Chris, Commented Dec 8, 2018 at 21:31
Why is @Vishnu Kunchur's answer not the accepted one? Bhishan's answer is practically the same, but later.
– Zanshin, Commented Dec 8, 2018 at 22:08

Vishnu Kunchur · Accepted Answer · 2018-12-08 21:31:19Z

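The problem comes from the way you build the mask: in and not in do not work element-wise on a Series. Python's in operator compares the whole Series against each list element and then needs a single True/False from the result, which is what raises the ValueError. For an element-wise membership test, use the pd.Series.isin([item1, item2, ...]) method. So, instead of: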

df['Item Range'] in ['RANGE-JACKET','RANGE-PANT'],

do:

df['Item Range'].isin(['RANGE-JACKET','RANGE-PANT'])

To negate, for the 'not in':

df['Item Size'] not in ['SIZE-XXL'],

you can do:

~df['Item Size'].isin(['SIZE-XXL'])
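
Putting the pieces together, here is a minimal sketch of the corrected mask. It assumes a small DataFrame built by hand from six rows of the question's table (instead of pd.read_clipboard()); the names is_white, in_range, not_xxl and jackets_and_pants are only illustrative.

```python
import pandas as pd

# Six rows copied from the question's table, original index values kept.
df = pd.DataFrame(
    {
        "Item Colour": ["COLOUR-BLUE", "COLOUR-BLACK", "COLOUR-WHITE",
                        "COLOUR-BLUE", "COLOUR-WHITE", "COLOUR-GREEN"],
        "Item Range": ["RANGE-PANT", "RANGE-JACKET", "RANGE-OTHER",
                       "RANGE-JACKET", "RANGE-SHIRT", "RANGE-PANT"],
        "Item Size": ["SIZE-XXL", "SIZE-S", "SIZE-L",
                      "SIZE-M", "SIZE-XXL", "SIZE-M"],
    },
    index=[789, 2376, 1378, 598, 1188, 1247],
)

# Each condition is an element-wise boolean Series; & combines them row by row.
is_white = df["Item Colour"] == "COLOUR-WHITE"
in_range = df["Item Range"].isin(["RANGE-JACKET", "RANGE-PANT"])
not_xxl = ~df["Item Size"].isin(["SIZE-XXL"])

mask = is_white & in_range & not_xxl
result = df[mask]

# Same filter without the colour condition, to show the isin parts at work.
jackets_and_pants = df[in_range & not_xxl]

# The original spelling still fails: `in` needs one True/False, not a Series.
raised = False
try:
    df["Item Range"] in ["RANGE-JACKET", "RANGE-PANT"]
except ValueError:
    raised = True

print(result)
print(jackets_and_pants)
print("plain `in` raised ValueError:", raised)
```

With these sample rows, result comes back empty, because none of the white items is a jacket or a pant; that is a property of the data, not an error. Note also that adding .all() to a condition, as in the question's update, collapses it to a single boolean, so it can no longer serve as a row-by-row mask.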
