I have a data table like this:

       Item Colour    Item Range Item Size
789    COLOUR-BLUE    RANGE-PANT  SIZE-XXL
2507  COLOUR-BLACK   RANGE-OTHER  SIZE-XXL
2376  COLOUR-BLACK  RANGE-JACKET    SIZE-S
1378  COLOUR-WHITE   RANGE-OTHER    SIZE-L
598    COLOUR-BLUE  RANGE-JACKET    SIZE-M
1589   COLOUR-BLUE  RANGE-JACKET    SIZE-L
2580  COLOUR-BLACK   RANGE-SHIRT    SIZE-L
366    COLOUR-BLUE    RANGE-PANT  SIZE-XXL
2320  COLOUR-WHITE   RANGE-OTHER    SIZE-L
1247  COLOUR-GREEN    RANGE-PANT    SIZE-M
2224  COLOUR-BLACK  RANGE-JACKET    SIZE-L
3615  COLOUR-BLACK   RANGE-OTHER    SIZE-S
4176  COLOUR-GREEN    RANGE-PANT   SIZE-XL
1640  COLOUR-BLACK    RANGE-PANT    SIZE-S
1136  COLOUR-WHITE   RANGE-OTHER    SIZE-M
3437  COLOUR-BLACK  RANGE-JACKET    SIZE-S
4448  COLOUR-WHITE   RANGE-OTHER    SIZE-S
1188  COLOUR-WHITE   RANGE-SHIRT  SIZE-XXL
3332  COLOUR-GREEN   RANGE-OTHER    SIZE-M
1080  COLOUR-WHITE   RANGE-OTHER  SIZE-XXL

I want to select only a subset of the data using the following mask:

mask = (df['Item Colour'] == 'COLOUR-WHITE') & (df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']) & (df['Item Size'] not in ['SIZE-XXL'])

I tried df[mask] but it gives me the error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

How can I avoid this error?

I have done this so far:

import numpy as np
import pandas as pd

df = pd.read_clipboard()
df.drop(['Item','Item.2','Size'], inplace=True,axis=1)
df.columns = ['Item Colour', 'Item Range', 'Item Size']
print(df)

mask = (df['Item Colour'] == 'COLOUR-WHITE') & (df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']) & (df['Item Size'] not in ['SIZE-XXL'])

dff = df[mask]
dff

Update: this still does not work.

mask =  (df['Item Colour'] == 'COLOUR-WHITE').all()\
      & (df['Item Range'] in ['RANGE-JACKET','RANGE-PANT']).all()\
      & ( ~df['Item Size'].isin(['SIZE-XXL']).all())

df[mask]
  • You need to change your in and not in statements to pd.Series.isin(). Commented Dec 8, 2018 at 21:31
  • why is @Vishnu Kunchur's answer not the accepted one? Bhishan's answer is practically the same, but later. Commented Dec 8, 2018 at 22:08

1 Answer

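The problem comes from how you build the mask. Python's in operator has to produce a single True or False, but comparing a Series against each list element produces a whole Series of booleans, and converting that Series to one truth value raises the ValueError. To test membership element-wise, use the pd.Series.isin([item1, item2, ...]) method. So, instead of: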

df['Item Range'] in ['RANGE-JACKET','RANGE-PANT'],

do:

df['Item Range'].isin(['RANGE-JACKET','RANGE-PANT'])

To negate, for the 'not in':

df['Item Size'] not in ['SIZE-XXL'],

you can do:

~df['Item Size'].isin(['SIZE-XXL'])
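
Putting the pieces together, a minimal sketch of the full mask might look like the following. The rows are copied from the question's table; the helper name select_items and its parameters are hypothetical, introduced only for illustration.

```python
import pandas as pd

# A few rows copied from the question's table (original index values kept).
df = pd.DataFrame(
    {
        "Item Colour": ["COLOUR-BLUE", "COLOUR-WHITE", "COLOUR-BLUE",
                        "COLOUR-WHITE", "COLOUR-GREEN"],
        "Item Range": ["RANGE-PANT", "RANGE-OTHER", "RANGE-JACKET",
                       "RANGE-SHIRT", "RANGE-PANT"],
        "Item Size": ["SIZE-XXL", "SIZE-L", "SIZE-M", "SIZE-XXL", "SIZE-M"],
    },
    index=[789, 1378, 598, 1188, 1247],
)


def select_items(frame, colour, ranges, excluded_sizes):
    """Hypothetical helper: filter rows with an element-wise boolean mask."""
    # Each term is a boolean Series; & combines them row by row.
    mask = (
        (frame["Item Colour"] == colour)
        & frame["Item Range"].isin(ranges)
        & ~frame["Item Size"].isin(excluded_sizes)  # ~ negates isin ("not in")
    )
    return frame[mask]


# The mask from the question, rewritten with isin.
white = select_items(df, "COLOUR-WHITE", ["RANGE-JACKET", "RANGE-PANT"], ["SIZE-XXL"])

# Same ranges and sizes, different colour, to show a non-empty result.
blue = select_items(df, "COLOUR-BLUE", ["RANGE-JACKET", "RANGE-PANT"], ["SIZE-XXL"])

print(white)
print(blue)
```

With the data shown in the question, no white item is a jacket or a pant, so the first selection is legitimately empty; that is a property of the data, not a sign that the mask is wrong. Also note that appending .all() to a comparison, as in the update, collapses it to a single True/False for the whole column, so it can no longer act as a row-by-row mask.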
