I'm sure this is in SO somewhere but I can't seem to find it. I'm trying to remove or select designated columns in a pandas df. But I want to keep certain values or strings from those deleted columns.

For the df below, I want to keep 'Big' in Col B and 'Cat' in Col C but blank out everything else in those two columns.

import pandas as pd

d = ({
    'A' : ['A','Keep','A','Value'],           
    'B' : ['Big','X','Big','Y'],
    'C' : ['Cat','X','Cat','Y'],
    })

df = pd.DataFrame(data=d)

If I do either of the following, it only selects the matching rows.

Big = df[df['B'] == 'Big']
Cat = df[df['C'] == 'Cat']

My intended output is:

       A    B    C
0      A  Big  Cat
1   Keep          
2      A  Big  Cat
3  Value 

I need something like x = df[df['B','C'] != 'Big','Cat']

5 Answers

2

Seems like you want to keep only some values and put an empty string in the others.

Use np.where

import numpy as np

keeps = ['Big', 'Cat']
df['B'] = np.where(df.B.isin(keeps), df.B, '')
df['C'] = np.where(df.C.isin(keeps), df.C, '')


    A     B     C
0   A     Big   Cat
1   Keep        
2   A     Big   Cat
3   Value       

Another solution using df.where

cols = ['B', 'C']
df[cols] = df[cols].where(df[cols].isin(keeps)).fillna('')

    A     B     C
0   A     Big   Cat
1   Keep        
2   A     Big   Cat
3   Value       
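
For reference, a self-contained sketch of the same idea, runnable from scratch. It rebuilds the question's df and passes other='' to where instead of chaining fillna; that substitution is my assumption about what reads more cleanly, not part of the answer above.

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    'A': ['A', 'Keep', 'A', 'Value'],
    'B': ['Big', 'X', 'Big', 'Y'],
    'C': ['Cat', 'X', 'Cat', 'Y'],
})

keeps = ['Big', 'Cat']
cols = ['B', 'C']

# Keep cells whose value is in `keeps`; replace every other cell with ''.
df[cols] = df[cols].where(df[cols].isin(keeps), other='')
print(df)
```

Note that isin(keeps) checks each cell against the whole list, so a stray 'Cat' in column B would also be kept. If that matters, compare each column against its own value instead.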

2

IIUC

Update

df[['B','C']]=df[['B','C']][df[['B','C']].isin(['Big','Cat'])].fillna('')
df
Out[30]: 
       A    B    C
0      A  Big  Cat
1   Keep          
2      A  Big  Cat
3  Value          

1 Comment

Sorry @Wen. I need to keep everything in Col A and 'Big,Cat' in Col B,C in the one df
1

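You can build a row mask on the column combination via NumPy and np.ndarray.all. The mask is True for rows where B is not 'Big' and C is not 'Cat', and those rows are blanked in both columns: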

mask = (df[['B', 'C']].values != ['Big', 'Cat']).all(1)

df.loc[mask, ['B', 'C']] = ''

print(df)

       A    B    C
0      A  Big  Cat
1   Keep          
2      A  Big  Cat
3  Value          
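
A small sketch of how the reduction affects the mask, in case rows can match only one of the two columns. The extra 'Mixed' row is a made-up addition to the question's data, used only to show the difference between all and any.

```python
import pandas as pd

# Question's data plus one hypothetical row where only column B matches.
df = pd.DataFrame({
    'A': ['A', 'Keep', 'A', 'Value', 'Mixed'],
    'B': ['Big', 'X', 'Big', 'Y', 'Big'],
    'C': ['Cat', 'X', 'Cat', 'Y', 'Z'],
})

# Element-wise comparison; the list broadcasts across the two columns.
differs = df[['B', 'C']].to_numpy() != ['Big', 'Cat']

neither_matches = differs.all(axis=1)  # True when both columns differ
pair_differs = differs.any(axis=1)     # True when at least one column differs

print(neither_matches.tolist())  # [False, True, False, True, False]
print(pair_differs.tolist())     # [False, True, False, True, True]
```

With all, the 'Mixed' row is left untouched (so 'Z' survives in column C); with any, the whole pair would be blanked unless it is exactly ('Big', 'Cat'). On the question's four rows both give the same result.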

3 Comments

I think OP wants to 'keep' only some values in the columns, not filter rows
@RafaelC, Thank you, have updated (with a similar idea).
Nice one dude :-), I got confused at the start, sorry for the misleading comment
1

Or this:

df[['B','C']]=df[['B','C']].apply(lambda row: row if row.tolist()==['Big','Cat'] else pd.Series(['',''],index=row.index),axis=1)
print(df)

Output:

       A    B    C
0      A  Big  Cat
1   Keep          
2      A  Big  Cat
3  Value          

2 Comments

A little bit overkill, but it still works. Nice one, man, vote for you
Yw :-) have a nice day
0

Perhaps a concise version:

df.loc[df['B'] != 'Big', 'B'] = ''
df.loc[df['C'] != 'Cat', 'C'] = ''
print(df)

Output:

       A    B    C
0      A  Big  Cat
1   Keep          
2      A  Big  Cat
3  Value     
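
If more columns are involved, the same per-column assignment could be driven by a mapping. This is only a sketch: keep_by_col is a hypothetical name, not something from the question.

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    'A': ['A', 'Keep', 'A', 'Value'],
    'B': ['Big', 'X', 'Big', 'Y'],
    'C': ['Cat', 'X', 'Cat', 'Y'],
})

# Hypothetical mapping: column name -> the single value to keep in that column.
keep_by_col = {'B': 'Big', 'C': 'Cat'}

for col, value in keep_by_col.items():
    # Blank every cell in `col` that is not the value to keep.
    df.loc[df[col] != value, col] = ''

print(df)
```

Because each column is compared only against its own value, a 'Cat' appearing in column B would be blanked, unlike the isin-based approaches.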
