I am trying to drop some rows from a pandas DataFrame based on 4 conditions needing to be met in the same row. So I tried the following command:

my_data.drop(my_data[(my_data.column1 is None) & (my_data.column2 is None) & (my_data.column3 is None) & (my_data.column4 is None)].index, inplace=True)

And it throws an error.

I've also tried:

my_data= my_data.loc[my_data[(my_data.column1 is None) & (my_data.column2 is None) & (my_data.column3 is None) & (my_data.column4 is None), :]

but without success

Can I get some help, please? :)

I'm working with Python 3.5.

1 Answer

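The expression my_data.column1 is None compares the whole column object with None, so it yields a single False rather than a row-by-row result. Normally, a column is checked for null values elementwise with the isnull method: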

df.drop(df[df['column1'].isnull() 
          & df['column2'].isnull() 
          & df['column3'].isnull() 
          & df['column4'].isnull()].index)
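
To make the difference concrete, here is a small sketch. The sample values are made up for illustration, and only the column names come from the question. It shows that the identity check produces one plain bool, while isnull gives a boolean Series that can be combined with & and used as a row mask:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
my_data = pd.DataFrame({
    "column1": [1.0, np.nan, np.nan],
    "column2": [2.0, np.nan, 5.0],
    "column3": [3.0, np.nan, np.nan],
    "column4": [4.0, np.nan, np.nan],
})

# `is` compares object identity, so this is one plain bool, not a per-row mask.
identity_check = my_data.column1 is None

# isnull() works elementwise and returns a boolean Series per column.
all_null = (my_data["column1"].isnull()
            & my_data["column2"].isnull()
            & my_data["column3"].isnull()
            & my_data["column4"].isnull())

# Keep the rows where the mask is False; equivalent to dropping by index.
cleaned = my_data[~all_null]
print(cleaned)
```

Selecting with ~all_null avoids the round trip through .index and drop, but either form should give the same rows.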

However, dropna offers a more compact and idiomatic way to do this. Like drop, it returns a new DataFrame unless you pass inplace=True:

df.dropna(subset=['column1', 'column2', 'column3', 'column4'], how='all')
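
As a quick illustration of what how='all' changes, the sketch below uses a made-up frame (the values and the extra "other" column are assumptions, not from the question) and compares it with how='any', which is the default:

```python
import numpy as np
import pandas as pd

# Hypothetical data: row 1 is null in all four columns, row 2 in three of them.
df = pd.DataFrame({
    "column1": [1.0, np.nan, np.nan],
    "column2": [2.0, np.nan, 5.0],
    "column3": [3.0, np.nan, np.nan],
    "column4": [4.0, np.nan, np.nan],
    "other": ["a", "b", "c"],
})
cols = ["column1", "column2", "column3", "column4"]

# how='all': drop a row only when every column in `subset` is null.
drop_all = df.dropna(subset=cols, how="all")

# how='any' (the default): drop a row when at least one subset column is null.
drop_any = df.dropna(subset=cols, how="any")

print(drop_all.index.tolist())
print(drop_any.index.tolist())
```

Columns outside subset (here "other") are ignored when deciding which rows to drop.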

A demo:

import numpy as np
import pandas as pd

prng = np.random.RandomState(0)
df = pd.DataFrame(prng.randn(100, 6), columns=['column{}'.format(i) for i in range(1, 7)])

df.head()
Out: 
    column1   column2   column3   column4   column5   column6
0  1.764052  0.400157  0.978738  2.240893  1.867558 -0.977278
1  0.950088 -0.151357 -0.103219  0.410599  0.144044  1.454274
2  0.761038  0.121675  0.443863  0.333674  1.494079 -0.205158
3  0.313068 -0.854096 -2.552990  0.653619  0.864436 -0.742165
4  2.269755 -1.454366  0.045759 -0.187184  1.532779  1.469359

df = df.mask(prng.binomial(1, 0.5, df.shape).astype('bool'), np.nan)

df.head()
Out: 
    column1   column2   column3   column4   column5   column6
0       NaN  0.400157       NaN  2.240893       NaN       NaN
1  0.950088 -0.151357 -0.103219  0.410599  0.144044       NaN
2  0.761038  0.121675       NaN       NaN       NaN -0.205158
3       NaN       NaN -2.552990       NaN  0.864436       NaN
4  2.269755 -1.454366  0.045759 -0.187184       NaN       NaN

The following drops a row only if columns 1, 3, 5 and 6 are all null in that row (row 0 here):

df.dropna(subset=['column1', 'column3', 'column5', 'column6'], how='all').head()
Out: 
    column1   column2   column3   column4   column5   column6
1  0.950088 -0.151357 -0.103219  0.410599  0.144044       NaN
2  0.761038  0.121675       NaN       NaN       NaN -0.205158
3       NaN       NaN -2.552990       NaN  0.864436       NaN
4  2.269755 -1.454366  0.045759 -0.187184       NaN       NaN
5  0.154947  0.378163 -0.887786 -1.980796 -0.347912       NaN
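
If you want to convince yourself that dropna and the explicit mask agree, a sketch along these lines should work. It rebuilds the demo frame from above and uses isnull().all(axis=1) as a shorter way to write the chained & conditions; the variable names via_dropna and via_mask are just for illustration:

```python
import numpy as np
import pandas as pd

# Rebuild the demo frame with the same seed as above.
prng = np.random.RandomState(0)
df = pd.DataFrame(prng.randn(100, 6),
                  columns=['column{}'.format(i) for i in range(1, 7)])
df = df.mask(prng.binomial(1, 0.5, df.shape).astype('bool'), np.nan)

cols = ['column1', 'column3', 'column5', 'column6']

# Approach 1: dropna restricted to the chosen columns.
via_dropna = df.dropna(subset=cols, how='all')

# Approach 2: boolean mask, True where every chosen column is null in the row.
all_null = df[cols].isnull().all(axis=1)
via_mask = df[~all_null]

# Both approaches should keep exactly the same rows.
print(via_dropna.equals(via_mask))
```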

1 Comment

I was just about to add the second one :-)
