
I am new to pandas and trying to complete the following:

I have a dataframe which looks like this:

row    A     B     
1      abc   abc 
2      abc   
3            abc 
4
5      abc   abc 

My desired output would look like this:

row    A     B     
1      abc   abc 
2      abc   
3            abc 
5      abc   abc 

I am trying to drop a row if there is no value in both the A and B columns:

if finalized_export_cf[finalized_export_cf['A']].str.len()<2:
    if finalized_export_cf[finalized_export_cf['B']].str.len()<2:
        finalized_export_cf[finalized_export_cf['B']].drop()

But that gives me the following error:

ValueError: cannot index with vector containing NA / NaN values

How could I drop rows when both columns have an empty cell? Thank you for your suggestions.

5 Answers


You can check whether all values in a row are null by using .isnull() and all() in a chain. isnull() produces a dataframe of booleans, and all(axis=1) checks whether all values in a given row are True. If that's the case, it means that all values in that row are nulls:

inds = df[["A", "B"]].isnull().all(axis=1) 

You can then use inds to clean up all rows that have only nulls. First negate it using the tilde ~, otherwise you would keep only the rows with missing values:

df = df.loc[~inds, :]
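
Putting both lines together on the example frame, as a minimal runnable sketch (it assumes the blank cells are NaN rather than empty strings):

import pandas as pd
import numpy as np

# example frame with NaN for the blank cells (assumption)
df = pd.DataFrame({
    "row": [1, 2, 3, 4, 5],
    "A": ["abc", "abc", np.nan, np.nan, "abc"],
    "B": ["abc", np.nan, "abc", np.nan, "abc"],
})

# True where both A and B are null in a given row
inds = df[["A", "B"]].isnull().all(axis=1)

# keep only the rows where at least one of A, B is present
df = df.loc[~inds, :]
print(df)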

1 Comment

I have 10 columns and only need to check 2 of those, how would I proceed with this set up?

For your use case you can create a mask with isna() and keep the rows where A and B are not both True (i.e. not both missing):

mask = df.isna()
df[~((mask.A == True) & (mask.B == True))] 

output:

   row    A    B
0    1  abc  abc
1    2  abc  NaN
2    3  NaN  abc
4    5  abc  abc
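
As a side note, since the mask columns are already boolean, the same filter can be written a bit more compactly (a sketch, equivalent to the line above):

df[~(mask['A'] & mask['B'])]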



If the missing values are NaNs, then use DataFrame.dropna with how='all' and the subset parameter:

print (df)
   row    A    B
0    1  abc  abc
1    2  abc  NaN
2    3  NaN  abc
3    4  NaN  NaN
4    5  abc  abc

df = df.dropna(how='all', subset=['A','B'])
print (df)
   row    A    B
0    1  abc  abc
1    2  abc  NaN
2    3  NaN  abc
4    5  abc  abc

Or, if the empty value is an empty string, use DataFrame.any after comparing not-equal to '':

print (df)
   row    A    B
0    1  abc  abc
1    2  abc     
2    3       abc
3    4          
4    5  abc  abc


df = df[df[['A','B']].ne('').any(axis=1)]
print (df)
   row    A    B
0    1  abc  abc
1    2  abc     
2    3       abc
4    5  abc  abc

3 Comments

Hi jezrael, does this drop rows only if both A and B columns are empty or either one of those?
@KenHBS - sure, rows are removed
@JonasPalačionis - It tests only the A and B columns - all columns specified in the subset parameter

If you have only two columns, you can use the how parameter of pandas.DataFrame.dropna by setting it to 'all':

df.dropna(how='all')
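
As a minimal sketch (assuming the frame holds only the A and B columns and the blanks are NaN; a row column like the one in the question would otherwise have to be excluded first), how='all' drops a row only when every value in it is missing:

import pandas as pd
import numpy as np

# two-column frame; NaN marks the blanks (assumption)
df = pd.DataFrame({"A": ["abc", "abc", np.nan, np.nan, "abc"],
                   "B": ["abc", np.nan, "abc", np.nan, "abc"]})

# drops only the row where both A and B are NaN
print(df.dropna(how='all'))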



First of all, we need to change the blank cells to NaN:

import numpy as np

df = df.replace(r'^\s*$', np.nan, regex=True)

Then drop NaNs whilst restricting the check to the A and B columns:

df = df.dropna(subset=['A','B'], how='all').fillna(' ')  # if you want spaces for na
print(df)
   row    A    B
0    1  abc  abc
1    2  abc     
2    3       abc
4    5  abc  abc
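
End to end, the same idea on a frame that uses empty strings for the blanks (a sketch; the regex also treats whitespace-only cells as missing):

import pandas as pd
import numpy as np

df = pd.DataFrame({"row": [1, 2, 3, 4, 5],
                   "A": ["abc", "abc", "", "", "abc"],
                   "B": ["abc", "", "abc", "", "abc"]})

# blank / whitespace-only cells -> NaN, then drop rows where both A and B are missing
out = (df.replace(r'^\s*$', np.nan, regex=True)
         .dropna(subset=['A', 'B'], how='all'))
print(out)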

1 Comment

@Jonas Palačionis - I suggest you use this answer - this will work
