Drop specific rows from pandas dataframe according to condition

Question

I have a dataframe with several columns [A, B, C, ..., Z]. I want to delete all rows from the dataframe which have the property that their values in columns [B, C, ..., Z] are equal to 0 (integer zero).

Example df:

  A B C ... Z
0 3 0 0 ... 0 
1 1 0 0 ... 0
2 2 1 2 ... 3    <-- keep only this as it has values other than zero

I tried to do this like so:

df = df[(df.columns[1:] != 0).all()]

I can't get it to work. I am not too experienced with conditions in indexers. I wanted to avoid a solution that chains a zero test for every column. I am sure that there is a more elegant solution to this.

Thanks!

EDIT: The solution worked for an artificially created dataframe, but when I used it on my df that I got from reading a csv, it failed. The file looks like this:

A;B;C;D;E;F;G;H;I;J;K;L;M;N;O;P;Q;R;S;T;U;V;W;X;Y;Z
0;25310;169;81;0;0;0;12291181;31442;246;0;0;0;0;0;0;0;0;0;251;31696;0;0;329;0;0
1;6252727;20480;82;0;0;0;31088;85;245;0;0;0;0;0;0;0;0;0;20567;331;0;0;329;0;0
2;6032184;10961;82;0;0;0;31024;84;245;0;0;0;0;0;0;0;0;0;11046;330;0;0;329;0;0
3;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
4;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
5;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
6;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
7;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
8;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
9;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
10;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0

I read it using the following commands:

import pandas as pd

# retrieve csv file as dataframe
df = pd.read_csv('PATH/TO/FILE',
                 decimal=',',
                 sep=';')

df[list(df)] = df[list(df)].astype('int') 

print(df)

df = df[(df.iloc[:, 1:] != 0).all(axis=1)]

print(df)

The first print statement shows that the frame is read correctly, but the second print gives me an empty dataframe. How can this be?

jezrael · Accepted Answer · 2018-07-26 14:13:04Z

Use iloc to select all columns except the first:

df = df[(df.iloc[:, 1:] != 0).all(axis=1)]
print (df)
   A  B  C  Z
2  2  1  2  3
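
For reference, here is a minimal self-contained sketch of the line above. It assumes a small frame built from the question's example, with columns A, B, C and Z standing in for the elided ones. The attempt in the question compared df.columns[1:] to 0, which tests the column labels rather than the cell values; iloc[:, 1:] selects the values themselves.

```python
import pandas as pd

# Small stand-in for the example frame in the question (assumed data)
df = pd.DataFrame({"A": [3, 1, 2],
                   "B": [0, 0, 1],
                   "C": [0, 0, 2],
                   "Z": [0, 0, 3]})

# Boolean frame: True where a value in every column but the first is nonzero
nonzero = df.iloc[:, 1:] != 0

# all(axis=1) reduces across columns, giving one boolean per row;
# a row is kept only if each of its values in B..Z is nonzero
kept = df[nonzero.all(axis=1)]
print(kept)
```

Note that axis=1 is what makes the reduction run per row; without it, all() would reduce per column and the mask would not line up with the rows.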

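EDIT: To drop only the rows in which every value in columns B to Z is zero, and keep any row with at least one nonzero value, use any instead of all: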

df = df[(df.iloc[:, 1:] != 0).any(axis=1)]
print (df)
   A        B      C   D  E  F  G         H      I    J ...  Q  R  S      T  \
0  0    25310    169  81  0  0  0  12291181  31442  246 ...  0  0  0    251   
1  1  6252727  20480  82  0  0  0     31088     85  245 ...  0  0  0  20567   
2  2  6032184  10961  82  0  0  0     31024     84  245 ...  0  0  0  11046   

       U  V  W    X  Y  Z  
0  31696  0  0  329  0  0  
1    331  0  0  329  0  0  
2    330  0  0  329  0  0  

[3 rows x 26 columns]
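
The difference between the two reductions can be checked with a short sketch. It assumes the first five rows of the file from the question, read from an in-memory string instead of a path; the names csv_text, kept_all and kept_any are only for illustration.

```python
import io
import pandas as pd

# First five rows of the file shown in the question, held in a string
csv_text = """A;B;C;D;E;F;G;H;I;J;K;L;M;N;O;P;Q;R;S;T;U;V;W;X;Y;Z
0;25310;169;81;0;0;0;12291181;31442;246;0;0;0;0;0;0;0;0;0;251;31696;0;0;329;0;0
1;6252727;20480;82;0;0;0;31088;85;245;0;0;0;0;0;0;0;0;0;20567;331;0;0;329;0;0
2;6032184;10961;82;0;0;0;31024;84;245;0;0;0;0;0;0;0;0;0;11046;330;0;0;329;0;0
3;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
4;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
"""
df = pd.read_csv(io.StringIO(csv_text), sep=";")

nonzero = df.iloc[:, 1:] != 0

# all: keep a row only if every value in B..Z is nonzero.
# Rows 0-2 contain some zeros, so nothing survives.
kept_all = df[nonzero.all(axis=1)]

# any: keep a row if at least one value in B..Z is nonzero.
# Only the rows that are zero throughout are dropped.
kept_any = df[nonzero.any(axis=1)]

print(len(kept_all), len(kept_any))
```

This is why the all version returned an empty frame on the real data: every row there has at least one zero in columns B to Z.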

edited Jul 26, 2018 at 14:13

answered Jul 26, 2018 at 13:17

4 Comments

DocDriven Over a year ago

Thanks, I have just a little issue with your example. I edited it into my original question. Maybe you could give me a hint there.

sundance Over a year ago

This is the correct answer. Your file as posted in your question has no row for which every column B...Z != 0.

DocDriven Over a year ago

@sundance: In my original question I asked for a piece of code that drops a row only if it consists purely of zeros. If at least a single entry is not equal to zero, the row should be kept. Rows 0, 1 and 2 fulfill this criterion.

jezrael Over a year ago

@DocDriven - Sorry, I think you need to change all to any.
