I have a dataframe with several columns [A, B, C, ..., Z]. I want to delete every row whose values in columns [B, C, ..., Z] are all equal to 0 (integer zero).

Example df:

  A B C ... Z
0 3 0 0 ... 0 
1 1 0 0 ... 0
2 2 1 2 ... 3    <-- keep only this as it has values other than zero

I tried to do this like so:

df = df[(df.columns[1:] != 0).all()]

I can't get it to work. I am not too experienced with conditions in indexers. I wanted to avoid a solution that chains a zero test for every column. I am sure that there is a more elegant solution to this.

Thanks!

EDIT: The solution worked for an artificially created dataframe, but when I used it on my df that I got from reading a csv, it failed. The file looks like this:

A;B;C;D;E;F;G;H;I;J;K;L;M;N;O;P;Q;R;S;T;U;V;W;X;Y;Z
0;25310;169;81;0;0;0;12291181;31442;246;0;0;0;0;0;0;0;0;0;251;31696;0;0;329;0;0
1;6252727;20480;82;0;0;0;31088;85;245;0;0;0;0;0;0;0;0;0;20567;331;0;0;329;0;0
2;6032184;10961;82;0;0;0;31024;84;245;0;0;0;0;0;0;0;0;0;11046;330;0;0;329;0;0
3;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
4;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
5;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
6;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
7;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
8;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
9;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0
10;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0;0

I read it using the following commands:

import pandas as pd

# retrieve csv file as dataframe
df = pd.read_csv('PATH/TO/FILE',
                 decimal=',',
                 sep=';')

df[list(df)] = df[list(df)].astype('int') 

print(df)

df = df[(df.iloc[:, 1:] != 0).all(axis=1)]

print(df)

The first print statement shows that the frame is read correctly, but the second print gives me an empty dataframe. How can this be?

1 Answer

Use iloc to select all columns except the first:

df = df[(df.iloc[:, 1:] != 0).all(axis=1)]
print (df)
   A  B  C  Z
2  2  1  2  3
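
For anyone who wants to reproduce the output above, here is a minimal sketch. The toy frame is an assumption that mirrors the question's example with only the columns A, B, C and Z; the names `nonzero` and `result` are just illustrative.

```python
import pandas as pd

# Hypothetical toy frame mirroring the question's example (columns A, B, C, Z only).
df = pd.DataFrame({
    "A": [3, 1, 2],
    "B": [0, 0, 1],
    "C": [0, 0, 2],
    "Z": [0, 0, 3],
})

# Boolean frame: True where a value in the columns after A is non-zero.
nonzero = df.iloc[:, 1:] != 0

# all(axis=1) is True only for rows in which every one of those columns is non-zero.
result = df[nonzero.all(axis=1)]
print(result)
```

Note that `all(axis=1)` keeps a row only if none of the selected columns is zero, which happens to give the wanted result for this small frame.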

EDIT: To keep every row that has at least one non-zero value in those columns, use any instead of all:

df = df[(df.iloc[:, 1:] != 0).any(axis=1)]
print (df)
   A        B      C   D  E  F  G         H      I    J ...  Q  R  S      T  \
0  0    25310    169  81  0  0  0  12291181  31442  246 ...  0  0  0    251   
1  1  6252727  20480  82  0  0  0     31088     85  245 ...  0  0  0  20567   
2  2  6032184  10961  82  0  0  0     31024     84  245 ...  0  0  0  11046   

       U  V  W    X  Y  Z  
0  31696  0  0  329  0  0  
1    331  0  0  329  0  0  
2    330  0  0  329  0  0  

[3 rows x 26 columns]
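
A short sketch of why the two variants differ on data like the posted file. The CSV text below is a shortened, hypothetical stand-in (columns A to E only, read from a string via `io.StringIO` instead of a file path); the variable names are illustrative.

```python
import io
import pandas as pd

# Shortened, hypothetical stand-in for the semicolon-separated file (columns A..E only).
csv_text = """A;B;C;D;E
0;25310;169;81;0
1;6252727;20480;82;0
2;6032184;10961;82;0
3;0;0;0;0
4;0;0;0;0
"""

df = pd.read_csv(io.StringIO(csv_text), sep=";")
nonzero = df.iloc[:, 1:] != 0

# all: every column after A must be non-zero. Column E is 0 in every row,
# so no row qualifies and the result is empty.
strict = df[nonzero.all(axis=1)]

# any: at least one column after A is non-zero, so only the all-zero rows are dropped.
kept = df[nonzero.any(axis=1)]

# Equivalent phrasing: drop the rows in which every column after A equals 0.
kept_alt = df[~(df.iloc[:, 1:] == 0).all(axis=1)]

print(strict.empty)
print(kept["A"].tolist())
```

The `~(... == 0).all(axis=1)` form reads closest to the wording of the question ("delete rows where all of B..Z are zero") and selects the same rows as the `any` version for integer data like this.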
4 Comments

Thanks, I have just a little issue with your example. I edited it into my original question. Maybe you could give me a hint there.
This is the correct answer. Your file as posted in your question has no row for which every column B...Z != 0....
@sundance: in my original question I asked for a piece of code that drops a row only if it consists purely of zeros. If at least a single entry is not equal to zero, it should be kept. Rows 0, 1 and 2 fulfill this criterion.
@DocDriven - Sorry, I think you need to change all to any.
