Hi everybody, I need some help with Python.

I'm working with an Excel file that has several rows. Some of these rows have a zero value in every column, so I need to delete those rows.

In 
 id a b c d 
 a  0 1 5 0 
 b  0 0 0 0
 c  0 0 0 0
 d  0 0 0 1 
 e  1 0 0 1

Out 
id a b c d 
a  0 1 5 0
d  0 0 0 1 
e  1 0 0 1

My first idea was to show only the rows that do not contain zeros, but it does not work, because it deletes all the rows, both the ones that are all zeros and the ones that are not.

import pandas as pd

path = '/Users/arronteb/Desktop/excel/ejemplo1.xlsx'
xlsx = pd.ExcelFile(path)
df = pd.read_excel(xlsx, 'Sheet1')
# Keeps only the rows where every listed column is non-zero
df_zero = df[(df.OTC != 0) & (df.TM != 0) & (df.Lease != 0) & (df.Maint != 0) & (df.Support != 0) & (df.Other != 0)]

Then I thought about showing just the rows that are all zeros:

In 
id a b c d 
a  0 1 5 0 
b  0 0 0 0
c  0 0 0 0
d  0 0 0 1 
e  1 0 0 1


Out 
id a b c d 
b  0 0 0 0
c  0 0 0 0   

So I made a small change and now have something like this:

path = '/Users/arronteb/Desktop/excel/ejemplo1.xlsx'
xlsx = pd.ExcelFile(path)
df = pd.read_excel(xlsx,'Sheet1')
df_zero = df[(df.OTC == 0) & (df.TM == 0) & (df.Lease == 0) & (df.Maint == 0) & (df.Support == 0) & (df.Other == 0)]

This way I get only the rows that are all zeros. I need a way to remove these 2 rows from the original input and get the output without them. Thanks, and sorry for the bad English, I'm working on that too.

2 Answers

2

For this dataframe:

df
Out: 
  id  a  b  c  d  e
0  a  2  0  2  0  1
1  b  1  0  1  1  1
2  c  1  0  0  0  1
3  d  2  0  2  0  2
4  e  0  0  0  0  2
5  f  0  0  0  0  0
6  g  0  2  1  0  2
7  h  0  0  0  0  0
8  i  1  2  2  0  2
9  j  2  2  1  2  1

Temporarily set the index:

df = df.set_index('id')

Drop rows containing all zeros and reset the index:

df = df[~(df==0).all(axis=1)].reset_index()

df
Out: 
  id  a  b  c  d  e
0  a  2  0  2  0  1
1  b  1  0  1  1  1
2  c  1  0  0  0  1
3  d  2  0  2  0  2
4  e  0  0  0  0  2
5  g  0  2  1  0  2
6  i  1  2  2  0  2
7  j  2  2  1  2  1
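
As a self-contained check, here is a minimal sketch of the same idea applied to the sample table from the question. It also hints at why the first attempt in the question removed too much: chaining `!= 0` tests with `&` keeps only rows where every column is non-zero, whereas the goal is to keep rows where at least one column is non-zero. The frame below is rebuilt by hand from the question's sample, and the names `value_cols`, `has_nonzero`, and `result` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question (placeholder column names a-d).
df = pd.DataFrame({
    "id": ["a", "b", "c", "d", "e"],
    "a": [0, 0, 0, 0, 1],
    "b": [1, 0, 0, 0, 0],
    "c": [5, 0, 0, 0, 0],
    "d": [0, 0, 0, 1, 1],
})

# Only the value columns take part in the test; "id" is left out.
value_cols = ["a", "b", "c", "d"]

# True for each row that has at least one non-zero value.
has_nonzero = (df[value_cols] != 0).any(axis=1)

# Keep those rows and renumber the index from 0.
result = df[has_nonzero].reset_index(drop=True)
print(result)
```

With the real spreadsheet, `value_cols` would presumably be the list of column names from the question (`OTC`, `TM`, `Lease`, and so on), assuming those columns hold numbers.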

2 Comments

Maybe also in one step: df[~(df.drop("id", axis=1) == 0).all(axis=1)]? Not sure if it's any better, though.
Yes, that's definitely a cleaner way for this example. But I've gotten into the habit of setting the index because when you have multiple/complex conditions it starts to look cleaner (df[(df>0) & (df<5) | (df==-1)])
1

Given your input, you can group by whether all the columns are zero or not, then access each group, e.g.:

groups = df.groupby((df.drop('id', axis=1) == 0).all(axis=1))
all_zero = groups.get_group(True)
non_all_zero = groups.get_group(False)
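
One caveat with this approach: `get_group` raises a `KeyError` when the requested key is missing, which would happen if the data has no all-zero rows (or only all-zero rows). A rough sketch that guards against that, using the sample table from the question rebuilt by hand; the `empty` fallback frame and the `is_all_zero` name are assumptions made for illustration:

```python
import pandas as pd

# Sample data copied from the question (placeholder column names a-d).
df = pd.DataFrame({
    "id": ["a", "b", "c", "d", "e"],
    "a": [0, 0, 0, 0, 1],
    "b": [1, 0, 0, 0, 0],
    "c": [5, 0, 0, 0, 0],
    "d": [0, 0, 0, 1, 1],
})

# True for each row whose value columns are all zero.
is_all_zero = (df.drop(columns="id") == 0).all(axis=1)
groups = df.groupby(is_all_zero)

# Empty frame with the same columns, used when a group does not exist.
empty = df.iloc[0:0]

# Check that the key exists before calling get_group.
all_zero = groups.get_group(True) if True in groups.groups else empty
non_all_zero = groups.get_group(False) if False in groups.groups else empty

print(non_all_zero)
```

If only the non-zero rows are needed, plain boolean indexing with `df[~is_all_zero]` avoids the grouping step altogether.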
