Display only filtered rows in Excel writer output using pandas

Question

I want to display only filtered rows in the Excel output. Below is an example:

import pandas as pd

df = pd.DataFrame({'Data': [1, 2, 3, 4, 5, 6, 7]})

writer = pd.ExcelWriter('pandasEx.xlsx', engine='xlsxwriter')

df.to_excel(writer, sheet_name='Sheet1')
writer.close()  # writer.save() was removed in pandas 2.0

In the output I want to hide all rows where 'Data' < 5. How can I do this? It is equivalent to applying a filter in Excel and then saving the file.

I know how to remove duplicates or filter rows in pandas, but I do not want to remove anything in pandas. I simply want to apply a filter in Excel. The use case is that the user gets the full data in the Excel file with certain rows hidden; if they want, they can unhide those rows in Excel and look at the data.

Thank you

This isn't a dupe, at least not of the linked question. The OP is asking how to filter rows in the XlsxWriter file that is created from Pandas, not how to filter in Pandas itself.
– jmcnamara, Commented Jan 23, 2019 at 13:00
Having said that, the solution will involve finding the rows that would be filtered in the dataframe and then applying them to the XlsxWriter file. So some parts of the linked question will be relevant.
– jmcnamara, Commented Jan 23, 2019 at 13:02
@jezrael, I know how to remove duplicates in pandas, but I do not want to remove them in pandas. I simply want to apply a filter in Excel. The use case is that the user gets the full data in the Excel file with certain rows hidden; if they want, they can unhide those rows in Excel and look at the data.
– nilesh, Commented Jan 23, 2019 at 13:29
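
To make the "find the rows, then apply them to the XlsxWriter file" idea from the comments concrete, here is a minimal sketch of the first half only. The helper name rows_to_hide and its header_rows parameter are hypothetical, not part of pandas or XlsxWriter. It assumes to_excel writes a single header row above the data, and it returns zero-based row numbers of the kind XlsxWriter's set_row expects.

```python
def rows_to_hide(values, keep, header_rows=1):
    """Return zero-based worksheet row numbers for values that fail `keep`.

    `header_rows` accounts for the header line that DataFrame.to_excel
    writes above the data (assumed to be a single row here).
    """
    return [
        position + header_rows
        for position, value in enumerate(values)
        if not keep(value)
    ]


# Same sample data as in the question; a pandas Series such as df['Data']
# can be passed in the same way because it is iterable.
data = [1, 2, 3, 4, 5, 6, 7]

hidden = rows_to_hide(data, keep=lambda value: value >= 5)
print(hidden)  # [1, 2, 3, 4]
```

Working from positions rather than index labels means the row numbers should still line up if the dataframe does not have the default 0..n-1 index.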

Roelant · Accepted Answer · 2020-12-01 10:43:39Z

5

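import pandas as pd

df = pd.DataFrame({'Data': [1, 2, 3, 4, 5, 6, 7]})

writer = pd.ExcelWriter('pandasEx.xlsx', engine='xlsxwriter')

df.to_excel(writer, sheet_name='Sheet1')
workbook = writer.book
worksheet1 = writer.sheets['Sheet1']

# Activate the autofilter on column B (column A holds the index).
# The range covers the header in row 1 plus len(df) data rows.
worksheet1.autofilter(f'B1:B{len(df) + 1}')
# Show only the rows where Data >= 5, i.e. hide Data < 5 as the question asks.
worksheet1.filter_column('B', 'x >= 5')

# XlsxWriter does not hide rows itself, so hide the non-matching ones by hand.
for pos, value in enumerate(df['Data']):
    if not (value >= 5):
        # Worksheet rows are zero-based; +1 skips the header row.
        worksheet1.set_row(pos + 1, options={'hidden': True})

# writer.save() was removed in pandas 2.0; close() writes the file.
writer.close()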

edited Dec 1, 2020 at 10:43

Roelant

answered Jan 24, 2019 at 5:58

nilesh

1 Comment

jmcnamara Over a year ago

Good answer. One small correction. You need to filter on column B instead of A since the dataframe puts an index in column A and the data in column B.
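
As a rough illustration of the column point above, the sketch below works out which column letter the data lands in and what range an autofilter would need to cover. Both function names are hypothetical helpers written for this example. It assumes a single header row and one index column unless index_levels says otherwise (0 corresponds to calling to_excel with index=False). XlsxWriter also ships its own converter, xlsxwriter.utility.xl_col_to_name, which could replace the hand-written column_letter.

```python
def column_letter(zero_based_col):
    """Convert a zero-based column number to an Excel column letter (0 -> 'A')."""
    letters = ""
    number = zero_based_col + 1
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def autofilter_range(n_data_rows, data_col, index_levels=1):
    """Build an A1-style range covering the header row plus all data rows.

    `index_levels` is the number of index columns written to the left of the
    data (1 for a default DataFrame, 0 when to_excel is called with index=False).
    """
    letter = column_letter(index_levels + data_col)
    last_row = n_data_rows + 1  # one extra row for the header
    return f"{letter}1:{letter}{last_row}"


print(autofilter_range(7, data_col=0))                  # B1:B8
print(autofilter_range(7, data_col=0, index_levels=0))  # A1:A8
```

With the seven-row dataframe from the question, the data sits in B2:B8, so the filter range has to run from the header in B1 down to B8.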

albert · Accepted Answer · 2019-01-23 12:21:31Z

0

Create a dataframe with the filtered data and write this to the excel file:

import pandas as pd

df = pd.DataFrame({'Data': [1, 2, 3, 4, 5, 6, 7]}) 

writer = pd.ExcelWriter('pandasEx.xlsx') 

df_filtered = df.loc[df.Data >= 5]

df_filtered.to_excel(writer, sheet_name='Sheet1')
writer.close()  # writer.save() was removed in pandas 2.0

Remark: I had to drop the xlsxwriter engine because it is not installed on my system, but the code should work with it as well.

answered Jan 23, 2019 at 12:21

albert

3 Comments

anky Over a year ago

I think the OP wants all the data in the Excel file, just with a filter applied.

albert Over a year ago

@anky_91: If that's the case, xlsxwriter.readthedocs.io/working_with_autofilters.html might be worth a look.

nilesh Over a year ago

@albert I know how to filter in pandas. The question is about how to apply a filter in Excel when writing the file.
