2

I want to display only filtered rows in the Excel output. Below is example,

df = pd.DataFrame({'Data': [1, 2, 3, 4, 
                               5, 6, 7]}) 

writer = pd.ExcelWriter('pandasEx.xlsx',  
                   engine ='xlsxwriter') 

df.to_excel(writer, sheet_name ='Sheet1') 
writer.save() 

For output I want to hide all rows where 'Data' < 5. How to do this? It is equivalent to applying filter and saving the excel.

I know how to remove dups in pandas or filter in pandas. But I do not want to remove them in pandas, I simply want to apply filter in excel. The use case is that user will have full data in excel but certain rows will be hidden, if user wants they can unhide them in excel and look at data. Hope this explains use case

Thank you

6
  • If dupe is wrong, let me know. Commented Jan 23, 2019 at 12:22
  • 1
    This isn't a dupe, at least of the linked question. The OP is asking how to filter rows in the XlsxWriter file that is created from Pandas, not how to filter in Pandas itself. Commented Jan 23, 2019 at 13:00
  • Having said that, the solution will involve finding rows that would be filtered in the dataframe and then applying them to the XlsxWriter file. So some parts of the linked question will be relevant. Commented Jan 23, 2019 at 13:02
  • @jezrael, I know how to remove dups in pandas. But I do not want to remove them in pandas, I simply want to apply filter in excel. The use case is that user will have full data in excel but certain rows will be hidden, if user wants they can unhide them in excel and look at data. Hope this explains use case. Commented Jan 23, 2019 at 13:29
  • @nilesh - reopened Commented Jan 23, 2019 at 13:38

2 Answers 2

5
df = pd.DataFrame({'Data': [1, 2, 3, 4, 
                               5, 6, 7]}) 

writer = pd.ExcelWriter('pandasEx.xlsx',  
                   engine ='xlsxwriter') 

df.to_excel(writer, sheet_name ='Sheet1') 
workbook = writer.book
worksheet1 = writer.sheets['Sheet1']

# Activate autofilter
worksheet1.autofilter(f'B1:B{len(df)}')
worksheet1.filter_column('B', 'x < 5')

# Hide the rows that don't match the filter criteria.
for idx, row_data in df.iterrows():
    region = row_data['Data']
    if not (region < 5):
        # We need to hide rows that don't match the filter.
        worksheet1.set_row(idx + 1, options={'hidden': True})

writer.save() 


   
Sign up to request clarification or add additional context in comments.

1 Comment

Good answer. One small correction. You need to filter on column B instead of A since the dataframe puts an index in column A and the data in column B.
0

Create a dataframe with the filtered data and write this to the excel file:

import pandas as pd

df = pd.DataFrame({'Data': [1, 2, 3, 4, 5, 6, 7]}) 

writer = pd.ExcelWriter('pandasEx.xlsx') 

df_filtered = df.loc[df.Data >= 5]

df_filtered.to_excel(writer, sheet_name ='Sheet1') 
writer.save() 

Remark: Had to remove xlswriter module since I do not have it on my system, but code should work with it as well.

3 Comments

The OP wants all the data but a filter in the excel file i think
@anky_91: If that's the case, xlsxwriter.readthedocs.io/working_with_autofilters.html might be worth a look.
@albert I know how to filter in pandas. But the question is about how to filter in apply filter in excel writing.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.