I have a bunch of worksheets in a single Excel file, all in the same format. I need to take just one column from each worksheet and combine that data in a new worksheet in a new Excel file.

Example of goal

I have struggled to find a solution to this, but I have not gotten any results yet.

I would appreciate any help or links to helpful materials.

1 Answer

You can use pd.read_excel() and iterate over each sheet in the workbook.

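import pandas as pd

col = "Open"  # name of the column to pull from every sheet
# sheet_name=None returns a dict mapping each sheet name to its DataFrame
data = pd.read_excel( YOUR_FILE, sheet_name=None )
output_dict = dict()
for i, sheet in enumerate( data.keys() ):
    temp_df = data[sheet]
    # e.g. "Open0", "Open1", ... one output column per sheet
    new_col_name = col + str(i)
    output_dict[ new_col_name ] = temp_df[ col ]
final_df = pd.DataFrame( data=output_dict )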

Inside the for loop, temp_df is the pandas DataFrame for one individual sheet. Each iteration takes the column col from that sheet and stores it under the name new_col_name, which is col followed by the sheet's position (for example "Open0", "Open1"). In your example, col would be "Open".
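As a rough sketch of the same idea, the loop can be replaced with a single pd.concat call, which also keeps the sheet names as column headers. The helper name combine_column, the sample sheets, and the file names in the comments are assumptions made for illustration; the in-memory dict stands in for what pd.read_excel( YOUR_FILE, sheet_name=None ) would return.

```python
import pandas as pd


def combine_column(sheets, col):
    """Return a DataFrame holding column `col` from every sheet, one column per sheet."""
    # Passing a dict to pd.concat uses its keys (the sheet names) as the new column names.
    return pd.concat({name: df[col] for name, df in sheets.items()}, axis=1)


# Stand-in for: sheets = pd.read_excel("input.xlsx", sheet_name=None)
sheets = {
    "Sheet1": pd.DataFrame({"Date": ["2020-01-01", "2020-01-02"], "Open": [1.0, 2.0]}),
    "Sheet2": pd.DataFrame({"Date": ["2020-01-01", "2020-01-02"], "Open": [3.0, 4.0]}),
}
combined = combine_column(sheets, "Open")
print(combined)

# To save the result as a new worksheet in a new workbook (hypothetical file name):
# combined.to_excel("combined.xlsx", sheet_name="Combined", index=False)
```

Rows are aligned by index, so if the sheets have different lengths the shorter columns are padded with NaN. Reading and writing .xlsx files typically requires an Excel engine such as openpyxl to be installed.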

3 Comments

Thank you for your help, but I still have some problems. This code raises AttributeError: 'dict' object has no attribute 'sheet_names', and the same happens with data.parse. Do you know how to deal with this? Also, I am quite new to Python, so a detailed explanation would be a plus.
Fixed the code in the answer to correctly treat 'data' as a dictionary.
Somewhat broadly, it's much (orders of magnitude) more efficient to use the pandas/numpy methods rather than creating intermediate Python objects, though it may not matter for this case, especially where readability is critical and there's probably not a huge number of rows or some web client waiting for completion.
