Iterating Excel sheets with Python

Question

I have a large Excel file (137 MB) with 50 consumers, each on a separate sheet. To avoid creating 50 variables by hand, I would like to know how to take a specific column from each sheet. For example, I want to save the values in row 3 of each sheet in a separate variable. Or, if it is easier, save them all in one variable and then split it into 50 variables based on the number of rows.

Don't use separate variables; use a list or dictionary, and then you can use a for-loop to work with all the values.
– furas, Commented Aug 5, 2022 at 16:59

mv1999 · Accepted Answer · 2022-08-05 15:54:27Z

Someone recommended Pandas, but I have used openpyxl to do similar things.

First you can import openpyxl and load the workbook like this:

import openpyxl  # openpyxl needs to be installed via pip
workbook = openpyxl.load_workbook(PATHTOYOURWORKSHEET)  # replace with the path to your .xlsx file

To access a specific sheet of the workbook (as you mentioned, each consumer has a separate sheet), you do this:

worksheet1 = workbook["THE NAME OF YOUR SHEET"]
#you can call each whatever you want, this is how you get into a specific sheet.

Then, to select a specific column, you could do something like the following. This example is for a row, not a column, but according to the openpyxl docs the column equivalent is iter_cols, which iterates over every cell in the columns within the bounds you pass:

# worksheet1 is the sheet selected above; this yields the values of row 2
sheet1Row2 = worksheet1.iter_rows(min_row=2, max_row=2, min_col=1, max_col=worksheet1.max_column, values_only=True)

Using this, you could loop over each sheet in the workbook, target the row you want, and then do whatever you want with it: add it to a list, save each one under its own name, etc.
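
The loop described above could be sketched roughly as follows. This is only an illustration under some assumptions: the helper name column_per_sheet, the demo file name, and the sheet names consumer_1 and consumer_2 are all made up for the example. It opens the workbook with read_only=True, which may help with a file as large as the one in the question, and uses iter_rows with min_col equal to max_col to pull a single column from every sheet into a dictionary.

```python
import os
import tempfile

import openpyxl  # third-party: pip install openpyxl


def column_per_sheet(path, column_index):
    """Return {sheet name: list of values in the given 1-based column}."""
    # read_only=True streams rows instead of loading the whole file at once.
    workbook = openpyxl.load_workbook(path, read_only=True)
    try:
        result = {}
        for worksheet in workbook.worksheets:
            # min_col == max_col restricts each yielded row to one cell.
            rows = worksheet.iter_rows(
                min_col=column_index, max_col=column_index, values_only=True
            )
            result[worksheet.title] = [row[0] for row in rows]
        return result
    finally:
        workbook.close()


# Build a tiny two-sheet demo workbook (hypothetical data) so the sketch runs.
demo_path = os.path.join(tempfile.mkdtemp(), "consumers_demo.xlsx")
demo = openpyxl.Workbook()
first = demo.active
first.title = "consumer_1"
for record in (["hour", "load"], [1, 10], [2, 20]):
    first.append(record)
second = demo.create_sheet("consumer_2")
for record in (["hour", "load"], [1, 30], [2, 40]):
    second.append(record)
demo.save(demo_path)

# One dictionary entry per sheet instead of one variable per sheet.
columns = column_per_sheet(demo_path, column_index=2)
for name, values in columns.items():
    print(name, values)
```

A dictionary keyed by sheet name follows the suggestion from the comments: the 50 sheets become 50 entries that can be looped over, instead of 50 separate variables.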

Hope it helps

Umar.H · Accepted Answer · 2022-08-06 13:31:34Z


You can read the entire workbook without specifying a sheet (sheet_name=None), which gives you a dictionary of DataFrames keyed by sheet name.

You can then use a list comprehension or a loop over that dictionary.

MCVE

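import pandas as pd

# dtype=int keeps the dummies as 0/1 (newer pandas versions default to True/False)
df1 = pd.get_dummies(list('ABCD'), dtype=int)

# copy, so that changing df2 does not also change df1
df2 = df1.copy()
df2['B'] = 5

# the context manager saves the file on exit (writer.save() was removed in pandas 2.0)
with pd.ExcelWriter('workbook.xlsx') as writer:
    df1.to_excel(writer, sheet_name='test1')
    df2.to_excel(writer, sheet_name='test2')

print(df1)

   A  B  C  D
0  1  0  0  0
1  0  1  0  0
2  0  0  1  0
3  0  0  0  1

print(df2)

   A  B  C  D
0  1  5  0  0
1  0  5  0  0
2  0  5  1  0
3  0  5  0  1

# sheet_name=None reads every sheet into a dict of DataFrames
df = pd.read_excel('workbook.xlsx', sheet_name=None)

print(df)

{'test1':    Unnamed: 0  A  B  C  D
0           0  1  0  0  0
1           1  0  1  0  0
2           2  0  0  1  0
3           3  0  0  0  1, 'test2':    Unnamed: 0  A  B  C  D
0           0  1  5  0  0
1           1  0  5  0  0
2           2  0  5  1  0
3           3  0  5  0  1}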


new = pd.concat(df.values())[["A", "C"]]

print(new)

   A  C
0  1  0
1  0  0
2  0  1
3  0  0
0  1  0
1  0  0
2  0  1
3  0  0
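
If each sheet's column should stay separate instead of being concatenated, a dict comprehension over the sheet dictionary is one option. The following is a minimal sketch: the in-memory sheets dictionary is a hypothetical stand-in for what pd.read_excel('workbook.xlsx', sheet_name=None) returns, so the example runs without a file on disk, and the name column_a is made up for illustration.

```python
import pandas as pd

# Stand-in for pd.read_excel('workbook.xlsx', sheet_name=None),
# which returns a dict mapping sheet names to DataFrames.
sheets = {
    "test1": pd.DataFrame({"A": [1, 0, 0, 0], "C": [0, 0, 1, 0]}),
    "test2": pd.DataFrame({"A": [1, 0, 0, 0], "C": [0, 0, 1, 0]}),
}

# One entry per sheet: sheet name -> values of column "A" as a plain list.
column_a = {name: frame["A"].tolist() for name, frame in sheets.items()}

for name, values in column_a.items():
    print(name, values)
```

With 50 sheets this gives 50 dictionary entries, each reachable by sheet name, for example column_a["test1"].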

edited Aug 6, 2022 at 13:31

answered Aug 5, 2022 at 15:34


2 Comments

Tams Over a year ago

Umar.H, do you know how to get only the values in the array, without the A and C labels or the 0 1 2 3 0 1 2 3 index? In your example, the array should just be [[1,0,0,0,1,0,0,0], [0,0,1,0,0,0,1,0]].

Umar.H Over a year ago

sorry @Tams I don't understand
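
One possible reading of Tams's comment is that only the raw values are wanted, with one inner list per column. Under that assumption, a sketch could look like this. The frame new is rebuilt by hand here as a stand-in for the concatenated result in the answer, and the variable name values is made up for the example.

```python
import pandas as pd

# Stand-in for the concatenated frame `new` from the answer above.
new = pd.DataFrame(
    {"A": [1, 0, 0, 0, 1, 0, 0, 0], "C": [0, 0, 1, 0, 0, 0, 1, 0]},
    index=[0, 1, 2, 3, 0, 1, 2, 3],
)

# .to_numpy() drops the index and column labels;
# .T transposes so that each column becomes one row of the array.
values = new.to_numpy().T
print(values.tolist())
```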
