Python: How to read multiple Excel sheets into a list?

Question

From Excel, how can I parse 'Sheet1' and 'Sheet2' into a list? I'm currently using xlrd, as shown in the code below.

Sheet1:

Sheet2:

My code:

#!/usr/bin/env python2
# -*- coding: utf-8 -*-
from __future__ import print_function
import xlrd
import sys

loc = 'excel.xlsx'
wb = xlrd.open_workbook(loc, encoding_override="iso-8859-5, cyrillic")
wb_name = wb.sheet_names()
count = len(wb_name)
data = []
column_excel =('Name', 'Course', 'Cost', 'level')
count_column = len(column_excel)
for i in range(count):
    ow = xlrd.open_workbook('excel.xlsx').sheet_by_index(i)
    for x in range (0, 100):
        for i in range(2):
            try:
                if ow.cell_value(0, x) == column_excel[i]:
                    ips = ow.col_values(x, 1)
                    data.append(ips)
                    break
            except IndexError:
                continue
print(data)

My results:

[['Andre'], [1], [200], [5],
['Sam'], [2], [100], [8],
[7], ['Antony'], [4], [150],
[9], ['Ben'], [3], [500]]

Expected output:

[['Andre'], [1], [200], [5],
['Sam'], [2], [100], [8],
['Antony'], [4], [7], [150],
['Ben'], [3], [9], [500]]

s3dev · Accepted Answer · 2020-11-30 09:09:58Z


If you'd like to use pandas to read the XLSX file, rather than xlrd, the code becomes much simpler. Additionally, pandas aligns columns by name when DataFrames are combined (provided the column names are the same), which is helpful here since the sheets have a different column order.

See the official pandas read_excel documentation for details.
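
To see the column alignment in isolation, here is a minimal sketch that uses two in-memory DataFrames in place of the sheets. The values are copied from the output shown in this answer; the column order of the second frame is an assumption made purely for illustration. It combines the frames with pd.concat, which matches columns by name.

```python
import pandas as pd

# Stand-ins for the two sheets. The column order of sheet2 is assumed;
# it only needs to differ from sheet1 to show the alignment.
sheet1 = pd.DataFrame({'Name': ['Andre', 'Sam'], 'Course': [1, 2],
                       'Cost': [200, 100], 'level': [5, 8]})
sheet2 = pd.DataFrame({'level': [7, 9], 'Name': ['Anthony', 'Ben'],
                       'Course': [4, 3], 'Cost': [150, 500]})

# concat matches columns by name, not by position, and keeps the
# column order of the first frame.
combined = pd.concat([sheet1, sheet2], ignore_index=True)
records = combined.to_numpy().tolist()
print(records)
```

Even though 'level' comes first in the second frame, each value ends up under the matching column of the first.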

Sample code:
When read_excel() is called with a list of sheet names, it returns a dict of DataFrames (df_ below), keyed by sheet name. The final line combines these DataFrames into one.

import pandas as pd

df_ = pd.read_excel('courses.xlsx', sheet_name=['Sheet1', 'Sheet2'])
# DataFrame.append() was removed in pandas 2.0; pd.concat() aligns columns by name in the same way.
df = pd.concat(list(df_.values()), ignore_index=True)

Output (as a DataFrame):

      Name  Course  Cost  level
0    Andre       1   200      5
1      Sam       2   100      8
2  Anthony       4   150      7
3      Ben       3   500      9

Output (as a list):

>>> df.to_numpy().tolist()

[['Andre', 1, 200, 5],
 ['Sam', 2, 100, 8],
 ['Anthony', 4, 150, 7],
 ['Ben', 3, 500, 9]]

Acknowledgement:
This list output is not identical in format to the expected output in the question. I presume the format requested in the question is a design flaw, so this answer returns a list of records rather than a list of individual fields, since a flat list of fields may later become difficult to manage.
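
If the one-field-per-list layout from the question really is needed, a minimal sketch of the conversion follows. It assumes the records list shown in this answer and emits every row in Name, Course, Cost, level order; the variable name fields is illustrative only.

```python
# The list of records produced by df.to_numpy().tolist() in this answer.
records = [['Andre', 1, 200, 5],
           ['Sam', 2, 100, 8],
           ['Anthony', 4, 150, 7],
           ['Ben', 3, 500, 9]]

# Walk the records row by row and wrap each individual field in its own list.
fields = [[value] for row in records for value in row]
print(fields)
```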

edited Nov 30, 2020 at 9:09

answered Nov 30, 2020 at 8:56

s3dev


2 Comments

Dustin Over a year ago

Additional information: If you are using multiple Excel files, you can read them in a loop and append each resulting DataFrame to a list. You can later combine this list into one big DataFrame with big_df = pd.concat(list_variable)

s3dev Over a year ago

@Dustin - Yes, absolutely correct, thank you for the additional information.
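
The multi-file approach from the comments could be sketched roughly as follows. The function name combine_workbooks and its reader parameter are hypothetical, introduced here only for illustration; by default each file is read with pd.read_excel, which loads the first sheet of the workbook.

```python
import pandas as pd

def combine_workbooks(paths, reader=pd.read_excel):
    """Read every file in `paths` and stack the results into one DataFrame.

    `reader` is a hypothetical hook: any callable that takes a path and
    returns a DataFrame. It defaults to pd.read_excel.
    """
    # Collect one DataFrame per file, as suggested in the comment.
    frames = [reader(path) for path in paths]
    # Columns are aligned by name; ignore_index gives a fresh 0..n-1 index.
    return pd.concat(frames, ignore_index=True)
```

With real files this would be called as big_df = combine_workbooks(['first.xlsx', 'second.xlsx']), where the file names are placeholders. Note that pd.concat raises a ValueError when given an empty list, so paths should contain at least one file.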
