Given the following input file ("ToSplit2.xlsx"):

+-----------------+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Section One     |     |     |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 1   | 100 |     |     |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 2   | 100 | 200 |     |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 3   | 100 | 200 | 300 |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 4   | 100 | 200 | 300 | 400 |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 5   | 100 | 200 | 300 | 400 | 500 |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 6   | 100 | 200 | 300 | 400 | 500 | 600 |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 7   | 100 | 200 | 300 | 400 | 500 | 600 | 700 |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 8   | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 9   | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 | 900 |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 10  | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 | 900 | 1000 |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
|           |     |     |     |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Section Two     |     |     |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 1   | 100 |     |     |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 2   | 100 | 200 |     |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 3   | 100 | 200 | 300 |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 4   | 100 | 200 | 300 | 400 |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 5   | 100 | 200 | 300 | 400 | 500 |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 6   | 100 | 200 | 300 | 400 | 500 | 600 |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 7   | 100 | 200 | 300 | 400 | 500 | 600 | 700 |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 8   | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 9   | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 | 900 |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 10  | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 | 900 | 1000 |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
|           |     |     |     |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Section Three   |     |     |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 1   | 100 |     |     |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 2   | 100 | 200 |     |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 3   | 100 | 200 | 300 |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 4   | 100 | 200 | 300 | 400 |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 5   | 100 | 200 | 300 | 400 | 500 |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 6   | 100 | 200 | 300 | 400 | 500 | 600 |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 7   | 100 | 200 | 300 | 400 | 500 | 600 | 700 |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 8   | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 9   | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 | 900 |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 10  | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 | 900 | 1000 |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+

And the following Python code:

import pandas as pd
import numpy as np

spreadsheetPath = "ToSplit2.xlsx"
xls = pd.ExcelFile(spreadsheetPath)

# Iterate through worksheets in opened Excel file
for sheet in xls.sheet_names:
    # Create a Pandas dataframe from the Excel worksheet (with no headers)
    excel_data_df = pd.read_excel(
        spreadsheetPath, sheet_name=sheet, header=None)

    # Return a list of dataframe index values where entire row is blank
    indexList = excel_data_df[excel_data_df.isnull().all(1)].index.tolist()

    # Prints [11, 23]
    print(indexList)

    # Initiate a dictionary
    dataframeDictionary = {}

    # For every index value in the list
    for index in indexList:
        # Split and add the result to the dictionary of pandas dataframes
        dataframeDictionary = np.array_split(excel_data_df, index)

    # For every pandas dataframe in the dataframe dictionary
    for dataframe in dataframeDictionary:
        # Write the pandas dataframe to Excel with a worksheet name equal to dataframe address 0,0
        dataframe.to_excel("output.xlsx",sheet_name=str(dataframe.iloc[0][0]))

I am trying to split the Excel worksheet into multiple worksheets, one per section, using the blank rows as separators. For example:

Section One: (there would also be Section Two and Section Three worksheets)

+-----------------+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Section One     |     |     |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 1   | 100 |     |     |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 2   | 100 | 200 |     |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 3   | 100 | 200 | 300 |     |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 4   | 100 | 200 | 300 | 400 |     |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 5   | 100 | 200 | 300 | 400 | 500 |     |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 6   | 100 | 200 | 300 | 400 | 500 | 600 |     |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 7   | 100 | 200 | 300 | 400 | 500 | 600 | 700 |     |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 8   | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 |     |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 9   | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 | 900 |      |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+
| Label 10  | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 | 900 | 1000 |
+-----------+-----+-----+-----+-----+-----+-----+-----+-----+-----+------+

I believe I am really close, but seem to be slipping up on the data frame splitting.

  • You can use a loop to find the blank rows; at each one, end the current worksheet and start a new one. Commented Sep 21, 2020 at 19:35
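The loop this comment describes could look roughly like the sketch below. It works on plain Python lists rather than the actual workbook; `split_on_blank_rows` and the sample rows are hypothetical, with `None` standing in for an empty cell.

```python
def split_on_blank_rows(rows):
    """Group consecutive non-blank rows; a fully blank row ends the current group."""
    sections, current = [], []
    for row in rows:
        if all(cell is None for cell in row):
            # Blank row: close the current section (if any) and start a new one
            if current:
                sections.append(current)
                current = []
        else:
            current.append(row)
    # Keep the last section, which is not followed by a blank row
    if current:
        sections.append(current)
    return sections


# Hypothetical sample data shaped like the worksheet in the question
rows = [
    ["Section One", None],
    ["Label 1", 100],
    [None, None],
    ["Section Two", None],
    ["Label 1", 100],
]
sections = split_on_blank_rows(rows)
```

Each entry in `sections` would then be written to its own worksheet.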

1 Answer

Change the file names below to match your own.

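import pandas as pd

# Read the Excel file without headers; blank rows come through as all-NaN rows
df = pd.read_excel('ToSplit2.xlsx', header=None)

# Positions of the fully blank rows
blank_rows = df[df.isnull().all(axis=1)].index.tolist()

# Split at each blank row (the same cuts as np.split(df, blank_rows), done with iloc)
bounds = [0] + blank_rows + [len(df)]
df_list = [df.iloc[start:end] for start, end in zip(bounds[:-1], bounds[1:])]

# Create a new workbook and write each piece to its own worksheet;
# the context manager saves and closes the file on exit
with pd.ExcelWriter('Excel_one.xlsx', engine='xlsxwriter') as writer:
    for i, part in enumerate(df_list, start=1):
        # Drop the blank separator row that leads every piece after the first
        part = part.dropna(how='all')
        part.to_excel(writer, sheet_name='Sheet{}'.format(i), header=False, index=False)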
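
For context on why the question's code did not work: `np.array_split(df, n)` with an integer `n` cuts the frame into `n` roughly equal parts instead of cutting at row `n`, the loop overwrites the result on every pass, and each `to_excel("output.xlsx", ...)` call with a plain path rewrites the file. If the worksheets should also be named after the first cell of each section, as the question asks, one possible variant is sketched below. It is not part of the original answer: `split_sections` and `write_sections` are hypothetical helper names, and the small DataFrame stands in for the real workbook.

```python
import pandas as pd


def split_sections(df):
    """Return one DataFrame per block of rows separated by fully blank rows."""
    blank = df.isnull().all(axis=1)
    # cumsum() gives every row the count of blank rows seen so far,
    # so all rows between two blank rows share the same group id
    group_id = blank.cumsum()
    kept = df[~blank]
    return [part for _, part in kept.groupby(group_id[~blank])]


def write_sections(sections, path):
    """Write each section to its own worksheet, named after its first cell."""
    with pd.ExcelWriter(path) as writer:
        for part in sections:
            # Excel limits worksheet names to 31 characters
            name = str(part.iloc[0, 0])[:31]
            part.to_excel(writer, sheet_name=name, header=False, index=False)


# Small stand-in for the worksheet (None marks an empty cell)
df = pd.DataFrame([
    ["Section One", None],
    ["Label 1", 100],
    [None, None],
    ["Section Two", None],
    ["Label 1", 100],
])
sections = split_sections(df)
```

With the real file, `write_sections(split_sections(pd.read_excel('ToSplit2.xlsx', header=None)), 'output.xlsx')` should give one worksheet per section, assuming an Excel writer engine such as openpyxl or xlsxwriter is installed and the section titles are valid, unique sheet names.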