I am using Azure Databricks and Azure Storage Explorer. I have an Excel file of under 30 MB containing multiple sheets. Every month, when I run this code, I want to replace the data in one sheet; the remaining sheets contain pivot tables, built on that data sheet, that are used for reporting. I want to overwrite the data sheet alone each month so that the other sheets refresh automatically.
I am completely new to PySpark and Azure. This seems to be possible using pandas and openpyxl, but pandas does not recognize a file path pointing to Azure Data Lake Storage. From what I have read so far, it does not seem possible to overwrite part of an existing file using pyspark.pandas.DataFrame. I believe I have two options:
- Find a way to make pandas recognize the ADLS path.
- Overwrite part of an Excel file using PySpark.
Please correct me if I am wrong. I would be grateful for any pointers.