python + pandas - dump df to excel in a loop

Question

I have a dict containing hundreds of pandas DataFrames (dfs).

I want to loop through each df in the dict and dump it into excel, all on a single sheet, one after another with 1 blank row inbetween.

My attempt:

writer = pd.ExcelWriter('output.xlsx', engine='xlsxwriter')
workbook = writer.book

for key, values in dd.iteritems():
    df = dd[key]['chart_data']
    df.to_excel(writer, sheet_name='Sheet 1', index=False)

writer.save()
workbook.close()

I think it overwrites the dfs.

Any suggestions?

It is not that the df is overwritten; you write to the same part of the Excel sheet every time, so each write physically overwrites the previous one. Pass startrow to .to_excel(). By default it writes from the top-left cell, which is not what you want here.
– Mark_Anderson, Commented Nov 16, 2018 at 14:33

Charles Landau · Accepted Answer · 2018-11-16 14:41:52Z

startrow sounds like your solution:

start_row = 0
for key, values in dd.items():  # dd.iteritems() on Python 2
    df = values['chart_data']
    df.to_excel(writer, sheet_name='Sheet 1', index=False, startrow=start_row)
    # Each df occupies 1 header row plus len(df) data rows (or df.shape[0]
    # et cetera); add 1 more to leave 1 blank row after each df
    start_row += len(df) + 2

startrow simply picks the zero-based row at which to_excel starts writing. You may also want to specify startcol, which works on the same principle for columns, but I think this works as-is.
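
As a rough sketch of the same idea, the offset arithmetic can be pulled into a small helper. The names stacked_start_rows and write_stacked and the gap parameter are made up for illustration and are not part of pandas. The sketch assumes each df has a single header row and that an Excel engine such as xlsxwriter or openpyxl is installed:

```python
import pandas as pd


def stacked_start_rows(frames, gap=1, header=True):
    """Return the startrow for each frame so they stack with `gap` blank rows."""
    rows = []
    next_row = 0
    for df in frames:
        rows.append(next_row)
        # A frame occupies len(df) data rows plus one header row (if written).
        next_row += len(df) + (1 if header else 0) + gap
    return rows


def write_stacked(frames, path, sheet_name="Sheet 1", gap=1):
    """Write every frame to a single sheet, one below another (hypothetical helper)."""
    frames = list(frames)
    # The context manager saves and closes the workbook on exit,
    # so no explicit save() or close() call is needed.
    with pd.ExcelWriter(path) as writer:
        for df, row in zip(frames, stacked_start_rows(frames, gap=gap)):
            df.to_excel(writer, sheet_name=sheet_name, index=False, startrow=row)
```

With the dict from the question, a call along the lines of write_stacked((v['chart_data'] for v in dd.values()), 'output.xlsx') should produce the stacked sheet.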

Edit: another, perhaps better way is to concat. Something like:

df = pd.concat([values["chart_data"] for values in dd.values()])
df.to_excel(...)

But that only works if the combined df fits in memory alongside the dict. It also writes a single header and no blank rows between the original dfs.
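
If the concat route is taken, one way to keep track of which dict entry each row came from is to tag the rows before combining. This is only a sketch: combine_chart_data and the "source" column are hypothetical names, and it assumes the dfs share the same columns:

```python
import pandas as pd


def combine_chart_data(dd):
    """Stack every dd[key]['chart_data'] into one frame, tagged by its key."""
    pieces = [
        # assign() returns a copy, so the frames stored in dd are not modified.
        entry["chart_data"].assign(source=key)
        for key, entry in dd.items()
    ]
    # ignore_index=True renumbers the rows 0..n-1 instead of repeating
    # each frame's original index.
    return pd.concat(pieces, ignore_index=True)
```

The result can then be written with a single to_excel call, at the cost of losing the blank rows and repeated headers that the startrow loop gives.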

edited Nov 16, 2018 at 14:41

answered Nov 16, 2018 at 14:19


6 Comments

Ahmad Khan Over a year ago

If it doesn't fit in memory, how come it is in dict already?

Charles Landau Over a year ago

Because now the dict and the df must fit in memory @MuhammadAhmad

Mark_Anderson Over a year ago

start_row = start_row + len(df)+1 to get the blank row between df dumps? Or start_row += len(df)+1 for a tiny, tiny gain in speed and memory.

Charles Landau Over a year ago

I don't see any reason for a blank row between dumps and it wasn't in the question @Mark_Anderson. Although if the answer as written overwrites the last row of the previous df on each iteration then your edit would be mandatory. I don't think it does but I can't test right now

Mark_Anderson Over a year ago

It's in the question, second sentence: "I want to loop through each df in the dict and dump it into excel, all on a single sheet, one after another with 1 blank row inbetween."
