Python Pandas not creating multiple tabs in excel file?

Question

I have a Python script that pulls from a third-party API. The script runs for 3 different cities in a loop and creates a DataFrame for each city. I then write each DataFrame to an Excel file as a separate tab. Below is the code.

    sublocation_ids = [
        {
            "id": 163,
            "name": "Atlanta, GA"
        },
        {
            "id": 140,
            "name": "Austin, TX"
        },
        {
            "id": 164,
            "name": "Baltimore, MD"
        }
    ]
    filter_text = "(headline:coronavirus OR summary:coronavirus OR headline:covid-19 OR summary:covid-19) AND categories:{}"

    writer = pd.ExcelWriter(excel_path)
    for sub in sublocation_ids:
        city_num_int = sub['id']
        city_num_str = str(city_num_int)
        city_name = sub['name']
        filter_text_new = filter_text.format(city_num_str)
        data = json.dumps({"filters": [filter_text_new], "sort_by":"created_at", "size":2})
        r = requests.post(url = api_endpoint, data = data).json()
        articles_list = r["articles"]
        articles_list_normalized = json_normalize(articles_list)
        df = articles_list_normalized
        df['publication_timestamp'] = pd.to_datetime(df['publication_timestamp'])
        df['publication_timestamp'] = df['publication_timestamp'].apply(lambda x: x.now().strftime('%Y-%m-%d'))
        df.to_excel(writer, sheet_name = city_name)
        writer.save()

The issue I am facing is that only one tab is created in the Excel file, for the first city ("Atlanta, GA") I pull data for from the API. How can I create a tab for each city in the list, or does my code have an issue?

I see two possible errors. First, where is writer initialised? Outside of the loop? Second, you're calling writer.save() on every iteration, thus overwriting the sheet each time. Call it once, after the loop has finished.
– Umar.H, Commented Apr 6, 2020 at 0:53

Arne · Accepted Answer · 2020-04-06 01:02:41Z

2

See this bit from the df.to_excel() documentation:

If you wish to write to more than one sheet in the workbook, it is necessary to specify an ExcelWriter object:

df2 = df1.copy()
with pd.ExcelWriter('output.xlsx') as writer:  
    df1.to_excel(writer, sheet_name='Sheet_name_1')
    df2.to_excel(writer, sheet_name='Sheet_name_2')

So you may need to pull writer.save() outside of the loop.
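
The documentation pattern extends naturally to a loop. Below is a minimal sketch, assuming a hypothetical city_frames dictionary that stands in for the per-city DataFrames built from the API, and a temporary output path. The with-block saves the file once on exit, so there is no explicit save call inside the loop.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-ins for the per-city DataFrames built from the API response.
city_frames = {
    "Atlanta, GA": pd.DataFrame({"headline": ["a1", "a2"]}),
    "Austin, TX": pd.DataFrame({"headline": ["b1", "b2"]}),
    "Baltimore, MD": pd.DataFrame({"headline": ["c1", "c2"]}),
}

# Assumed output location: a fresh temporary directory.
excel_path = os.path.join(tempfile.mkdtemp(), "cities.xlsx")

# The workbook is written to disk once, when the with-block exits.
with pd.ExcelWriter(excel_path) as writer:
    for city_name, df in city_frames.items():
        df.to_excel(writer, sheet_name=city_name, index=False)

# Read the workbook back to check that there is one tab per city.
with pd.ExcelFile(excel_path) as workbook:
    sheet_names = workbook.sheet_names

print(sheet_names)
```

This requires an Excel engine such as openpyxl or xlsxwriter to be installed alongside pandas.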


1 Comment

Piyush Patil Over a year ago

Sorry, I forgot to include that: I have already defined the object. Check the edited code above.

Umar.H · Accepted Answer · 2020-04-06 01:12:29Z

1

I can't speak for your code as I can't run it; 'filter_text' seems to be something you've written but not included.

Essentially, I can see two possible errors.

First, it's not clear where you are initialising the writer object.

Second, you're calling writer.save() on every iteration of the loop, overwriting the file each time. Move it outside of the loop.

pd.ExcelWriter can be used as a context manager; if you don't use it that way, you need to close/save it yourself. Its close method is just a synonym for save:

def close(self):
    """synonym for save, to make it more file-like"""
    return self.save()

# Create the writer once, before the loop.
writer = pd.ExcelWriter('file.xlsx')

for sub in sublocation_ids:
    city_num_int = sub['id']
    city_num_str = str(city_num_int)
    city_name = sub['name']
    filter_text_new = filter_text.format(city_num_str)
    data = json.dumps({"filters": [filter_text_new], "sort_by":"created_at", "size":2})
    r = requests.post(url = api_endpoint, data = data).json()
    articles_list = r["articles"]
    df = json_normalize(articles_list)
    df['publication_timestamp'] = pd.to_datetime(df['publication_timestamp'])
    # Format each article's own date (x.now() would return today's date instead).
    df['publication_timestamp'] = df['publication_timestamp'].dt.strftime('%Y-%m-%d')
    # Each call adds one sheet to the same workbook.
    df.to_excel(writer, sheet_name = city_name)

# Save once, after all sheets have been written.
# On pandas 2.0+, save() has been removed; call writer.close() instead.
writer.save()
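
Separately from the save issue, the date-formatting line in the question's loop calls x.now(), which ignores the stored value and returns the current time, so every row would end up with today's date. A minimal sketch of formatting the stored timestamps instead, assuming the intent is to keep each article's own publication date; the sample values and the publication_date column name are hypothetical:

```python
import pandas as pd

# Hypothetical sample standing in for the API's publication_timestamp column.
df = pd.DataFrame(
    {"publication_timestamp": ["2020-04-01T08:30:00Z", "2020-04-03T17:45:00Z"]}
)

# Parse the strings into timezone-aware timestamps.
df["publication_timestamp"] = pd.to_datetime(df["publication_timestamp"])

# .dt.strftime formats each stored timestamp as a date string.
df["publication_date"] = df["publication_timestamp"].dt.strftime("%Y-%m-%d")

print(df["publication_date"].tolist())
```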

Sheets as dictionaries

If you're curious about the innards of the class, use .__dict__ on the object to see its metadata.

writer = pd.ExcelWriter('file.xlsx')

df.to_excel(writer,sheet_name='Sheet1')
df.to_excel(writer,sheet_name='Sheet2')
print(writer.__dict__)

{'path': 'file.xlsx',
 'sheets': {'Sheet1': <xlsxwriter.worksheet.Worksheet at 0x11a05a79a88>,
  'Sheet2': <xlsxwriter.worksheet.Worksheet at 0x11a065218c8>},
 'cur_sheet': None,
 'date_format': 'YYYY-MM-DD',
 'datetime_format': 'YYYY-MM-DD HH:MM:SS',
 'mode': 'w',
 'book': <xlsxwriter.workbook.Workbook at 0x11a064ff1c8>}
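
The sheet mapping shown above is also reachable through the writer's sheets attribute, without going through __dict__. A minimal sketch, assuming placeholder data and a temporary output path (both hypothetical):

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"value": [1, 2, 3]})  # placeholder data
path = os.path.join(tempfile.mkdtemp(), "file.xlsx")  # assumed output path

with pd.ExcelWriter(path) as writer:
    df.to_excel(writer, sheet_name="Sheet1")
    df.to_excel(writer, sheet_name="Sheet2")
    # writer.sheets maps each sheet name to the engine's worksheet object.
    created = list(writer.sheets)

print(created)
```

Checking this mapping before saving is a quick way to confirm that every expected tab was registered on the writer.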

edited Apr 6, 2020 at 1:12

answered Apr 6, 2020 at 1:03


5 Comments

Piyush Patil Over a year ago

Sorry, I forgot to include that: I have already defined the object. Check the edited code above.

Umar.H Over a year ago

The solution is clear, @error2007s: just move the save outside of the loop.

Piyush Patil Over a year ago

Nope, still the same issue: only one tab is getting created.

Piyush Patil Over a year ago

Also, if this were a loop issue, the tab in the Excel file should be Baltimore, MD, right? But the current Excel file has the Atlanta, GA tab. @Datanovice

Piyush Patil Over a year ago

OK, yours is the correct answer: writer.save() needs to be outside the loop. I was adding it in another place. It is working perfectly now.
