Write a list of nested dictionaries to an Excel file in Python

Question

I have a list of nested dictionaries that looks like this:

[{'posts': {'item_1': 1,
            'item_2': 8,
            'item_3': 105,
            'item_4': 324,
            'item_5': 313}},
 {'edits': {'item_1': 1,
            'item_2': 8,
            'item_3': 61,
            'item_4': 178,
            'item_5': 163}},
 {'views': {'item_1': 2345,
            'item_2': 330649,
            'item_3': 12920402,
            'item_4': 46199102,
            'item_5': 43094955}}]

I would like to write it to an Excel file in this format:

+--------+-------+-------+-----------+
|        | posts | edits |   views   |
+--------+-------+-------+-----------+
| item_1 |     1 |     1 |      2345 |
| item_2 |     8 |     8 |    330649 |
| item_3 |   105 |    61 |  12920402 |
| item_4 |   324 |   178 |  46199102 |
| item_5 |   313 |   163 |  43094955 |
+--------+-------+-------+-----------+

I am using the xlsxwriter library and have tried the following, and variations on it, without success:

for item in data:
    for col_name, data in item.iteritems():
        col += 1
        worksheet.write(row, col, col_name)
        for row_name, row_data in data.iteritems():
            col += 1
            worksheet.write(row, col, row_name)
            worksheet.write(row + 1, col, row_data)

I'm wondering whether it makes sense to rework my nested dictionary object, or whether it is possible to write it to Excel in its current form.

When I say "without success", I mean that I can get it to write certain things to the Excel file, like the column names, the row names, or the data, but I am unable to get it to write the layout pictured above. I'm not getting errors; I suspect I just don't know how to unpack this object properly to loop through it. With the code above, I get a mix of row and column names on row 1 and all of the values on row 2.

My output for the code above is:

+--+-------+--------+--------+--------+--------+--------+-------+--------+--------+--------+--------+--------+-------+----------+----------+--------+----------+--------+
|  | posts | item_4 | item_5 | item_2 | item_3 | item_1 | edits | item_4 | item_5 | item_2 | item_3 | item_1 | views |  item_4  |  item_5  | item_2 |  item_3  | item_1 |
+--+-------+--------+--------+--------+--------+--------+-------+--------+--------+--------+--------+--------+-------+----------+----------+--------+----------+--------+
|  |       |    324 |    313 |      8 |    105 |      1 |       |    178 |    163 |      8 |     61 |      1 |       | 46199102 | 43094955 | 330649 | 12920402 |   2345 |
+--+-------+--------+--------+--------+--------+--------+-------+--------+--------+--------+--------+--------+-------+----------+----------+--------+----------+--------+

Can you be more specific than "without success"? See minimal reproducible example
– jonrsharpe, Commented Sep 8, 2015 at 17:21
Have you considered processing the dictionary to look more like the output you want to start with (i.e. {'item1': {'posts': 8, 'edits': 1, ...}, ...})?
– jonrsharpe, Commented Sep 8, 2015 at 17:36

Martin Evans · Accepted Answer · 2015-09-09 14:48:30Z

2

As an alternative, this could be solved using csv as follows:

import csv
import itertools

nested = [
    {'posts': {'item_1': 1, 'item_2': 8, 'item_3': 105, 'item_4': 324, 'item_5': 313,}},
    {'edits': {'item_1': 1, 'item_2': 8, 'item_3': 61, 'item_4': 178, 'item_5': 163}},
    {'views': {'item_1': 2345, 'item_2': 330649, 'item_3': 12920402, 'item_4': 46199102, 'item_5': 43094955}}]

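# Each element of nested has exactly one key, which becomes a column heading.
headings = [next(iter(d)) for d in nested]
entries = [sorted(nested[index][col].items()) for index, col in enumerate(headings)]

# Python 3: open in text mode with newline='' (Python 2 used 'wb').
with open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    csv_output.writerow(['items'] + headings)

    # zip_longest (izip_longest in Python 2) pads shorter columns with '<n/a>' placeholders.
    for cols in itertools.zip_longest(*entries, fillvalue=['<n/a>']*len(entries[0])):
        csv_output.writerow([cols[0][0]] + [col[1] for col in cols])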

This would give you output.csv as follows:

items,posts,edits,views
item_1,1,1,2345
item_2,8,8,330649
item_3,105,61,12920402
item_4,324,178,46199102
item_5,313,163,43094955
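
If some of the inner dictionaries might be missing items, one possible variation is to build the rows from the union of all item names rather than zipping sorted lists position by position. The following is only a sketch under that assumption; `rows_from_nested` and the `missing` placeholder are hypothetical names, not part of the answer above, and it writes to an in-memory buffer instead of a file.

```python
import csv
import io


def rows_from_nested(nested, missing='<n/a>'):
    """Build a header row plus one row per item from a list of single-key dicts."""
    # Each element looks like {'posts': {'item_1': 1, ...}}; take its only key.
    headings = [next(iter(d)) for d in nested]
    columns = [d[h] for d, h in zip(nested, headings)]
    # Union of all item names, so a row exists even if one column lacks that item.
    item_names = sorted(set().union(*columns))
    rows = [['items'] + headings]
    for name in item_names:
        rows.append([name] + [col.get(name, missing) for col in columns])
    return rows


# Small sample in which 'edits' has no entry for item_2.
nested = [
    {'posts': {'item_1': 1, 'item_2': 8}},
    {'edits': {'item_1': 1}},
]

buffer = io.StringIO()
csv.writer(buffer).writerows(rows_from_nested(nested))
```

Because each value is looked up by item name, a missing item only affects its own cell. Note that `sorted` orders names as strings, so `item_10` would sort before `item_2`.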

edited Sep 9, 2015 at 14:48

answered Sep 8, 2015 at 17:38


7 Comments

metersk Over a year ago

Thanks for this! It works almost perfectly, but is there any way to shift the headings to the right by one column? Currently the headings overlap with the row names.

metersk Over a year ago

I am trying this on a larger data set where there are more rows than columns, and in this case some of the rows seem to be getting left out of the csv writing. Do you think this is a side effect of your code above?

Martin Evans Over a year ago

Yes, the original version was hard coded to your 3 sample columns. I have updated it to build the columns automatically, so that should fix it, providing the overall structure remains the same.

metersk Over a year ago

In cases where I have more rows than columns, it still seems to be leaving rows out. I think the enumeration over the keys causes this. Is there another method that does not make the row creation reliant on the number of columns?

Martin Evans Over a year ago

I've not been able to reproduce it, can you give me a link to some sample data that causes it?


David Zemens · Accepted Answer · 2015-09-08 18:07:51Z

Presently you have a separate dict for each of posts, edits, and views, and each of them is keyed to your "items", which seems redundant.

Alternatively, create a single dictionary keyed to your "items", and have the value of each item be a dictionary of posts, edits, and views, like:

items = {'item_1': {'posts': 1, 'edits': 0, 'views': 2345},
         'item_2': {'posts': 2, 'edits': 8, 'views': 330649}}

This way you can simply refer to items['item_2']['edits'] (which should yield 8) or items['item_1']['views'] (which should yield 2345), etc.

In your case, that would be something like:

# write the headers -- this could be refined
headers = ['posts', 'edits', 'views']
row = 0
for col, header in enumerate(headers, start=1):
    worksheet.write(row, col, header)

# write the data, one row per item:
for itm in sorted(items):
    row += 1
    worksheet.write(row, 0, itm)
    # look up each value by header so it lands in the matching column
    for col, header in enumerate(headers, start=1):
        worksheet.write(row, col, items[itm][header])
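
To get from the list in the question to a dictionary keyed by item, a small pivot step is needed first. Below is a rough sketch of one way to do it; `pivot`, `write_table` and `FakeWorksheet` are hypothetical names introduced only for illustration, and `FakeWorksheet` merely records cells so the example runs without xlsxwriter installed.

```python
def pivot(nested):
    """Turn [{'posts': {'item_1': 1}}, ...] into {'item_1': {'posts': 1}, ...}."""
    items = {}
    for column in nested:
        for col_name, values in column.items():
            for item_name, value in values.items():
                items.setdefault(item_name, {})[col_name] = value
    return items


def write_table(worksheet, items, headers):
    """Write a header row, then one row per item, via worksheet.write(row, col, value)."""
    for col, header in enumerate(headers, start=1):
        worksheet.write(0, col, header)
    for row, item_name in enumerate(sorted(items), start=1):
        worksheet.write(row, 0, item_name)
        for col, header in enumerate(headers, start=1):
            # Leave the cell empty if this item has no value for the header.
            if header in items[item_name]:
                worksheet.write(row, col, items[item_name][header])


class FakeWorksheet:
    """Hypothetical stand-in for a worksheet; it only records what was written."""

    def __init__(self):
        self.cells = {}

    def write(self, row, col, value):
        self.cells[(row, col)] = value


nested = [
    {'posts': {'item_1': 1, 'item_2': 8}},
    {'edits': {'item_1': 1, 'item_2': 8}},
    {'views': {'item_1': 2345, 'item_2': 330649}},
]

sheet = FakeWorksheet()
write_table(sheet, pivot(nested), ['posts', 'edits', 'views'])
```

With xlsxwriter itself, the worksheet would presumably come from `xlsxwriter.Workbook('data.xlsx').add_worksheet()`, and the workbook has to be closed with `close()` before the file is complete.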

Chankey Pathak · Accepted Answer · 2017-08-23 07:27:51Z

import pandas as pd

data = [{'posts': {'item_1': 1,
                   'item_2': 8,
                   'item_3': 105,
                   'item_4': 324,
                   'item_5': 313, }
         },
        {'edits': {'item_1': 1,
                   'item_2': 8,
                   'item_3': 61,
                   'item_4': 178,
                   'item_5': 163}
         },
        {'views': {'item_1': 2345,
                   'item_2': 330649,
                   'item_3': 12920402,
                   'item_4': 46199102,
                   'item_5': 43094955}
         }]

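# Each element becomes a one-column DataFrame indexed by item name;
# concatenating along axis=1 places the columns side by side.
final_df = pd.concat([pd.DataFrame.from_dict(d) for d in data], axis=1)

print(final_df)

# Writing .xlsx requires an Excel engine such as openpyxl to be installed.
final_df.to_excel('data.xlsx')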
