Exporting CSV from a nested Python dictionary

Question

I have a dictionary that looks like this:

{u'results': [{u'bucket': u'Table',
           u'data': [{u'Geography_dst': u'PE',
                      u'avg_bps': 5054970470.588235,
                      u'device': u'taco',
                      u'as': u'Telephone Company',
                      u'next_hop': u'Telephone Companu',
                      u'key': blah,
                      u'max_bps': 6613494000,
                      u'p95th_bps': 6280622000,
                      u'timeSeries': {}},

[truncated for brevity]

I can't figure out how to write this dictionary to a CSV file. I'm having trouble making a column out of each key of the dicts in the 'data' list and populating the rows with the corresponding values:

 device,as,next_hop,Geography_dst,max_bps,p95th_bps,avg_bps

(And yes, I'd prefer not to have 'key' or 'timeSeries' in the CSV at all, but I figure that will be apparent once I work out how to handle this data structure.)

Thanks!

You only want to save the 'data' part to a file and ignore everything above it?
– sandor, Commented Jan 24, 2017 at 7:15

niemmi · Accepted Answer · 2017-01-24 07:34:02Z

You can use csv.DictWriter, which writes fields from each dict according to the arguments passed to its constructor:

import csv
COLUMNS = 'device,as,next_hop,Geography_dst,max_bps,p95th_bps,avg_bps'

d = {
    u'results': [{
        u'bucket': u'Table',
        u'data': [{
            u'Geography_dst': u'PE',
            u'avg_bps': 5054970470.588235,
            u'device': u'taco',
            u'as': u'Telephone Company',
            u'next_hop': u'Telephone Companu',
            u'key': None,
            u'max_bps': 6613494000,
            u'p95th_bps': 6280622000,
            u'timeSeries': {}
        }]
    }]
}

with open('output.csv', 'w') as f:
    writer = csv.DictWriter(f, extrasaction='ignore', fieldnames=COLUMNS.split(','))
    writer.writeheader()
    rows = (row for bucket in d['results'] for row in bucket['data'])
    writer.writerows(rows)

Output in output.csv:

device,as,next_hop,Geography_dst,max_bps,p95th_bps,avg_bps
taco,Telephone Company,Telephone Companu,PE,6613494000,6280622000,5054970470.588235

In the code above, csv.DictWriter(f, extrasaction='ignore', fieldnames=COLUMNS.split(',')) creates a writer object. extrasaction='ignore' tells it to skip any keys that are not listed in fieldnames. fieldnames is the ordered list of keys to write from each dict. writeheader writes the column names as the first row; you can skip it if you don't need a header.
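To make the effect of extrasaction concrete, here is a minimal sketch that writes to an in-memory buffer instead of a file. The trimmed-down row and the variable names are made up for illustration; only csv.DictWriter and its extrasaction and fieldnames arguments come from the answer.

```python
import csv
import io

# Illustrative row: 'key' is not one of the columns we want.
row = {'device': 'taco', 'max_bps': 6613494000, 'key': 'blah'}
fields = ['device', 'max_bps']

# extrasaction='ignore': the unlisted 'key' entry is silently dropped.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=fields, extrasaction='ignore')
writer.writeheader()
writer.writerow(row)
ignored_output = buf.getvalue()

# Default (extrasaction='raise'): the same row is rejected with ValueError.
strict = csv.DictWriter(io.StringIO(), fieldnames=fields)
try:
    strict.writerow(row)
    raised = False
except ValueError:
    raised = True
```

With 'ignore', only the device and max_bps columns reach the buffer; without it, the extra 'key' entry makes writerow fail.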

rows is a generator expression that iterates over the results and over the objects within each result, yielding the dicts to write one by one. Finally, the generator is passed to writerows, which writes every dict it yields to the file.
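The same flattening can be written as a small reusable function. This is only a sketch: iter_rows, write_rows, and the two-bucket sample are hypothetical names and data, chosen to show that rows from several buckets end up in one CSV.

```python
import csv
import io

def iter_rows(payload):
    """Lazily yield every dict found in every bucket's 'data' list."""
    for bucket in payload['results']:
        for row in bucket['data']:
            yield row

def write_rows(payload, fieldnames, out):
    """Write the flattened rows to any file-like object."""
    writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction='ignore')
    writer.writeheader()
    writer.writerows(iter_rows(payload))

# Made-up sample with two buckets (not from the question).
sample = {'results': [
    {'bucket': 'A', 'data': [{'device': 'taco', 'max_bps': 1}]},
    {'bucket': 'B', 'data': [{'device': 'nacho', 'max_bps': 2},
                             {'device': 'salsa', 'max_bps': 3}]},
]}

out = io.StringIO()
write_rows(sample, ['device', 'max_bps'], out)
```

When writing to a real file on Python 3, the csv documentation recommends opening it with newline='', for example open('output.csv', 'w', newline='').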

sandor · Answer · 2017-01-24 07:30:19Z

0

I assumed that you want to save only the 'data' part, and that it consists of multiple dictionaries with the same keys. Here is code that converts and saves the 'data':

import csv

big_dict = {
    'results': [{
        'bucket': 'Table',
        'data': [{
            'Geography_dst': 'PE',
            'avg_bps': 5054970470.588235,
            'device': 'taco',
            'as': 'Telephone Company',
            'next_hop': 'Telephone Compan',
            'key': 'blah',
            'max_bps': 6613494000,
            'p95th_bps': 6280622000,
            'timeSeries': {}
        },
        {
            'avg_bps': 5054970470.588235,
            'device': 'taco',
            'as': 'Telephone Company',
            'next_hop': 'Telephone Compan',
            'key': 'blah',
            'p95th_bps': 6280622000,
            'timeSeries': {},
            'Geography_dst': 'XE',
            'max_bps': 6613494000
        }]
    }]
}

my_dicts = big_dict['results'][0]['data']

# Take the column order from the first dict and reuse it for every row.
fieldnames = list(my_dicts[0].keys())

# In Python 3, also pass newline='' to open(); in Python 2, use mode 'wb'.
with open('mycsvfile.csv', 'w') as f:
    # One writer for the whole file, so every row lines up with the header.
    w = csv.DictWriter(f, fieldnames)
    w.writeheader()
    for my_dict in my_dicts:
        w.writerow(my_dict)

Please note that this handles the case where every dict has the same keys, even if they are not in the same order: the writer is created once, and DictWriter looks up each value by field name. Creating a new writer per row from my_dict.keys() would instead write each row in that dict's own key order and misalign it with the header.
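A short sketch of why key order inside each dict does not matter once a single writer with a fixed fieldnames list is used. The records below are a shortened, illustrative version of the data above, and the output goes to an in-memory buffer rather than a file.

```python
import csv
import io

# Same keys in both dicts, but in a different order.
records = [
    {'device': 'taco', 'Geography_dst': 'PE', 'max_bps': 6613494000},
    {'max_bps': 6613494000, 'Geography_dst': 'XE', 'device': 'taco'},
]

fieldnames = ['device', 'Geography_dst', 'max_bps']

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=fieldnames)  # one writer for all rows
writer.writeheader()
writer.writerows(records)  # values are looked up by name, not by position

lines = buf.getvalue().splitlines()
```

Both rows come out in the header's column order, because DictWriter picks each value by its key.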

2 Comments

PM 2Ring Over a year ago

Generally, the order of keys in a dict is not predictable, and can change each time the program is run, so you shouldn't pass my_dict.keys() to csv.DictWriter. Instead, you should pass a fixed list of keys.

nigel222 Over a year ago

The order of keys in a dict is not random-per-run, but it might change if Python were upgraded. Alternatives to a fixed list: pass sorted(my_dict.keys()) to fix an ordering, or use collections.OrderedDict to ensure that keys are always in the order they are created (which is defined by the data source).
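As a rough sketch of the sorted() suggestion: collecting the keys of all records and sorting them gives a header that does not depend on dict ordering at all. The records and variable names here are illustrative. (Since Python 3.7, plain dicts are guaranteed to keep insertion order, which makes the OrderedDict alternative less necessary on current versions.)

```python
import csv
import io

# Illustrative records with the same keys in different orders.
records = [
    {'device': 'taco', 'avg_bps': 1.5, 'as': 'Telephone Company'},
    {'as': 'Telephone Company', 'device': 'nacho', 'avg_bps': 2.5},
]

# Union of all keys, sorted alphabetically for a deterministic column order.
fieldnames = sorted({key for record in records for key in record})

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=fieldnames)
writer.writeheader()
writer.writerows(records)

lines = buf.getvalue().splitlines()
```

The trade-off is that alphabetical order is stable but rarely the order a reader wants, which is why a hand-written list of columns is often the simpler choice.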