I have a dictionary that looks like this:

{u'results': [{u'bucket': u'Table',
           u'data': [{u'Geography_dst': u'PE',
                      u'avg_bps': 5054970470.588235,
                      u'device': u'taco',
                      u'as': u'Telephone Company',
                      u'next_hop': u'Telephone Companu',
                      u'key': blah,
                      u'max_bps': 6613494000,
                      u'p95th_bps': 6280622000,
                      u'timeSeries': {}},

[truncated for brevity]

I can't figure out how to write this dictionary out as a CSV. I am having trouble making a column out of each key of the dicts in the 'data' list and populating the rows from the corresponding values:

 device,as,nexthop,Geography_dst,max_bps,p95th_bps,avg_bps

(And yeah, I'd prefer not to have the 'key' or 'timeSeries' entries in the CSV at all, but I figure that will become apparent once I work out how to handle this data structure.)

Thanks!

  • You only want to save the 'data' part into a file, and ignore what is above? Commented Jan 24, 2017 at 7:15
  • yep, that is correct. Commented Jan 25, 2017 at 1:00

2 Answers

You can use csv.DictWriter, which writes fields from each dict according to the arguments passed to its constructor:

import csv
COLUMNS = 'device,as,next_hop,Geography_dst,max_bps,p95th_bps,avg_bps'

d = {
    u'results': [{
        u'bucket': u'Table',
        u'data': [{
            u'Geography_dst': u'PE',
            u'avg_bps': 5054970470.588235,
            u'device': u'taco',
            u'as': u'Telephone Company',
            u'next_hop': u'Telephone Companu',
            u'key': None,
            u'max_bps': 6613494000,
            u'p95th_bps': 6280622000,
            u'timeSeries': {}
        }]
    }]
}

with open('output.csv', 'w') as f:
    writer = csv.DictWriter(f, extrasaction='ignore', fieldnames=COLUMNS.split(','))
    writer.writeheader()
    rows = (row for bucket in d['results'] for row in bucket['data'])
    writer.writerows(rows)

Output in output.csv:

device,as,next_hop,Geography_dst,max_bps,p95th_bps,avg_bps
taco,Telephone Company,Telephone Companu,PE,6613494000,6280622000,5054970470.588235

In the code above, csv.DictWriter(f, extrasaction='ignore', fieldnames=COLUMNS.split(',')) creates a writer object. extrasaction='ignore' tells it to silently skip any keys that are not listed in fieldnames (the default, 'raise', raises a ValueError instead). fieldnames is the ordered list of keys to write from each dict. writeheader writes the column names as the first row; skip it if you don't need a header.

rows is a generator expression that iterates over every bucket in results and every dict in that bucket's data list, yielding the dicts to write one at a time. Finally, the generator is passed to writerows, which writes every dict it yields to the file.
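To make the effect of extrasaction concrete, here is a minimal sketch that writes to an in-memory buffer instead of a file. The helper name write_rows and the trimmed-down sample row are hypothetical and used only for illustration; the csv calls themselves are the standard-library ones used above.

```python
import csv
import io

# Hypothetical, trimmed-down row: two wanted columns plus two unwanted keys.
row = {'device': 'taco', 'max_bps': 6613494000, 'key': None, 'timeSeries': {}}
fieldnames = ['device', 'max_bps']


def write_rows(rows, fieldnames, extrasaction):
    """Write rows to an in-memory CSV and return the resulting text."""
    buf = io.StringIO(newline='')
    writer = csv.DictWriter(buf, fieldnames=fieldnames, extrasaction=extrasaction)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# 'ignore' drops 'key' and 'timeSeries' without complaint.
ignored = write_rows([row], fieldnames, 'ignore')

# 'raise' (the default) rejects rows that carry keys outside fieldnames.
try:
    write_rows([row], fieldnames, 'raise')
    raised = False
except ValueError:
    raised = True
```

With 'ignore' the buffer holds only the header and the two requested columns; with 'raise' the same row triggers a ValueError, which is useful when unexpected keys should be treated as a data error.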

I assumed that you want to save only the 'data' part and that it consists of multiple dictionaries with the same keys. The code below converts and saves the 'data':

import csv

big_dict = {
    'results': [{
        'bucket': 'Table',
        'data': [{
            'Geography_dst': 'PE',
            'avg_bps': 5054970470.588235,
            'device': 'taco',
            'as': 'Telephone Company',
            'next_hop': 'Telephone Compan',
            'key': 'blah',
            'max_bps': 6613494000,
            'p95th_bps': 6280622000,
            'timeSeries': {}
        },
        {
            'avg_bps': 5054970470.588235,
            'device': 'taco',
            'as': 'Telephone Company',
            'next_hop': 'Telephone Compan',
            'key': 'blah',
            'p95th_bps': 6280622000,
            'timeSeries': {},
            'Geography_dst': 'XE',
            'max_bps': 6613494000
        }]
    }]
}

my_dicts = big_dict['results'][0]['data']

# Python 3: newline='' lets the csv module control line endings itself.
# On Python 2, open the file with mode 'wb' instead.
with open('mycsvfile.csv', 'w', newline='') as f:
    # Create the writer once so every row follows the header's column order.
    w = csv.DictWriter(f, fieldnames=list(my_dicts[0].keys()))
    w.writeheader()
    for my_dict in my_dicts:
        w.writerow(my_dict)

Please note that this assumes every dict has the same keys, though not necessarily in the same order: the writer is created once from the first dict's keys, so each row is written in the header's column order.
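As a quick check of the key-order point, the sketch below writes two dicts whose keys appear in different orders through a single DictWriter. The sample rows and the in-memory buffer are assumptions made for illustration; the point is only that DictWriter looks values up by field name rather than by position.

```python
import csv
import io

# Same keys in both dicts, deliberately listed in a different order.
rows = [
    {'device': 'taco', 'Geography_dst': 'PE', 'max_bps': 6613494000},
    {'max_bps': 6613494000, 'Geography_dst': 'XE', 'device': 'taco'},
]

buf = io.StringIO(newline='')
# One writer, one fieldnames list: this fixes the column order for all rows.
writer = csv.DictWriter(buf, fieldnames=['device', 'Geography_dst', 'max_bps'])
writer.writeheader()
writer.writerows(rows)

lines = buf.getvalue().splitlines()
```

Both data rows come out as device, Geography_dst, max_bps, matching the header. Creating a new writer per row from that row's own keys would not give this guarantee.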

2 Comments

Generally, the order of keys in a dict is not predictable and can change each time the program is run, so you shouldn't pass my_dict.keys() to csv.DictWriter. Instead, pass a fixed list of keys.
The order of keys in a dict is not random-per-run, but it might change if Python were upgraded. Alternatives to a fixed list: pass sorted(my_dict.keys()) to fix an ordering, or use collections.OrderedDict to ensure that keys are always in the order they are created (which is defined by the data source).
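A minimal sketch of the sorted() alternative mentioned in the comment above, assuming a small hypothetical sample row: sorting the keys gives a column order that does not depend on how the dict was built.

```python
import csv
import io

# Hypothetical sample row; only the key names matter here.
my_dict = {'device': 'taco', 'max_bps': 6613494000, 'Geography_dst': 'PE'}

# sorted() compares strings by code point, so uppercase 'G' sorts before 'd'.
fieldnames = sorted(my_dict.keys())

buf = io.StringIO(newline='')
writer = csv.DictWriter(buf, fieldnames=fieldnames)
writer.writeheader()
writer.writerow(my_dict)

header = buf.getvalue().splitlines()[0]
```

Sorting is deterministic but rarely the order a reader wants, so an explicit list of column names is usually the clearer choice when the columns are known in advance.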
