-1

Trying to convert a nested json to csv

json data

{"LOG_28MAY":[{"pk":"22","venue_name":"manchester","venue_code":"03839",
"fields":{"codename":"L01","name":"Can add log entry","content_type":"8","DAILY_LIST":["LOG_ID:12309","HOST_ID:1293123"]}},{"pk":"23","venue_name":"Birmingham","fields":{"codename":"Edit Log entry","content_type":"9","DAILY_LIST":["LOG_ID:230912309","HOST_ID:2494569","LOG_LOCATION_ID:20190627"]}}]}
1
  • Please edit your question to also show the output that you expect to get in the output.csv file for that JSON data. Commented Jun 6, 2019 at 17:58

1 Answer 1

1
import json
import csv

def get_leaves(item, key=None):
    if isinstance(item, dict):
        leaves = {}
        for i in item.keys():
            leaves.update(get_leaves(item[i], i))
        return leaves
    elif isinstance(item, list):
        leaves = {}
        for i in item:
            leaves.update(get_leaves(i, key))
        return leaves
    else:
        return {key : item}


with open('json.txt') as f_input:
    json_data = json.load(f_input)['LOG_28MAY']

# First parse all entries to get the complete fieldname list
fieldnames = set()

for entry in json_data:
    fieldnames.update(get_leaves(entry).keys())

with open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=sorted(fieldnames))
    csv_output.writeheader()
    csv_output.writerows(get_leaves(entry) for entry in json_data)

Output:

fields.DAILY_LIST.HOST_ID,fields.DAILY_LIST.LOG_ID,fields.DAILY_LIST.LOG_LOCATION_ID,fields.codename,fields.content_type,fields.name,pk,venue_code,venue_name
1293123,12309,,L01,8,Can add log entry,22,03839,manchester
2494569,230912309,20190627,Edit Log entry,9,,23,,Birmingham
Sign up to request clarification or add additional context in comments.

4 Comments

@MartinEvans gave helpful directions in this post stackoverflow.com/questions/41180960/…
with the root tag the elements of the array are not retrieved. i.e elements of DAILY LIST are not fully populated. 1 missing from first row and 2 missing from second row. output below
DAILY_LIST,codename,content_type,name,pk,venue_code,venue_name HOST_ID:1293123,L01,8,Can add log entry,22,03839,manchester LOG_LOCATION_ID:20190627,Edit Log entry,9,,23,,Birmingham
if you look closely DAILY_LIST's JSON Objects format is incorrect.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.