I am working with a non-nested JSON file of Reddit data and trying to convert it to a CSV file using Python. Not every row has the same fields, and I keep getting this error:
JSONDecodeError: Extra data: line 2 column 1
Here is the code:
import csv
import json
import os
os.chdir('c:\\Users\\Desktop')
infile = open("data.json", "r")
outfile = open("outputfile.csv", "w")
writer = csv.writer(outfile)
for row in json.loads(infile.read()):
    writer.writerow(row)
Here are a few lines from the data:
{"author":"i_had_an_apostrophe","body":"\"It's not your fault.\"","author_flair_css_class":null,"link_id":"t3_5c0rn0","subreddit":"AskReddit","created_utc":1478736000,"subreddit_id":"t5_2qh1i","parent_id":"t1_d9t3q4d","author_flair_text":null,"id":"d9tlp0j"}
{"id":"d9tlp0k","author_flair_text":null,"parent_id":"t1_d9tame6","link_id":"t3_5c1efx","subreddit":"technology","created_utc":1478736000,"subreddit_id":"t5_2qh16","author":"willliam971","body":"9/11 inside job??","author_flair_css_class":null}
{"created_utc":1478736000,"subreddit_id":"t5_2qur2","link_id":"t3_5c44bz","subreddit":"excel","author":"excelevator","author_flair_css_class":"points","body":"Have you tried stepping through the code to analyse the values at each step?\n\n","author_flair_text":"442","id":"d9tlp0l","parent_id":"t3_5c44bz"}
{"created_utc":1478736000,"subreddit_id":"t5_2tycb","link_id":"t3_5c384j","subreddit":"OldSchoolCool","author":"10minutes_late","author_flair_css_class":null,"body":"**Thanks Hillary**","author_flair_text":null,"id":"d9tlp0m","parent_id":"t3_5c384j"}
My plan is to use every field that appears in the data as the CSV header and, where a row has no value for a particular field, fill it with NA.
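
For reference, here is a minimal sketch of that plan. It assumes the file holds one JSON object per line, as in the sample above; in that case "Extra data: line 2 column 1" comes from json.loads being handed the whole file at once, since it accepts only a single JSON document. The function name jsonl_to_csv and its missing parameter are made up for illustration. Each line is parsed separately, the field names are collected in first-seen order, and csv.DictWriter fills absent fields through its restval argument.

```python
import csv
import json


def jsonl_to_csv(in_path, out_path, missing="NA"):
    """Convert a file with one JSON object per line into a CSV file.

    Hypothetical helper: the name and the `missing` parameter are
    illustrative, not part of any library.
    """
    rows = []
    fieldnames = []  # every key seen so far, in first-seen order
    with open(in_path, "r", encoding="utf-8") as infile:
        for line in infile:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            row = json.loads(line)  # parse one object at a time
            rows.append(row)
            for key in row:
                if key not in fieldnames:
                    fieldnames.append(key)

    # newline="" is what the csv module documentation recommends for output files
    with open(out_path, "w", newline="", encoding="utf-8") as outfile:
        # restval is written for any field that a row does not contain
        writer = csv.DictWriter(outfile, fieldnames=fieldnames, restval=missing)
        writer.writeheader()
        writer.writerows(rows)
    return fieldnames
```

Calling jsonl_to_csv("data.json", "outputfile.csv") would then write one CSV row per input line. Note that restval only covers keys that are absent from a row: a key that is present with a null value (such as author_flair_text in the sample) is written as an empty string, so those would need to be replaced separately if NA is wanted there too. The sketch also keeps all rows in memory, which may not suit a very large dump.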