I am working with a JSON dataset (Reddit data) that is about 5 GB in size. A single record looks like this:

{"subreddit":"languagelearning","parent_id":"t1_cn9nn8v","retrieved_on":1425123427,"ups":1,"author_flair_css_class":"","gilded":0,"author_flair_text":"Lojban (N)","controversiality":0,"subreddit_id":"t5_2rjsc","edited":false,"score_hidden":false,"link_id":"t3_2qulql","name":"t1_cnau2yv","created_utc":"1420074627","downs":0,"body":"I played around with the Japanese Duolingo for awhile and basically if you're not near Fluency you won't learn much of anything.\n\nAs was said below, the only one that really exists is Chineseskill.","id":"cnau2yv","distinguished":null,"archived":false,"author":"Pennwisedom","score":1}

I am using Python to list every "subreddit" value in this data, but I get a memory error. My Python code and the error are below.

import json
data=json.loads(open('/media/RC_2015-01').read())
for item in data:
   name = item.get("subreddit")
   print name

Traceback (most recent call last):
  File "name_python.py", line 4, in <module>
    data=json.loads(open('/media/RC_2015-01').read())
MemoryError

I understand that I get the MemoryError because I am trying to load a very large file all at once. Could anyone suggest a workaround?

1 Answer

You need to use an iterative parser such as ijson, which parses one record at a time rather than loading the entire file into memory.

Regarding your error message, make sure your data is valid JSON and has square brackets around the records. This structure will parse correctly:

[
 {...},
 {...}
]

whereas the following structure will raise the 'Additional data' exception

{....}
{....}
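
If the file turns out to have the second structure, with one complete JSON object per line and no enclosing brackets, a possible workaround is to decode it line by line with the standard `json` module. The following is only a minimal sketch under that assumption (the question does not confirm the file layout). It is written for Python 3, and `iter_subreddits` is a hypothetical helper name, not part of any library.

```python
import json


def iter_subreddits(path):
    """Yield the "subreddit" field of each record, reading one line at a time.

    Assumes every non-blank line of the file is one complete JSON object.
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines between records
            record = json.loads(line)  # only this one record is held in memory
            yield record.get("subreddit")  # None if the key is missing


# Example usage (path taken from the question):
# for name in iter_subreddits("/media/RC_2015-01"):
#     print(name)
```

Because the file object is iterated lazily, memory use depends on the longest single line rather than on the total file size.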
2 Comments

Thank you, Jaco! I am using the code below with ijson, but I get the error raise common.JSONError('Additional data'). Could you help me with this?

import ijson
file_name="/media/RC_2015-01"
with open(file_name) as file:
    parser = ijson.parse(file)
    for prefix, event, value in parser:
        if prefix=="subreddit":
            print value

See my updated answer. The error suggests that your data file is not valid JSON. Does your JSON start and end with square brackets?