I am working with a 5 GB JSON dataset of Reddit comments. Each record in the file looks like this:
{"subreddit":"languagelearning","parent_id":"t1_cn9nn8v","retrieved_on":1425123427,"ups":1,"author_flair_css_class":"","gilded":0,"author_flair_text":"Lojban (N)","controversiality":0,"subreddit_id":"t5_2rjsc","edited":false,"score_hidden":false,"link_id":"t3_2qulql","name":"t1_cnau2yv","created_utc":"1420074627","downs":0,"body":"I played around with the Japanese Duolingo for awhile and basically if you're not near Fluency you won't learn much of anything.\n\nAs was said below, the only one that really exists is Chineseskill.","id":"cnau2yv","distinguished":null,"archived":false,"author":"Pennwisedom","score":1}
I am using Python to list every "subreddit" value in this data, but I am getting a memory error. Below are my code and the error.
import json

data = json.loads(open('/media/RC_2015-01').read())
for item in data:
    name = item.get("subreddit")
    print name
Traceback (most recent call last):
  File "name_python.py", line 4, in
    data=json.loads(open('/media/RC_2015-01').read())
MemoryError
What I do know is that I am trying to load a very large file into memory all at once, which is why I get the MemoryError. Could anyone suggest a workaround?
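One approach I have been considering, assuming each line of the dump is a standalone JSON object (the JSON Lines layout the Reddit dumps appear to use), is to parse the file one line at a time so only a single comment is ever held in memory. A minimal sketch (the function name is my own):

```python
import json

def iter_subreddits(path):
    """Yield the "subreddit" field of each record in a file
    containing one JSON object per line."""
    with open(path) as f:
        # Iterating over the file object reads one line at a time,
        # so the full 5 GB file is never loaded into memory.
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line).get("subreddit")

# Usage with the path from my question:
# for name in iter_subreddits('/media/RC_2015-01'):
#     print(name)
```

Would something like this be the right direction, or is there a better-suited tool for files this size?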