Python - iterate through nested json and save values

Question

I have a nested JSON (API) website which I want to parse, saving items to a file (using the Scrapy framework).

I want to access each sub-element of the top-level elements, which are in the following format:

0   {…}
1   {…}
2   {…}
3   {…}
4   {…}
5   {…}
6   {…}
7   {…}
8   {…}
9   {…}
10  {…}

If I expand element 0, I get the following values, where {…} expands further:

id  6738
date    "2018-06-14T09:38:51"
date_gmt    "2018-06-14T09:38:51"
guid    
     rendered   "https://example.com"
modified    "2019-03-19T20:43:50"
modified_gmt    "2019-03-19T20:43:50"

What it looks like in reality

How do I access each element in turn (first 0, then 1, then 2, ... up to a total of 350) and grab the value of, for example,

guid   
    rendered "https://example.com"

and save it to an item?

What I have:

       results = json.loads(response.body_as_unicode())
       item = DataItem()
       for var in results:
           item['guid'] = results["guid"]
       yield item

This fails with

TypeError: list indices must be integers, not str

I know that I can access it with

item['guid'] = results[0]["guid"]

But this only gives me index [0] of the whole list, and I want to iterate through all of the indexes. How do I pass the index number into the list?

"But this only gives me [0] index of the whole list": how about replacing 0 with something, uhm, like a variable? Or the length?
– DirtyBit, Commented Mar 21, 2019 at 14:33

Mojtaba Kamyabi · Accepted Answer · 2019-03-21 15:46:08Z

1

Replace results["guid"] in your for loop with var["guid"]:

for var in results:
    item['guid'] = var["guid"]
    # do whatever you want with item['guid'] here

If you can access guid with results[0]["guid"], it means you have a list of dictionaries, each of which contains a key named guid. In your for loop you index results (which is the list) instead of var (which holds the current dictionary on each iteration). That raises the TypeError, because list indices must be integers, not strings (like "guid").
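To make the difference concrete, here is a minimal sketch. The sample data is made up to mirror the structure shown in the question (a list of dictionaries, each with a nested guid / rendered value); the real API response will have more fields. It shows the failing lookup on the list, the working lookup on the loop variable, and enumerate() for the case where the index itself is also needed.

```python
import json

# Hypothetical sample that mirrors the structure shown in the question.
sample = """
[
  {"id": 1, "guid": {"rendered": "https://example.com/a"}},
  {"id": 2, "guid": {"rendered": "https://example.com/b"}}
]
"""
results = json.loads(sample)  # a list of dictionaries

# Indexing the list with a string fails, as in the question.
try:
    results["guid"]
except TypeError as exc:
    error_message = str(exc)

# Indexing the loop variable (one dictionary per iteration) works.
rendered = []
for var in results:
    rendered.append(var["guid"]["rendered"])

# If the position is needed as well, enumerate() supplies it.
indexed = []
for index, var in enumerate(results):
    indexed.append((index, var["guid"]["rendered"]))
```

There is no need to pass an index into results by hand: the for loop already hands over each element in order.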

UPDATE: If you want to save each var["guid"], you can collect them in a dictionary like this:

# Collect every guid in a list stored under the "guid" key
guid_holder = {"guid": []}
for var in results:
    guid_holder["guid"].append(var["guid"])
# Print each collected guid
for guid in guid_holder["guid"]:
    print(guid)

Now guid_holder holds all of the guid values.
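One more thing worth checking in the question's code: item is created once before the loop and yield item sits after it, so each iteration overwrites item['guid'] and only the last element is ever yielded. A rough sketch of yielding one item per element follows. It uses a plain dict as a stand-in for DataItem and a hypothetical parse_guids function in place of the spider callback, so it runs without Scrapy; in a real spider the same loop body would go inside the callback.

```python
import json


def parse_guids(body_text):
    """Yield one item per element of the top-level JSON list."""
    results = json.loads(body_text)
    for var in results:
        item = {}  # stand-in for DataItem(); a fresh item on every iteration
        item["guid"] = var["guid"]["rendered"]
        yield item  # inside the loop, so every element produces an item


# Hypothetical response body with the same shape as in the question.
sample_body = (
    '[{"guid": {"rendered": "https://example.com/a"}},'
    ' {"guid": {"rendered": "https://example.com/b"}}]'
)
items = list(parse_guids(sample_body))
```

With the yield inside the loop, the caller receives as many items as there are elements in the list.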

edited Mar 21, 2019 at 15:46

answered Mar 21, 2019 at 14:40


7 Comments

Alex16237 Over a year ago

I've done this. It behaves weirdly. It gives only one result, from the 10th index. results[0]["guid"] behaves correctly and prints the guid for element [0]. results holds the whole JSON page in a variable; I can print it too using print(results). I don't know how to iterate through every index [0, 1, 2, 3, ...] and get the guid for each.

Mojtaba Kamyabi Over a year ago

@Alex16237 What exactly does results contain? Please add it as an example to your question.

Alex16237 Over a year ago

I've posted a picture (edited post). Can't get formatting right with this one as there are too many elements.

Mojtaba Kamyabi Over a year ago

@Alex16237 I updated my answer; see the UPDATE section for saving all elements.

Alex16237 Over a year ago

Unfortunately, it doesn't work. Maybe I phrased it badly. How do I pass a variable or the length of an array as the index inside the loop? I think this is how I would solve this problem, i.e. item['guid'] = results[*]["guid"], where * is a variable passed by the loop. results reads the whole page; if I print it, I get the full parsed JSON page.
