python pandas - TypeError when parsing JSON: string indices must be integers

Question

The records in the JSON file look like this (please note what "nutrients" looks like):

{
"id": 21441,
"description": "KENTUCKY FRIED CHICKEN, Fried Chicken, EXTRA CRISPY,
Wing, meat and skin with breading",
"tags": ["KFC"],
"manufacturer": "Kentucky Fried Chicken",
"group": "Fast Foods",
"portions": [
{
"amount": 1,
"unit": "wing, with skin",
"grams": 68.0
},
...
],
"nutrients": [
{
"value": 20.8,
"units": "g",
"description": "Protein",
"group": "Composition"
},
{'description': 'Total lipid (fat)',
'group': 'Composition',
'units': 'g',
'value': 29.2}
...
]
}

The following is the code from the book exercise*. It includes some wrangling and assembles the nutrients for each food into a single large table:

import pandas as pd
import json

db = pd.read_json("foods-2011-10-03.json")

nutrients = []

for rec in db:
    fnuts = pd.DataFrame(rec["nutrients"])
    fnuts["id"] = rec["id"]
    nutrients.append(fnuts)

However, I get the following error and I can't figure out why:

TypeError                                 Traceback (most recent call last)
<ipython-input-23-ac63a09efd73> in <module>()
      1 for rec in db:
----> 2     fnuts = pd.DataFrame(rec["nutrients"])
      3     fnuts["id"] = rec["id"]
      4     nutrients.append(fnuts)
      5

TypeError: string indices must be integers

*This is an example from the book Python for Data Analysis

Your JSON is not valid (and even when one corrects the quotes and removes the dots, it cannot be loaded by pd.read_json). Please submit data we can actually see your problem on.
– Amadan, Commented Aug 30, 2017 at 9:52
@Amadan, here is the link to the data: github.com/wesm/pydata-book/blob/master/ch07/…
– P. Prunesquallor, Commented Aug 30, 2017 at 9:56

Amadan · Accepted Answer · 2017-08-30 10:24:20Z

1

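for rec in db iterates over the DataFrame's column names, so each rec is a string such as "id", and rec["nutrients"] tries to index that string, hence the TypeError. To iterate over rows, use iterrows: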

for id, rec in db.iterrows():
    fnuts = pd.DataFrame(rec["nutrients"])
    fnuts["id"] = rec["id"]
    nutrients.append(fnuts)

This is a bit slow, though, because iterrows constructs a new Series for every row. itertuples is faster, but since you only care about two columns, iterating over those two columns directly is probably fastest:

for id, value in zip(db['id'], db['nutrients']):
    fnuts = pd.DataFrame(value)
    fnuts["id"] = id
    nutrients.append(fnuts)
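
A minimal sketch of why the original loop fails, plus the itertuples variant mentioned above. It uses a tiny made-up DataFrame in place of the real file; the sample records and the names labels, frames and all_nutrients are assumptions for illustration. The final pd.concat step is one plausible way to get the single large table the question describes:

```python
import pandas as pd

# Tiny stand-in for the food database (made-up values, two records only).
db = pd.DataFrame({
    "id": [1, 2],
    "nutrients": [
        [{"value": 20.8, "units": "g", "description": "Protein"}],
        [{"value": 29.2, "units": "g", "description": "Total lipid (fat)"},
         {"value": 1.5, "units": "g", "description": "Ash"}],
    ],
})

# Iterating over a DataFrame yields its column labels, not its rows.
labels = [rec for rec in db]
print(labels)  # ['id', 'nutrients']

# itertuples yields one namedtuple per row; index=False leaves out the row index.
frames = []
for row in db.itertuples(index=False):
    fnuts = pd.DataFrame(row.nutrients)
    fnuts["id"] = row.id
    frames.append(fnuts)

# Stack the per-food tables into one table with a fresh 0..n-1 index.
all_nutrients = pd.concat(frames, ignore_index=True)
print(all_nutrients)
```

Attribute access such as row.nutrients only works when the column names are valid Python identifiers, which happens to be the case here.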

answered Aug 30, 2017 at 10:24

Amadan


2 Comments

P. Prunesquallor Over a year ago

Thanks, that works fine! Have there been changes in how this iteration works since the book was written, or should this be added to the book's errata?

Amadan Over a year ago

Sorry, I don't know too much about Pandas history, and I haven't read the book.

zipa · Answer · 2017-08-30 10:01:42Z

0

The code works fine, but the JSON should look something like this for the code to work:

[{
"id": 21441,
"description": "KENTUCKY FRIED CHICKEN, Fried Chicken, EXTRA CRISPY,Wing, meat and skin with breading",
"tags": ["KFC"],
"manufacturer": "Kentucky Fried Chicken",
"group": "Fast Foods",
"portions": [
{"amount": 1,
"unit": "wing, with skin",
"grams": 68.0}],
"nutrients": [{
"value": 20.8,
"units": "g",
"description": "Protein",
"group": "Composition"
},
{"description": "Total lipid (fat)",
"group": "Composition",
"units": "g",
"value": 29.2}]}]

This is an example with only one record.
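
For comparison, a sketch of how the question's loop behaves when the data is a plain list of dicts, which is what json.loads (or json.load on a file) returns for a top-level JSON array. The inline JSON string is a shortened stand-in for the real file, and the idea that the loop was written with such a list in mind is an assumption, not something the question confirms:

```python
import json
import pandas as pd

# Shortened stand-in for the file contents: a JSON array with one record.
raw = """
[{"id": 21441,
  "nutrients": [
    {"value": 20.8, "units": "g", "description": "Protein", "group": "Composition"},
    {"value": 29.2, "units": "g", "description": "Total lipid (fat)", "group": "Composition"}
  ]}]
"""

# json.loads returns a list of dicts here, not a DataFrame.
db = json.loads(raw)

nutrients = []
for rec in db:  # each rec is a dict, so rec["nutrients"] is a valid lookup
    fnuts = pd.DataFrame(rec["nutrients"])
    fnuts["id"] = rec["id"]
    nutrients.append(fnuts)

print(type(db).__name__, len(nutrients), nutrients[0].shape)
```

With a list of dicts, for rec in db yields whole records, so the same loop body runs without the TypeError.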

answered Aug 30, 2017 at 10:01

zipa


P. Prunesquallor · Answer · 2017-08-30 10:54:02Z

0

Amadan answered the question, but I managed to solve it like this prior to seeing his answer:

for i in range(len(db)):
    rec = db.loc[i]
    fnuts = pd.DataFrame(rec["nutrients"])
    fnuts["id"] = rec["id"]
    nutrients.append(fnuts)
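
One caveat: db.loc[i] looks rows up by index label, so the loop above relies on the DataFrame having the default 0..n-1 index. A hedged sketch of a positional variant using iloc, run on a made-up two-record DataFrame whose index is deliberately not the default one:

```python
import pandas as pd

# Made-up data with string index labels, so db.loc[0] would raise a KeyError.
db = pd.DataFrame(
    {"id": [1, 2],
     "nutrients": [[{"value": 20.8, "description": "Protein"}],
                   [{"value": 29.2, "description": "Total lipid (fat)"}]]},
    index=["a", "b"],
)

nutrients = []
for i in range(len(db)):
    rec = db.iloc[i]  # positional lookup, independent of the index labels
    fnuts = pd.DataFrame(rec["nutrients"])
    fnuts["id"] = rec["id"]
    nutrients.append(fnuts)

print(len(nutrients))
```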

answered Aug 30, 2017 at 10:54

P. Prunesquallor
