json_normalize to split multiple json object: python

Question

I have a JSON response like this:

report_json = {
"orders": [
    {
        "id": '1',
        "email": "[email protected]",
        "location_id": 9,
        "line_items": [
            {
                "id": 5,
                "product_id": 6,
            }, {
                "id": 7,
                "product_id": 8,
            }
        ]
    }, {
        "id": '2',
        "email": "[email protected]",
        "location_id": 10,
        "line_items": {
            "id": 3,
            "product_id": 4,
        }
    },
]

}

And I want the output to look like this:

id  email              location_id  line_items_id  line_items_product_id
1   [email protected]  9            5              6
1   [email protected]  9            7              8
2   [email protected]  10           3              4

I want one row per object in line_items. My approach is to use the json_normalize feature of pandas. I am able to split the rows if I specify the column names in the code, as shown below.

pd.io.json.json_normalize(report_json['orders'], ['line_items'], ['id', 'email'], record_prefix='line_items_')

But there may be other columns apart from id and email. I want this to be dynamic, i.e. it should work with any number of keys without my defining them explicitly. Any help in this regard is highly appreciated.

jezrael · Accepted Answer · 2019-02-07 10:00:54Z

2

First, wrap any line_items value that is a single dictionary in a list, and collect all the other keys of each order:

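L = []     # orders with line_items always stored as a list
keys = []  # every top-level key except line_items (may contain duplicates)
for x in report_json['orders']:
    d = {}
    for k, v in x.items():
        # a single line item arrives as a dict, so wrap it in a list
        if isinstance(v, dict) and k == 'line_items':
            d[k] = [v]
        else:
            d[k] = v
        if k != 'line_items':
            keys.append(k)
    L.append(d)

print(L)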

[
    {
        "id": '1',
        "email": "[email protected]",
        "location_id": 9,
        "line_items": [
            {
                "id": 5,
                "product_id": 6,
            }, {
                "id": 7,
                "product_id": 8,
            }
        ]
    }, {
        "id": '2',
        "email": "[email protected]",
        "location_id": 10,
        "line_items": [{
            "id": 3,
            "product_id": 4,
        }]
    }
]

# In pandas 1.0 and later, use pd.json_normalize instead (this import path is deprecated)
from pandas.io.json import json_normalize

#get unique keys and pass to json_normalize
L1 = list(set(keys))
print (L1)
['location_id', 'id', 'email']

df = json_normalize(L, ['line_items'], L1, record_prefix='line_items_')
print(df)
   line_items_id  line_items_product_id  location_id id       email
0              5                      6            9  1  [email protected]
1              7                      8            9  1  [email protected]
2              3                      4           10  2  [email protected]
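
For newer pandas versions, the same idea can be sketched as one self-contained function. This is a minimal sketch, not the answer's original code: the helper name flatten_orders and the example.com addresses are placeholders introduced here, and it assumes every order has a line_items key and the same set of top-level keys. It uses a dict rather than a set to collect the meta keys, so the column order follows the order in which keys first appear.

```python
import pandas as pd

# Sample data shaped like the question's response; the e-mail
# addresses are placeholders, not values from the original post.
report_json = {
    "orders": [
        {
            "id": "1",
            "email": "a@example.com",
            "location_id": 9,
            "line_items": [
                {"id": 5, "product_id": 6},
                {"id": 7, "product_id": 8},
            ],
        },
        {
            "id": "2",
            "email": "b@example.com",
            "location_id": 10,
            # a single line item arrives as a dict, not a list
            "line_items": {"id": 3, "product_id": 4},
        },
    ]
}


def flatten_orders(orders, record_key="line_items"):
    """Return one row per line item, repeating all other order fields.

    Hypothetical helper: assumes every order contains `record_key`.
    """
    records = []
    meta = {}  # dict used as an insertion-ordered set of key names
    for order in orders:
        row = dict(order)  # shallow copy so the input is not modified
        if isinstance(row[record_key], dict):
            row[record_key] = [row[record_key]]
        records.append(row)
        meta.update(dict.fromkeys(k for k in row if k != record_key))
    return pd.json_normalize(
        records,
        record_path=[record_key],
        meta=list(meta),
        record_prefix=f"{record_key}_",
    )


df = flatten_orders(report_json["orders"])
print(df)
```

The record_prefix matters here: without it, the id of each line item would clash with the id of the order passed as metadata.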

edited Feb 7, 2019 at 10:00

answered Feb 7, 2019 at 9:15

jezrael


5 Comments

Nikhil Gupta Over a year ago

That is exactly the question: I don't want to explicitly define the column names (id, email). I want all the columns provided in the JSON response. There may be 1000 of them, which is why I don't want to define them in the code.

jezrael Over a year ago

@NikhilGupta - Can you check the edit? You can collect all the keys dynamically in a list and pass it to json_normalize.

Nikhil Gupta Over a year ago

Yes, it worked, thanks man. This will be the last question: what if in some cases line_items is a dictionary instead of a list? How do I handle that?

jezrael Over a year ago

@NikhilGupta - So it is a list when there are multiple values and a dict when there is one value, as in the question, and not as in my data?

Nikhil Gupta Over a year ago

Thanks buddy, it worked. Just a small update: the keys variable is undefined. It needs to be initialized, and k should be appended to it in the for loop. Thank you so much.
