Convert nested json response to dataframe in Python pandas

Question

I have the following JSON response.

[{'a': 
     [{'b': 1,
       'c': 'ok',
       'result': 
               [{'1': '2',
                 '3': 4,
                 '5': 6}]},
      {'b': 11,
       'c': 'ok1',
       'result': 
               [{'1': '21',
                 '3': 41,
                 '5': 61},
                {'1': '211',
                 '3': '411'}]}],
  'Id': 'd0',
  'col': 16}]

I want to normalize this into a dataframe in Python pandas. I know about json_normalize and have seen a couple of other SO posts on the same topic. However, my data is more deeply nested than in those posts, and I can't get my head around it.

What I am expecting in my output is as follows:

===============================================================
a.b | a.c | a.result.1 | a.result.3 | a.result.5 |  Id  | col 
===============================================================
1   | ok  |    2       |      4     |      6     |  d0  |  16
11  | ok1 |    21      |      41    |      61    |  d0  |  16
11  | ok1 |    211     |      411   |      None  |  d0  |  16

Any help would be greatly appreciated! Stuck with this for more than a day now!

Thank you!!

Yes, at a glance it's really hard to tell what the structure is. (Or maybe it's just me.)
– Aaron Brock, Commented May 16, 2018 at 17:48
@AaronN.Brock: I made certain changes to it now. Is it readable now?
– Gingerbread, Commented May 16, 2018 at 17:55

Aaron Brock · Accepted Answer · 2018-05-16 18:01:28Z

In my experience, pandas doesn't have a good way to handle lists in dicts in lists. But you can handle lists in dicts using the record_path argument of json_normalize. So this is not pretty, and it's a rather brittle way of solving the problem, but here's a solution that loops over the outer list and normalizes each entry's 'a' list separately:

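import pandas as pd
# pandas >= 1.0; on older versions: from pandas.io.json import json_normalize
from pandas import json_normalize

# The nested response from the question
data = [{'a': [{'b': 1, 'c': 'ok',
                'result': [{'1': '2', '3': 4, '5': 6}]},
               {'b': 11, 'c': 'ok1',
                'result': [{'1': '21', '3': 41, '5': 61},
                           {'1': '211', '3': '411'}]}],
         'Id': 'd0', 'col': 16}]
frames = []

for entry in data:
    # One row per item in each 'result' list, with 'b' and 'c' carried along
    frame = json_normalize(
        entry['a'],  # use the current entry, not data[0]
        record_path='result',
        record_prefix='a.result.',
        meta=['b', 'c'],
        meta_prefix='a.'
    )

    # Attach the top-level fields to every row of this entry
    frame['Id'] = entry['Id']
    frame['col'] = entry['col']
    frames.append(frame)

frame = pd.concat(frames)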

Output:

  a.result.1 a.result.3  a.result.5  a.b  a.c  Id  col
0          2          4         6.0    1   ok  d0   16
1         21         41        61.0   11  ok1  d0   16
2        211        411         NaN   11  ok1  d0   16
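
As a possible alternative to the loop, json_normalize also accepts a nested record_path and nested meta paths, so the whole list can be flattened in one call. The following is a minimal sketch, assuming pandas 1.0 or later (where json_normalize is available as pd.json_normalize); the variable name flat and the final column reordering are illustrative choices, not part of the original answer.

```python
import pandas as pd

# The nested response from the question
data = [{'a': [{'b': 1, 'c': 'ok',
                'result': [{'1': '2', '3': 4, '5': 6}]},
               {'b': 11, 'c': 'ok1',
                'result': [{'1': '21', '3': 41, '5': 61},
                           {'1': '211', '3': '411'}]}],
         'Id': 'd0', 'col': 16}]

flat = pd.json_normalize(
    data,
    record_path=['a', 'result'],         # descend into 'a', then into 'result'
    record_prefix='a.result.',
    meta=[['a', 'b'], ['a', 'c'],        # fields taken from each item of 'a'
          'Id', 'col'],                  # fields taken from the top-level entry
)

# Reorder the columns to match the layout requested in the question
flat = flat[['a.b', 'a.c', 'a.result.1', 'a.result.3', 'a.result.5', 'Id', 'col']]
print(flat)
```

Because the top-level list is passed in directly, this should also cover responses with more than one top-level entry without an explicit loop. Like the loop version, it assumes every entry has the same shape (an 'a' list whose items each contain a 'result' list).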
