I have the following JSON response.

[{'a': [{'b': 1,
         'c': 'ok',
         'result': [{'1': '2', '3': 4, '5': 6}]},
        {'b': 11,
         'c': 'ok1',
         'result': [{'1': '21', '3': 41, '5': 61},
                    {'1': '211', '3': '411'}]}],
  'Id': 'd0',
  'col': 16}]

I want to normalize this into a pandas DataFrame. I know about json_normalize and have read a couple of other SO posts about it. However, my data is more deeply nested than in those examples, and I can't work out how to handle it.

What I am expecting in my output is as follows:

===============================================================
a.b | a.c | a.result.1 | a.result.3 | a.result.5 |  Id  | col 
===============================================================
1   | ok  |    2       |      4     |      6     |  d0  |  16
11  | ok1 |    21      |      41    |      61    |  d0  |  16
11  | ok1 |    211     |      411   |      None  |  d0  |  16

Any help would be greatly appreciated! I have been stuck on this for more than a day now!

Thank you!!

  • My poor eyes cannot read json in that form... Commented May 16, 2018 at 17:43
  • @AaronN.Brock: Ah, do you mean the indentation? Commented May 16, 2018 at 17:46
  • Yes, at a glance it's really hard to tell what the structure is. (Or maybe it's just me) Commented May 16, 2018 at 17:48
  • @AaronN.Brock: I made certain changes to it now. Is it readable now? Commented May 16, 2018 at 17:55

1 Answer

In my experience, pandas doesn't handle lists nested in dicts nested in lists very well. But json_normalize can handle a list inside each dict through its record_path argument. So this isn't pretty, and it's a rather brittle way of solving the problem, but here's a solution:

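import pandas as pd  # pd.json_normalize needs pandas >= 1.0; older versions: from pandas.io.json import json_normalize

data = [...]  # placeholder: the nested list from the question
frames = []

for entry in data:
    # One row per item in each 'result' list of entry['a'];
    # 'b' and 'c' are repeated on every row as metadata.
    frame = pd.json_normalize(
        entry['a'],
        record_path='result',
        record_prefix='a.result.',
        meta=['b', 'c'],
        meta_prefix='a.',
    )

    # Attach the top-level fields to every row of this entry.
    frame['Id'] = entry['Id']
    frame['col'] = entry['col']
    frames.append(frame)

# ignore_index gives a clean 0..n-1 index when data has several entries.
frame = pd.concat(frames, ignore_index=True)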

Output:

  a.result.1 a.result.3  a.result.5  a.b  a.c  Id  col
0          2          4         6.0    1   ok  d0   16
1         21         41        61.0   11  ok1  d0   16
2        211        411         NaN   11  ok1  d0   16
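
If json_normalize feels too fragile for this shape, another option is to flatten the structure by hand and give pandas a plain list of dicts. Below is a minimal sketch of that idea, not a definitive implementation. It assumes every top-level entry has 'a', 'Id' and 'col', and that every item under 'a' has 'b', 'c' and 'result'. The helper name flatten_rows is made up for illustration.

```python
import pandas as pd

# Sample data copied from the question.
data = [{'a': [{'b': 1, 'c': 'ok',
                'result': [{'1': '2', '3': 4, '5': 6}]},
               {'b': 11, 'c': 'ok1',
                'result': [{'1': '21', '3': 41, '5': 61},
                           {'1': '211', '3': '411'}]}],
         'Id': 'd0',
         'col': 16}]


def flatten_rows(entries):
    """Return one flat dict per item in every 'result' list (hypothetical helper)."""
    rows = []
    for entry in entries:
        for item in entry['a']:
            for result in item['result']:
                row = {'a.b': item['b'], 'a.c': item['c']}
                # Prefix the inner keys so they match the requested column names.
                row.update({f'a.result.{key}': value for key, value in result.items()})
                row['Id'] = entry['Id']
                row['col'] = entry['col']
                rows.append(row)
    return rows


rows = flatten_rows(data)
columns = ['a.b', 'a.c', 'a.result.1', 'a.result.3', 'a.result.5', 'Id', 'col']
# Keys missing from a row (here 'a.result.5' in the last one) become NaN.
frame = pd.DataFrame(rows, columns=columns)
print(frame)
```

Passing columns explicitly fixes the column order to the one asked for in the question. The trade-off is that the nesting is hard-coded in the loops, so this is just as tied to the exact shape of the data as the json_normalize version.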