0

I need to convert a JSON dictionary to a Pandas DataFrame, but the embedding is tripping me up.

Here is basically what the JSON dict looks like.

{
  "report": "{'name':{'data':[{'key 1':'value 1','key 2':'value 2'},{'key 1':'value 1','key 2':'value 2'}]}}"
}

In the DataFrame, I want the keys to be the column headers and values in the rows below them.

The extra layer of embedding is throwing me off somewhat from all the usual methods of doing this.

One tricky part is 'name' will change each time I get this JSON dict, so I can't use an exact sting value for 'name'.

0

2 Answers 2

1

Your JSON looks a bit odd. It looks more like a Python dict converted to a string, so you can use ast.literal_eval (a built-in function) to convert it to a real dict, and then use pd.json_normalize to get it into a dataframe form:

import ast

j = ...
parsed_json = ast.literal_eval(j['report'])
df = pd.json_normalize(parsed_json, record_path=[list(parsed_json)[0], 'data'])

Output:

>>> df
     key 1    key 2
0  value 1  value 2
1  value 1  value 2
Sign up to request clarification or add additional context in comments.

8 Comments

that works, thank you, but one tricky part which I only added a few minutes after the question is, 'name' will change each time I look at this data. Is there a way to do the above, not knowing what the 'name' will be. The nesting will always be the same.
Try it now, @Jack. It'll dynamically use the first key found in the object instead of just 'name' :)
that works like a charm. Many thanks. I would have been all day figuring that out. I always find converting JSON to dfs tough. It's never nicely nested in the real world like the examples I always see given. Thanks again.
I kinda agree with you actually. There are a ton of questions here about stuff like pivots, merges, and JSON parsing. They should provide more examples.
Basically ast.literal_eval says to "evaluate this code as if it were Python code". However, it's perfectly safe to run, but it doesn't just execute any code. Only code such as string or number or dict literals; hence literal_eval.
|
0

The error suggests that you're trying to index the strings (because the value under report is a string) using another string.

You just need ast.literal_eval to parse the string and a DataFrame constructor. If the "name" is unknown, you can iterate over the dict.values after you parse the string.

import ast
out = pd.DataFrame(y for x in ast.literal_eval(your_data['report']).values() for y in x['data'])

Output:

     key 1    key 2
0  value 1  value 2
1  value 1  value 2

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.