I have a list of dictionaries in which each item contains a nested list of dictionaries, like this:

scores = [{"Student":"Adam","Subjects":[{"Name":"Math","Score":85},{"Name":"Science","Score":90}]},
     {"Student":"Bec","Subjects":[{"Name":"Math","Score":70},{"Name":"English","Score":100}]}]

If I call pd.DataFrame directly on this list, each student's subjects stay packed in a single Subjects column instead of being expanded into rows.

What should I do to get a DataFrame that looks like this?

Student   Subject.Name   Subject.Score
 Adam         Math            85
 Adam         Science         90
 Bec          Math            70
 Bec          English         100

Thanks very much

1 Answer

Use pd.json_normalize, passing 'Subjects' as the record path and 'Student' as the metadata column, then rename the columns:

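import pandas as pd

# 'Subjects' is expanded into one row per subject; 'Student' is repeated on each row
df = (pd.json_normalize(scores, 'Subjects', 'Student')
        .rename(columns={'Name':'Subject.Name','Score':'Subject.Score'}))
print(df)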
  Subject.Name  Subject.Score Student
0         Math             85    Adam
1      Science             90    Adam
2         Math             70     Bec
3      English            100     Bec
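
The column order raised in the comments can be handled in the same call. The following is a minimal sketch, assuming pandas 1.0 or later (where pd.json_normalize is available at the top level): record_prefix adds the Subject. prefix without a separate rename, and a final column selection moves Student to the front. The sample data is copied from the question.

```python
import pandas as pd

scores = [
    {"Student": "Adam", "Subjects": [{"Name": "Math", "Score": 85},
                                     {"Name": "Science", "Score": 90}]},
    {"Student": "Bec", "Subjects": [{"Name": "Math", "Score": 70},
                                    {"Name": "English", "Score": 100}]},
]

# record_path: the nested list to expand into rows
# meta: top-level field(s) repeated on every expanded row
# record_prefix: prepended to each column that comes from the nested records
df = pd.json_normalize(scores, record_path="Subjects", meta="Student",
                       record_prefix="Subject.")

# Meta columns are appended after the record columns, so reorder explicitly
df = df[["Student", "Subject.Name", "Subject.Score"]]
print(df)
```

Selecting the columns by name keeps the result stable even if more meta fields are added later.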

Or use a list comprehension with a dict comprehension and the DataFrame constructor:

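# Merge each student's top-level fields with the prefixed fields of each subject.
# Note: x.pop('Subjects') removes that key from the dicts in scores (the input is modified).
df = pd.DataFrame([{**x, **{f'Subject.{k}': v for k, v in y.items()}}
                   for x in scores for y in x.pop('Subjects')])
print(df)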
  Student Subject.Name  Subject.Score
0    Adam         Math             85
1    Adam      Science             90
2     Bec         Math             70
3     Bec      English            100
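
Because the comprehension calls x.pop('Subjects'), the Subjects key is removed from every dictionary in scores, and running the same line a second time raises a KeyError. If the input needs to stay intact, one possible variant (a sketch, not the answerer's code; the names rows, base, student and subject are only illustrative) builds the rows without touching the original dictionaries:

```python
import pandas as pd

scores = [
    {"Student": "Adam", "Subjects": [{"Name": "Math", "Score": 85},
                                     {"Name": "Science", "Score": 90}]},
    {"Student": "Bec", "Subjects": [{"Name": "Math", "Score": 70},
                                    {"Name": "English", "Score": 100}]},
]

rows = []
for student in scores:
    # Copy every top-level field except the nested list itself
    base = {k: v for k, v in student.items() if k != "Subjects"}
    for subject in student["Subjects"]:
        # Prefix the nested keys, then merge them with the top-level fields
        rows.append({**base, **{f"Subject.{k}": v for k, v in subject.items()}})

df = pd.DataFrame(rows)
print(df)
```

The explicit loop is longer than the one-liner, but scores still contains the Subjects lists afterwards and the code can be rerun safely.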

4 Comments

Thanks, json_normalize works well for me. It took me a while to figure out that I need to put the group-by item last, but apart from that it works like a charm.
@YingdongZhai - Regarding "Took me a while to figure out that I need to put the group-by item last": do you need Student as the first column? Then use this solution.
Hi @jezrael, what if I have another field, like gender? Is it possible to normalise it as well? For example, if the list changes to: scores = [{"Student":"Adam","Gender":"M","Subjects":[{"Name":"Math","Score":85},{"Name":"Science","Score":90}]}, {"Student":"Bec","Gender":"F","Subjects":[{"Name":"Math","Score":70},{"Name":"English","Score":100}]}], is it possible to show a Gender column in the output df? Thanks
@YingdongZhai - Then use the second solution.
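
The second solution copies every top-level key, so an extra field such as Gender is carried along without changes. For the json_normalize route, the meta argument also accepts a list of field names. The sketch below assumes the extended data from the comment above and an explicit column order chosen here for illustration:

```python
import pandas as pd

scores = [
    {"Student": "Adam", "Gender": "M",
     "Subjects": [{"Name": "Math", "Score": 85}, {"Name": "Science", "Score": 90}]},
    {"Student": "Bec", "Gender": "F",
     "Subjects": [{"Name": "Math", "Score": 70}, {"Name": "English", "Score": 100}]},
]

# Passing a list to meta repeats several top-level fields on every expanded row
df = pd.json_normalize(scores, record_path="Subjects", meta=["Student", "Gender"])
df = df.rename(columns={"Name": "Subject.Name", "Score": "Subject.Score"})

# Put the meta columns first (this ordering is a presentation choice)
df = df[["Student", "Gender", "Subject.Name", "Subject.Score"]]
print(df)
```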
