1

Got a pandas dataframe with the below structure

0    [{'review_id': 4873356, 'rating': '5.0'}, {'review_id': 4973356, 'rating': '4.0'}]
1    [{'review_id': 4635892, 'rating': '5.0'}, {'review_id': 4645839, 'rating': '3.0'}] 
....
....

I would like to flatten into a dataframe with the following columns review_id and rating

I was trying out pd.DataFrame(df1.values.flatten()) but looks like I'm getting something basic not right, need help!!!

2
  • How to recreate the original dataframe? Commented Feb 11, 2019 at 18:09
  • This is the original json { "status": true, "total_reviews": 323, "data": [ { "review_id": 4873356, "rating": "5.0" }, { "review_id": 4854192, "rating": "5.0" }, { "review_id": 4848567, "rating": "5.0" } ] } Commented Feb 11, 2019 at 18:13

2 Answers 2

3

You wind up getting an array of lists of dicts so need:

import pandas as pd
pd.DataFrame([x for y in df1.values for x in y])

  rating  review_id
0    5.0    4873356
1    4.0    4973356
2    5.0    4635892
3    3.0    4645839

Or if willing to use itertools:

from itertools import chain

pd.DataFrame(chain.from_iterable(df1.values.ravel()))
Sign up to request clarification or add additional context in comments.

Comments

2

1st unnesting , then re build your dataframe (assuming you have columns name 0)

pd.DataFrame(unnesting(df,[0])[0].values.tolist())
Out[61]: 
  rating  review_id
0    5.0    4873356
1    4.0    4973356
2    5.0    4635892
3    3.0    4645839

def unnesting(df, explode):
    idx=df.index.repeat(df[explode[0]].str.len())
    df1=pd.concat([pd.DataFrame({x:np.concatenate(df[x].values)} )for x in explode],axis=1)
    df1.index=idx
    return df1.join(df.drop(explode,1),how='left')

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.