I have a dataframe that looks like this:
| Title | Ratings |
|---|---|
| Do schools kill creativity? | [{'id': 7, 'name': 'Funny', 'count': 19645}, {'id': 1, 'name': 'Beautiful', 'count': 4573}, {'id': 9, 'name': 'Ingenious', 'count': 6073}, {'id': 3, 'name': 'Courageous', 'count': 3253}, {'id': 11, 'name': 'Longwinded', 'count': 387}, {'id': 2, 'name': 'Confusing', 'count': 242}, {'id': 8, 'name': 'Informative', 'count': 7346}, {'id': 22, 'name': 'Fascinating', 'count': 10581}, {'id': 21, 'name': 'Unconvincing', 'count': 300}, {'id': 24, 'name': 'Persuasive', 'count': 10704}, {'id': 23, 'name': 'Jaw-dropping', 'count': 4439}, {'id': 25, 'name': 'OK', 'count': 1174}, {'id': 26, 'name': 'Obnoxious', 'count': 209}, {'id': 10, 'name': 'Inspiring', 'count': 24924}] |
| Simple designs to save a life | [{'id': 9, 'name': 'Ingenious', 'count': 269}, {'id': 3, 'name': 'Courageous', 'count': 92}, {'id': 7, 'name': 'Funny', 'count': 131}, {'id': 2, 'name': 'Confusing', 'count': 42}, {'id': 1, 'name': 'Beautiful', 'count': 91}, {'id': 8, 'name': 'Informative', 'count': 446}, {'id': 10, 'name': 'Inspiring', 'count': 397}, {'id': 22, 'name': 'Fascinating', 'count': 515}, {'id': 11, 'name': 'Longwinded', 'count': 45}, {'id': 21, 'name': 'Unconvincing', 'count': 49}, {'id': 24, 'name': 'Persuasive', 'count': 1234}, {'id': 25, 'name': 'OK', 'count': 73}, {'id': 23, 'name': 'Jaw-dropping', 'count': 139}, {'id': 26, 'name': 'Obnoxious', 'count': 21}] |
I want to parse the data in Ratings so that it looks like this:
| Title | Rating | Count |
|---|---|---|
| Do schools kill creativity? | Funny | 19645 |
| Do schools kill creativity? | Beautiful | 4573 |
I've tried exploding the data using } as a delimiter:
#explode ratings by title
df['ratings'] = df['ratings'].str.split('}')
df_explode_ratings = df.explode('ratings').reset_index(drop=True)
cols = list(df_explode_ratings.columns)
cols.append(cols.pop(cols.index('title')))
df_explode_ratings = df_explode_ratings[cols]
df_explode_cols = ['title', 'ratings']
df_explode_ratings = df_explode_ratings.drop(columns=[col for col in df_explode_ratings if col not in df_explode_cols])
This works, but I still need to parse it further. I was going to split again on , but wound up with NaN values in the Ratings column.
Is there a way to parse Ratings using the json module?
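For reference, here is a minimal sketch of one way to get the Title / Rating / Count layout without splitting on } at all. It assumes each cell in the ratings column is a string holding a Python-style list of dicts, as in the table above, and that the columns are named title and ratings, as in the code above. Because those strings use single quotes, json.loads would reject them as invalid JSON, so the sketch uses ast.literal_eval from the standard library instead. The two-row frame is made-up sample data trimmed from the table.

```python
import ast

import pandas as pd

# Sample frame trimmed from the question; the column names 'title' and
# 'ratings' are assumed from the code in the question.
df = pd.DataFrame({
    "title": ["Do schools kill creativity?", "Simple designs to save a life"],
    "ratings": [
        "[{'id': 7, 'name': 'Funny', 'count': 19645}, "
        "{'id': 1, 'name': 'Beautiful', 'count': 4573}]",
        "[{'id': 9, 'name': 'Ingenious', 'count': 269}, "
        "{'id': 3, 'name': 'Courageous', 'count': 92}]",
    ],
})

# Parse each string into a real list of dicts, then emit one output row
# per (title, rating) pair.
rows = [
    {"Title": title, "Rating": rating["name"], "Count": rating["count"]}
    for title, ratings in zip(df["title"], df["ratings"])
    for rating in ast.literal_eval(ratings)
]

result = pd.DataFrame(rows, columns=["Title", "Rating", "Count"])
print(result)
```

If the column already holds real lists rather than strings, the ast.literal_eval call can be dropped and the loop can iterate over the lists directly.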