class_id class code id
8 XYZ A 1
8 XYZ B 2
9 ABC C 3

I have a dataframe like the one above. I want to transform it so that a new 'codes' column collects, for each class, all of its unique (code, id) pairs in a JSON-like format, as shown below.

class_id class codes
8 XYZ [{'code': 'A', 'id': 1}, {'code': 'B', 'id': 2}]
9 ABC [{'code': 'C', 'id': 3}]

1 Answer


You could use groupby.apply where you pass in a lambda that uses the to_dict method:

out = df.groupby(['class_id','class'])[['code','id']].apply(lambda x: x.to_dict('records')).reset_index(name='codes')

Output:

   class_id class                                             codes
0         8   XYZ  [{'code': 'A', 'id': 1}, {'code': 'B', 'id': 2}]
1         9   ABC                          [{'code': 'C', 'id': 3}]
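
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and adds one duplicate row, which is my own addition for illustration and not part of the original data. Because the question asks for unique (code, id) pairs, the sketch calls drop_duplicates before grouping; that step is an assumption on my part and can be dropped if the data never repeats a pair.

```python
import pandas as pd

# Sample data from the question, plus one repeated row (hypothetical,
# added only to show the effect of drop_duplicates).
df = pd.DataFrame({
    "class_id": [8, 8, 9, 8],
    "class": ["XYZ", "XYZ", "ABC", "XYZ"],
    "code": ["A", "B", "C", "A"],
    "id": [1, 2, 3, 1],
})

out = (
    df.drop_duplicates(["class_id", "class", "code", "id"])  # keep each pair once
      .groupby(["class_id", "class"])[["code", "id"]]
      .apply(lambda g: g.to_dict("records"))  # one list of dicts per group
      .reset_index(name="codes")
)

print(out)
```

Each cell in 'codes' is a Python list of dicts rather than a JSON string. If a string is needed, g.to_json(orient="records") inside the lambda should give one per group.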

2 Comments

Thanks for the answers. What would be the best way to rename 'code' and 'id' inside the JSON to something else? Something like {'C': 'B', 'I': 2}.
@hedebyhedge You could rename the columns first, then use groupby.apply.
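
A rough sketch of what that rename-then-group suggestion might look like. The key_map name and the target keys 'C' and 'I' are taken from the comment's example and are otherwise hypothetical; swap in whatever key names you need.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "class_id": [8, 8, 9],
    "class": ["XYZ", "XYZ", "ABC"],
    "code": ["A", "B", "C"],
    "id": [1, 2, 3],
})

# Hypothetical mapping from the original column names to the desired dict keys.
key_map = {"code": "C", "id": "I"}

out = (
    df.rename(columns=key_map)  # rename first so to_dict picks up the new keys
      .groupby(["class_id", "class"])[list(key_map.values())]
      .apply(lambda g: g.to_dict("records"))
      .reset_index(name="codes")
)

print(out)
```

Renaming before grouping means the lambda stays unchanged; only the column selection has to use the new names.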
