I'm trying to produce a custom JSON string from values grouped by another column.
Consider the following data:
>>> import pandas as pd
>>> df = pd.DataFrame(
...     [
...         ["c1", "a1", "123"],
...         ["c1", "a2", "456"],
...         ["c2", "a1", "789"],
...     ],
...     index=["row1", "row2", "row3"],
...     columns=["col1", "col2", "col3"],
... )
From this DataFrame, I want one row per distinct value of col1, with a second column holding a JSON string that lists the col2 and col3 values of every row in that group.
For example,
>>> new_df
     key_id  json_string
row1 c1      '{"values":[{"col2":"a1", "col3":"123"}, {"col2":"a2", "col3":"456"}]}'
row2 c2      '{"values":[{"col2":"a1", "col3":"789"}]}'
I'm new to pandas, but it looks like some combination of apply and to_json would achieve what I want. Can someone help me figure this out?
Thanks!
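
A minimal sketch of one way this might be done, assuming a plain groupby on col1 is acceptable. It builds each string with to_dict(orient="records") and json.dumps rather than to_json, so that the list can be wrapped in the outer "values" key. The helper name group_to_json is made up for illustration, and the result has a default integer index rather than the row1/row2 labels shown in the question.

```python
import json

import pandas as pd

df = pd.DataFrame(
    [
        ["c1", "a1", "123"],
        ["c1", "a2", "456"],
        ["c2", "a1", "789"],
    ],
    index=["row1", "row2", "row3"],
    columns=["col1", "col2", "col3"],
)


def group_to_json(group: pd.DataFrame) -> str:
    """Hypothetical helper: serialize one group's col2/col3 rows as JSON."""
    # One dict per row, e.g. {"col2": "a1", "col3": "123"}
    records = group[["col2", "col3"]].to_dict(orient="records")
    # Wrap the list in the outer {"values": [...]} object
    return json.dumps({"values": records})


new_df = (
    df.groupby("col1")[["col2", "col3"]]   # one group per distinct col1 value
    .apply(group_to_json)                  # Series of JSON strings indexed by col1
    .reset_index(name="json_string")       # back to a two-column DataFrame
    .rename(columns={"col1": "key_id"})
)

print(new_df)
```

Note that json.dumps puts a space after each colon and comma by default, so the strings differ from the hand-written ones in whitespace only; passing separators=(",", ":") would make them compact.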