I need to aggregate several columns by one column. I have the following code, which works, but only for one column, and I am struggling to adapt it to several columns.
import pandas as pd
# Sample DataFrame
data = {
    'Group': ['A', 'A', 'B', 'B', 'A', 'B'],
    'Value': [1, 2, 3, 4, 5, 6],
    'Qty': [100, 202, 403, 754, 855, 1256]
}
df = pd.DataFrame(data)
print(df)
result = df.groupby('Group')['Value'].apply(lambda x: pd.Series([', '.join(map(str, x))])).reset_index()
print(result)
This produces a table with the column "Group" (the groupby) and one column for "Value", but I need another column with the aggregate output for the variable Qty. Actually, my dataset has 12 variables that I need to aggregate. Any suggestion?
Thank you in advance and Happy 2024!!
Have you tried the df.groupby(...).agg(...) function? Read its docs here: DataFrameGroupBy.agg

df.astype(str).groupby('Group', as_index=False).agg(', '.join)
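
To make the agg suggestion concrete, here is a minimal sketch using the sample data from the question. The list name cols is an assumption for illustration: with 12 variables you would list all of them there. Unlike the one-liner above, this version converts only the selected columns to strings inside the aggregation, which may be preferable if the frame has other columns you do not want to touch.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'Group': ['A', 'A', 'B', 'B', 'A', 'B'],
    'Value': [1, 2, 3, 4, 5, 6],
    'Qty': [100, 202, 403, 754, 855, 1256],
})

# Hypothetical list of the columns to aggregate; extend it to all 12 variables
cols = ['Value', 'Qty']

# For each group, join the values of every selected column into one string
result = (
    df.groupby('Group', as_index=False)[cols]
      .agg(lambda s: ', '.join(s.astype(str)))
)
print(result)
```

The same function is applied to every column in cols, so adding a variable only means adding its name to the list.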