Hello, I am working with a dataset in a .csv file using Python and pandas, and I am getting errors when grouping the columns. The code I'm working with is:
import pandas as pd
df = pd.read_excel('filepath')
df['Items'].str.split(',', expand=True)
df = df.groupby(['Items0', 'Items1', 'Items2', 'Items3', 'Items4', 'Items5']).size()
print(df)
When I run print(df), I get values such as Items0-1, Items1-1, Items2-1, and so on.
The sample data I am working with, and how I am trying to organize it, are shown below.
Can someone point me to how to solve this?
Sample data:
| Name | Date | Items |
|---|---|---|
| johnny smith | 09/1/2021 | bread, oranges, peanut butter, apples, celery, peanuts |
| granny smith | 08/31/2021 | oranges, peanut butter, apples, bread |
| jane doe | 09/01/2021 | oranges, apples, celery, peanut butter |
| jack frost | 08/01/2021 | bread, oranges, apples |
| cinderella | 08/16/2021 | apples, peanuts, bread |
What I am attempting to achieve:
| Name | Date | Items0 | Items1 | Items2 | Items3 | Items4 | Items5 |
|---|---|---|---|---|---|---|---|
| johnny smith | 09/1/2021 | bread | oranges | peanut butter | apples | celery | peanuts |
| granny smith | 08/31/2021 | bread | oranges | peanut butter | apples | | |
| jane doe | 09/01/2021 | oranges | peanut butter | apples | | | |
| jack frost | 08/01/2021 | bread | oranges | apples | | | |
| cinderella | 08/16/2021 | bread | apples | peanuts | | | |
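One possible approach, offered as a minimal sketch rather than a definitive fix. In the code above, the result of `str.split(..., expand=True)` is never assigned back to `df`, so the `Items0` to `Items5` columns do not exist when `groupby` runs. `groupby(...).size()` would also count combinations of values rather than reshape the table. The sketch below rebuilds the sample data inline so it runs on its own. For a real .csv file, `pd.read_csv('filepath')` would likely be the loader to use instead of `pd.read_excel`. The `ORDER` list and the `split_and_order` helper are hypothetical names, and the ordering itself is an assumption inferred from the desired table.

```python
import pandas as pd

# Sample data from the question, built inline so the sketch is self-contained.
df = pd.DataFrame({
    "Name": ["johnny smith", "granny smith", "jane doe", "jack frost", "cinderella"],
    "Date": ["09/1/2021", "08/31/2021", "09/01/2021", "08/01/2021", "08/16/2021"],
    "Items": [
        "bread, oranges, peanut butter, apples, celery, peanuts",
        "oranges, peanut butter, apples, bread",
        "oranges, apples, celery, peanut butter",
        "bread, oranges, apples",
        "apples, peanuts, bread",
    ],
})

# ASSUMPTION: the desired table appears to list items in this fixed order.
ORDER = ["bread", "oranges", "peanut butter", "apples", "celery", "peanuts"]


def split_and_order(cell):
    """Split a comma-separated cell, trim whitespace, and sort by ORDER.

    Raises ValueError if an item is not listed in ORDER.
    """
    items = [item.strip() for item in cell.split(",")]
    return sorted(items, key=ORDER.index)


# Build one column per item position; shorter rows are padded, then blanked.
items = pd.DataFrame(df["Items"].apply(split_and_order).tolist(), index=df.index)
items = items.add_prefix("Items").fillna("")

# Attach the new Items0..Items5 columns in place of the original Items column.
result = df.drop(columns="Items").join(items)
print(result)
```

If the original item order should be kept instead, dropping the `sorted(...)` call and returning `items` directly would do that. Note that this sketch keeps every item from the source row, so jane doe's row would also include celery.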