
I have the dataset below from a survey that gives each participant a list of foods and asks them to rate how likely they are to eat each one this week. I want to plot, for each food type, the count of responses at each likelihood level.

Person    Food     Label
John      Pizza    Likely
John      Chinese  Unlikely
John      French   Very Unlikely
Debbie    Pizza    Unlikely
Debbie    Chinese  Very Likely
Debbie    French   Very Unlikely

For example:

Pizza     Likely         1
Pizza     Unlikely       1
Chinese   Unlikely       1
Chinese   Very Likely    1
French    Very Unlikely  2

So far I have read my file into a dataframe and done some basic cleaning.

import pandas as pd

raw_data = pd.read_excel('my_file_path')

# cleaning code omitted; copy() is only a placeholder so the snippet runs
clean_data = raw_data.copy()

results = clean_data.groupby(['Food', 'Label']).count()

1 Answer

I believe you need to select the Person column after the groupby, reshape the counts with unstack so that each Label becomes a column, and then plot with DataFrame.plot.bar:

results = clean_data.groupby(['Food', 'Label'])['Person'].count().unstack(fill_value=0)

Another solution with crosstab:

results = pd.crosstab(clean_data['Food'], clean_data['Label'])

print(results)
Label    Likely  Unlikely  Very Likely  Very Unlikely
Food                                                 
Chinese       0         1            1              0
French        0         0            0              2
Pizza         1         1            0              0

results.plot.bar()
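To try both approaches without the Excel file, here is a minimal sketch that rebuilds the six sample rows from the question by hand. The names `by_groupby`, `by_crosstab` and `same_counts` are only illustrative, and the real data may need extra cleaning first:

```python
import pandas as pd

# Sample rows copied from the question (a stand-in for the Excel file)
clean_data = pd.DataFrame({
    "Person": ["John", "John", "John", "Debbie", "Debbie", "Debbie"],
    "Food": ["Pizza", "Chinese", "French", "Pizza", "Chinese", "French"],
    "Label": ["Likely", "Unlikely", "Very Unlikely",
              "Unlikely", "Very Likely", "Very Unlikely"],
})

# Approach 1: count Person per (Food, Label), then pivot Label into columns
by_groupby = (
    clean_data.groupby(["Food", "Label"])["Person"]
    .count()
    .unstack(fill_value=0)
)

# Approach 2: let crosstab build the same Food x Label count table
by_crosstab = pd.crosstab(clean_data["Food"], clean_data["Label"])

# Check that both tables hold the same counts, cell by cell
same_counts = bool((by_groupby.to_numpy() == by_crosstab.to_numpy()).all())
print(by_crosstab)
print(same_counts)
```

Calling `by_crosstab.plot.bar()` on either table should then draw one group of bars per food, provided matplotlib is installed.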


1 Comment

When I unstack, I can no longer access rows and columns by their labels, e.g. results['Food'] or results['Label'].
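This happens because, after unstack or crosstab, Food becomes the row index and the Label values become the column names, so neither is a regular column any more. One possible way around it is sketched below, using the question's sample rows as a stand-in for the real data. The names `pizza_counts`, `likely_counts` and `long_counts`, and the `Count` column, are illustrative:

```python
import pandas as pd

# Sample rows copied from the question (a stand-in for the Excel file)
clean_data = pd.DataFrame({
    "Person": ["John", "John", "John", "Debbie", "Debbie", "Debbie"],
    "Food": ["Pizza", "Chinese", "French", "Pizza", "Chinese", "French"],
    "Label": ["Likely", "Unlikely", "Very Unlikely",
              "Unlikely", "Very Likely", "Very Unlikely"],
})
results = pd.crosstab(clean_data["Food"], clean_data["Label"])

# Rows are addressed through the index with .loc, columns by label value
pizza_counts = results.loc["Pizza"]   # counts for one food
likely_counts = results["Likely"]     # counts for one likelihood level

# To get Food and Label back as ordinary columns, return to long form
long_counts = results.reset_index().melt(
    id_vars="Food", var_name="Label", value_name="Count"
)
print(long_counts)
```

The wide table is convenient for plotting, while the long form keeps `long_counts['Food']` and `long_counts['Label']` available for filtering.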
