I spent a few hours searching for an answer, but I can't seem to find one.

Long story short, I have a dataframe. The following code will produce the dataframe in question (albeit anonymised with random numbers):

import pandas as pd

variable1 = ["Attribute 1","Attribute 1","Attribute 1","Attribute 1","Attribute 1","Attribute 1","Attribute 2","Attribute 2",
         "Attribute 2","Attribute 2","Attribute 2","Attribute 2","Attribute 3","Attribute 3","Attribute 3","Attribute 3",
         "Attribute 3","Attribute 3","Attribute 4","Attribute 4","Attribute 4","Attribute 4","Attribute 4","Attribute 4",
         "Attribute 5","Attribute 5","Attribute 5","Attribute 5","Attribute 5","Attribute 5"]


variable2 = ["Property1","Property2","Property3","Property4","Property5","Property6","Property1","Property2","Property3",
         "Property4","Property5","Property6","Property1","Property2","Property3",
         "Property4","Property5","Property6","Property1","Property2","Property3","Property4",
         "Property5","Property6","Property1","Property2","Property3","Property4","Property5","Property6"]

number = [93,224,192,253,186,266,296,100,135,169,373,108,211,194,164,375,211,71,120,334,59,164,348,50,249,18,251,343,172,41]

bar = pd.DataFrame({"variable1":variable1, "variable2":variable2, "number":number})

bar_grouped = bar.groupby(["variable1","variable2"]).sum()

The outcome should look like:

[image: the bar DataFrame]

And the second one:

[image: the bar_grouped DataFrame]

I have been trying to plot them as a bar chart, with the Properties as the groups and the different Attributes as the bars within each group, similar to the chart below (which I plotted manually in Excel). I would prefer to plot from the grouped DataFrame, so that I can plot different groupings without having to reset the index each time.

[image: grouped bar chart plotted manually in Excel]

I hope this is clear.

Any help on this is hugely appreciated.

Thanks! :)

Try bar_grouped['number'].unstack(0).plot(kind='bar')
Commented Aug 6, 2019 at 12:49

3 Answers

4

I wouldn't bother creating your groupby result, since you aren't aggregating anything. This is a pivot:


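import matplotlib.pyplot as plt

# pivot arguments are keyword-only in pandas 2.0 and later
bar.pivot(index='variable2', columns='variable1', values='number').plot(kind='bar')

plt.tight_layout()
plt.show()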

[image: resulting grouped bar chart]
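
To see why a pivot is enough here, this is a minimal sketch using only the first two Properties of Attribute 1 and Attribute 2 from the question's data (the names `demo` and `wide` are just for illustration). It prints the wide table that `.plot(kind='bar')` works from: each row becomes a group on the x-axis and each column becomes one bar within the group.

```python
import pandas as pd

# Four rows taken from the question's data, kept small for readability.
demo = pd.DataFrame({
    "variable1": ["Attribute 1", "Attribute 1", "Attribute 2", "Attribute 2"],
    "variable2": ["Property1", "Property2", "Property1", "Property2"],
    "number": [93, 224, 296, 100],
})

# Rows (variable2) become the x-axis groups,
# columns (variable1) become the bars within each group.
wide = demo.pivot(index="variable2", columns="variable1", values="number")
print(wide)
```

Swapping the `index` and `columns` arguments should give the opposite grouping, with Attributes on the x-axis and Properties as the bars.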


If aggregation is required, you can still start from your bar DataFrame and use pivot_table:

bar.pivot_table(index='variable2', columns='variable1', values='number', aggfunc='sum')
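
As a rough illustration of the difference, the sketch below uses made-up data (the `dup` frame is hypothetical, not from the question) in which one (variable1, variable2) pair appears twice. Under that assumption, `pivot_table` sums the repeated pair, while plain `pivot` raises an error because it cannot decide which value to keep.

```python
import pandas as pd

# Hypothetical data: ("Attribute 1", "Property1") appears twice.
dup = pd.DataFrame({
    "variable1": ["Attribute 1", "Attribute 1", "Attribute 1", "Attribute 2"],
    "variable2": ["Property1", "Property1", "Property2", "Property1"],
    "number": [10, 5, 7, 3],
})

# pivot_table aggregates the repeated pair; combinations that never
# occur in the data are left as NaN.
table = dup.pivot_table(index="variable2", columns="variable1",
                        values="number", aggfunc="sum")
print(table)

# Plain pivot refuses duplicated index/column pairs.
try:
    dup.pivot(index="variable2", columns="variable1", values="number")
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print("pivot failed:", exc)
```

The resulting `table` can be plotted the same way as the pivot result, with `.plot(kind='bar')`.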

1 Comment

Aggregation is required, as this is only part of a larger DataFrame that has more values that will eventually need to be aggregated, so thanks for that! :)
3

Use unstack first:

bar_grouped['number'].unstack(0).plot(kind='bar')

[out]

[image: resulting grouped bar chart]
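
Since the question asks about switching groupings without resetting the index, here is a small sketch of how the choice of unstacked level changes the layout. It reuses four rows of the question's data; the names `grouped`, `by_property` and `by_attribute` are only illustrative. Passing the level name instead of its position should be equivalent and may read more clearly.

```python
import pandas as pd

# Four rows taken from the question's data.
demo = pd.DataFrame({
    "variable1": ["Attribute 1", "Attribute 1", "Attribute 2", "Attribute 2"],
    "variable2": ["Property1", "Property2", "Property1", "Property2"],
    "number": [93, 224, 296, 100],
})
grouped = demo.groupby(["variable1", "variable2"]).sum()

# Move variable1 into the columns: Properties stay on the index (x-axis),
# Attributes become the bars. Same as unstack(0).
by_property = grouped["number"].unstack("variable1")

# Move variable2 into the columns instead: Attributes on the x-axis,
# Properties as the bars. Same as unstack(1).
by_attribute = grouped["number"].unstack("variable2")

print(by_property)
print(by_attribute)
```

Calling `.plot(kind='bar')` on either result would then draw the corresponding grouping.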

2

The code below should do what you are trying to achieve:

import numpy as np
import matplotlib.pyplot as plt

# set width of bar
barWidth = 0.25
f = plt.figure(figsize=(15,8))

bars={}
bar_pos={}
for i, proprty in enumerate(bar_grouped.unstack().columns.droplevel(0).tolist()):
    bars[i] = bar_grouped.unstack()['number', proprty].tolist()
    if i == 0:
        # first series: one slot per group, spaced 2 units apart
        bar_pos[i] = 2 * np.arange(len(bars[i]))
    else:
        # later series: shift right by one bar width from the previous series
        bar_pos[i] = [x + barWidth for x in bar_pos[i-1]]
    plt.bar(bar_pos[i], bars[i], width=barWidth, edgecolor='white', label=proprty, figure=f)

# Add xticks on the middle of the group bars
plt.xlabel('group', fontweight='bold')
plt.xticks([2*r + 2*barWidth for r in range(len(bars[0]))], bar_grouped.unstack().index.tolist())
# plt.figure(figsize=(10,5))

# Create legend & Show graphic
plt.legend(loc=0)
plt.show()

I took the solution from here and modified it to fit your needs. Hope this helps!
