How to draw a bar plot in Python using matplotlib with a special design

Question

I want to display a bar plot comparing algorithms from different publications.

The data has the following properties:

Year of publication (this is what I want my x-axis to be)
Score (this is the bar height)
Data type (this will set the color of each bar)

I am having trouble making this happen (and I haven't even gotten to the third requirement yet).

Here is some example data and code:

import numpy as np
import matplotlib.pyplot as plt

dtypes = ['type1', 'type2', 'type3']
names = ['name1','name2','name3', 'name4']
score = [89.2, 95.54, 85, 86]
years = [2016, 2017, 2016, 2015]
methods_dtype = ['type1', 'type2', 'type1', 'type1']
pub_years = np.unique(years)

fig, ax = plt.subplots()
barplot = ax.bar(years, score)
plt.show()

The first problem is that the two bars for 2016 are drawn on top of each other. (I saw some examples that shift the bars incrementally using the width, but in this case I do not know beforehand how many methods there will be in a given year.)
The second problem is coding the colors.

Note that the input is just a subset of the data. There may be a year with multiple entries (more than one publication for a specific year). There may also be a data type with multiple entries (more than one method that operates on this data type).

If you have multiple datapoints for a year, how do you want that displayed?
– wwii, Commented Aug 31, 2017 at 23:14
@wwii Multiple data points for a year should be displayed as side-by-side bars, color-coded by their data type property.
– itzik Ben Shabat, Commented Sep 1, 2017 at 6:20

Julien · Accepted Answer · 2017-09-01 05:00:22Z

Here's an example of what you can do. It's up to you to adapt it to your exact needs:

import matplotlib.pyplot as plt

score = range(1,7)
years = [2015, 2016, 2017]*2
methods_dtype = ['type1', 'type2']*3

# one color and one horizontal offset per data type
color = {'type1': 'b', 'type2': 'g'}
offset = {'type1': -0.2, 'type2': 0}

plt.figure(1).clf()
for s, y, m in zip(score, years, methods_dtype):
    x = y + offset[m]  # shift the bar sideways according to its data type
    plt.bar(x, s, color=color[m], width=0.2)
plt.xticks([2015, 2016, 2017], [2015, 2016, 2017])
plt.show()
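
The hard-coded offset dictionary only covers two data types. As a minimal sketch, the offsets could instead be derived from the list of types so that each group of bars is centered on its year tick. `centered_offsets` is a hypothetical helper name, not part of matplotlib, and like the code above it still assumes at most one bar per type per year.

```python
def centered_offsets(dtypes, width=0.2):
    """Map each data type to an x offset so the whole group is centered on the tick."""
    n = len(dtypes)
    # slot k of n is shifted by (k - (n - 1) / 2) bar widths
    return {t: (k - (n - 1) / 2) * width for k, t in enumerate(dtypes)}


# Hypothetical usage with three types instead of two
offset = centered_offsets(['type1', 'type2', 'type3'])
print(offset)
```

The resulting dictionary can replace the `offset` dictionary in the loop, provided `width` matches the width passed to `plt.bar`.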

edited Sep 1, 2017 at 5:00

answered Sep 1, 2017 at 1:05


1 Comment

itzik Ben Shabat Over a year ago

This is very helpful, but there is still a problem with the offset. There may be cases where the same year has more than one method of a single type, so changing methods_dtype = ['type1', 'type2', 'type1']*2 doesn't give the desired output.

itzik Ben Shabat · Accepted Answer · 2017-09-01 07:53:44Z

I eventually solved it and am posting the solution here for future reference:

It was inspired by the answer by Julien.

What I did was plot each bar separately while keeping track of the spacing using a 2D array of years and data types. I also made it prettier.

import numpy as np
import matplotlib.pyplot as plt

def autolabel(rects):
    """
    Attach a text label above each bar displaying its height
    """
    for rect in rects:
        height = rect.get_height()
        ax.text(rect.get_x() + rect.get_width()/2., height,
                '%d' % int(height),
                ha='center', va='bottom')

methods = [{"name": 'name1', "score": 89.2, "year": 2016, "dtype": 'type1'},
           {"name": 'name2', "score": 95.54, "year": 2017, "dtype": 'type2'},
           {"name": 'name3', "score": 85, "year": 2016, "dtype": 'type1'},
           {"name": 'name4', "score": 86, "year": 2015, "dtype": 'type1'},
           ]

pub_years = np.unique([method["year"] for method in methods])
dtypes =  np.unique([method["dtype"] for method in methods])
n_years =  len(pub_years)
n_dtypes = len(dtypes)
offset = np.zeros([n_years, n_dtypes])
width = 0.2
spacing = 0.01
color_list = plt.cm.Set3(np.linspace(0, 1, n_dtypes))
# colors = {'type1':'b', 'type2':'g'}
colors = {dtype: color_list[i] for i, dtype in enumerate(dtypes)}

legend_bars = {}  # first bar drawn for each data type, used as its legend handle
fig, ax = plt.subplots()
for m in methods:
    i = int(np.where(pub_years == m['year'])[0][0])  # row index of this year
    j = int(np.where(dtypes == m['dtype'])[0][0])    # column index of this data type
    x = m["year"] + offset[i][j]
    rect = ax.bar(x, m['score'], color=colors[m['dtype']], width=width)
    autolabel(rect)
    # keep exactly one handle per data type so the legend matches its labels
    legend_bars.setdefault(m['dtype'], rect)
    # shift the next bar with the same year and data type to the right
    offset[i][j] = offset[i][j] + width + spacing


# add some text for labels, title and axes ticks
ax.set_ylabel('Accuracy')
ax.set_xlabel('Year of Publication')
ax.set_yticks(np.arange(0,105,5))
ax.set_ylim([0, 105])
ax.set_xticks(pub_years)
ax.set_xticklabels(pub_years)
ax.set_xlim([np.min(pub_years)- 1, np.max(pub_years) + 1])
ax.legend([legend_bars[d] for d in dtypes], list(dtypes))

plt.show()
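
One limitation worth noting: because the offset is tracked per (year, data type) pair, two methods of different types published in the same year both start at offset 0 and are drawn on top of each other. A minimal sketch of an alternative, assuming a single slot counter per year so that every bar in a year gets its own position regardless of type. `bar_positions` and `plot_methods` are hypothetical helper names, and the sample data is the one from the question.

```python
from collections import Counter


def bar_positions(methods, width=0.2, spacing=0.01):
    """Return one x position per method; bars sharing a year sit side by side."""
    per_year = Counter(m["year"] for m in methods)  # number of bars in each year
    used = Counter()                                # slots already filled per year
    step = width + spacing
    xs = []
    for m in methods:
        n, k = per_year[m["year"]], used[m["year"]]
        # center the group of n bars on the year tick
        xs.append(m["year"] + (k - (n - 1) / 2) * step)
        used[m["year"]] += 1
    return xs


def plot_methods(methods, width=0.2, spacing=0.01):
    """Draw the bars colored by data type, with one legend entry per type."""
    import matplotlib.pyplot as plt  # imported lazily; bar_positions does not need it

    dtypes = sorted({m["dtype"] for m in methods})
    cmap = plt.get_cmap("Set3")
    colors = {d: cmap(i) for i, d in enumerate(dtypes)}
    fig, ax = plt.subplots()
    handles = {}
    for m, x in zip(methods, bar_positions(methods, width, spacing)):
        bar = ax.bar(x, m["score"], width=width, color=colors[m["dtype"]])
        handles.setdefault(m["dtype"], bar)  # keep the first bar of each type
    ax.set_xticks(sorted({m["year"] for m in methods}))
    ax.legend([handles[d] for d in dtypes], dtypes)
    return fig, ax


methods = [
    {"name": "name1", "score": 89.2, "year": 2016, "dtype": "type1"},
    {"name": "name2", "score": 95.54, "year": 2017, "dtype": "type2"},
    {"name": "name3", "score": 85, "year": 2016, "dtype": "type1"},
    {"name": "name4", "score": 86, "year": 2015, "dtype": "type1"},
]
xs = bar_positions(methods)
```

Calling `plot_methods(methods)` followed by `plt.show()` would display the figure. The position logic is kept in a separate function so it can be checked without opening a plot window.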
