
I want to display a bar plot comparing algorithms from different publications.

The data has the following properties:

  1. Year of publication (this is what I want on the x axis)
  2. Score (this is the bar height)
  3. Data type (this sets the color of each bar)

I am having trouble making this happen (and I haven't gotten to the third requirement yet).

Here is some example data and code:

import numpy as np
import matplotlib.pyplot as plt

dtypes = ['type1', 'type2', 'type3']
names = ['name1','name2','name3', 'name4']
score = [89.2, 95.54, 85, 86]
years = [2016, 2017, 2016, 2015]
methods_dtype = ['type1', 'type2', 'type1', 'type1']
pub_years = np.unique(years)

fig, ax = plt.subplots()
barplot = ax.bar(years, score)
plt.show()

The first problem is that the two bars for 2016 are drawn on top of each other. (I saw some examples that shift the bars incrementally using the width; however, in this case I do not know beforehand how many methods there will be in a given year.)
The second problem is coding the colors.

Note that the input is just a subset of the data. There may be a year with multiple entries (more than one publication for a specific year). There may also be a data type with multiple entries (more than one method that operates on this data type).

  • If you have multiple datapoints for a year, how do you want that displayed? Commented Aug 31, 2017 at 23:14
  • @wwii multiple data points for a year should be displayed as bars which are side by side color coded by their data type property Commented Sep 1, 2017 at 6:20
  • I think the easiest solution would be seaborn.barplot Commented Sep 2, 2017 at 21:48

2 Answers

Here's an example of what you can do; it is up to you to adapt it to your exact needs:

import matplotlib.pyplot as plt

score = range(1, 7)
years = [2015, 2016, 2017]*2
methods_dtype = ['type1', 'type2']*3

# one color and one horizontal offset per data type
color = {'type1': 'b', 'type2': 'g'}
offset = {'type1': -0.2, 'type2': 0}

plt.figure(1).clf()
for s, y, m in zip(score, years, methods_dtype):
    x = y + offset[m]  # shift the bar according to its data type
    plt.bar(x, s, color=color[m], width=0.2)
plt.xticks([2015, 2016, 2017], [2015, 2016, 2017])
plt.show()
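
If a year can contain several methods of the same type, a fixed offset per type will place them on top of each other. One possible workaround, sketched below, is to count how many bars each year has received so far and shift by that count instead. The helper name `running_positions` and its `width` parameter are made up for this illustration; they are not part of matplotlib.

```python
from collections import defaultdict


def running_positions(years, width=0.2):
    """Return one x position per entry, shifting repeated years to the right.

    Hypothetical helper: the k-th bar seen for a given year is placed at
    year + k * width, regardless of its data type.
    """
    seen = defaultdict(int)  # number of bars already placed for each year
    positions = []
    for year in years:
        positions.append(year + seen[year] * width)
        seen[year] += 1
    return positions


years = [2016, 2017, 2016, 2015]
xs = running_positions(years)
```

Each `xs[k]` could then be passed to `plt.bar` in place of `y + offset[m]`, with the color still looked up from the data type.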

1 Comment

This is very helpful, but there is still a problem with the offset. There may be cases where the same year has more than one method of a single type, so changing methods_dtype = ['type1', 'type2', 'type1']*2 doesn't give the desired output.

I eventually solved it and wanted to post the solution here for future reference:

It was inspired by the answer by Julien.

What I did was plot each bar separately while keeping track of the spacing using a 2D array of years and data types. I also made it prettier.

import numpy as np
import matplotlib.pyplot as plt

def autolabel(rects):
    """
    Attach a text label above each bar displaying its height
    """
    for rect in rects:
        height = rect.get_height()
        ax.text(rect.get_x() + rect.get_width()/2., height,
                '%d' % int(height),
                ha='center', va='bottom')

methods = [{"name": 'name1', "score": 89.2, "year": 2016, "dtype": 'type1'},
           {"name": 'name2', "score": 95.54, "year": 2017, "dtype": 'type2'},
           {"name": 'name3', "score": 85, "year": 2016, "dtype": 'type1'},
           {"name": 'name4', "score": 86, "year": 2015, "dtype": 'type1'},
           ]

pub_years = np.unique([method["year"] for method in methods])
dtypes =  np.unique([method["dtype"] for method in methods])
n_years =  len(pub_years)
n_dtypes = len(dtypes)
offset = np.zeros([n_years, n_dtypes])
width = 0.2
spacing = 0.01
color_list = plt.cm.Set3(np.linspace(0, 1, n_dtypes))
# colors = {'type1':'b', 'type2':'g'}
colors = {dtype: color_list[i] for i, dtype in enumerate(dtypes)}

legend_bars = {}  # first bar drawn for each data type, used for the legend
fig, ax = plt.subplots()
for m in methods:
    i = int(np.where(pub_years == m['year'])[0][0])  # row: index of the year
    j = int(np.where(dtypes == m['dtype'])[0][0])    # column: index of the data type
    # Shift by the total width already used in this year (all data types),
    # so that bars of different types in the same year do not overlap either.
    x = m["year"] + offset[i].sum()
    rect = ax.bar(x, m['score'], color=colors[m['dtype']], width=width)
    autolabel(rect)
    legend_bars.setdefault(m['dtype'], rect)
    offset[i][j] = offset[i][j] + width + spacing


# add some text for labels, title and axes ticks
ax.set_ylabel('Accuracy')
ax.set_xlabel('Year of Publication')
ax.set_yticks(np.arange(0,105,5))
ax.set_ylim([0, 105])
ax.set_xticks(pub_years)
ax.set_xticklabels(pub_years)
ax.set_xlim([np.min(pub_years)- 1, np.max(pub_years) + 1])
# one legend entry per data type, in the same order as dtypes
ax.legend([legend_bars[d] for d in dtypes], dtypes)

plt.show()
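
The bars above are laid out starting at the year tick and extending to the right. If you would rather have each group centered on its tick, one option is to count the bars per year first and then spread them symmetrically. This is only a sketch: `centered_positions` is a hypothetical helper, and it assumes the `width` and `spacing` values are small enough that a group stays within one year.

```python
from collections import Counter


def centered_positions(years, width=0.2, spacing=0.01):
    """Return one x position per entry, with each year's bars centered on the year.

    Hypothetical helper, not a matplotlib function.
    """
    per_year = Counter(years)  # total number of bars in each year
    placed = Counter()         # bars already positioned in each year
    step = width + spacing
    positions = []
    for year in years:
        n = per_year[year]
        k = placed[year]
        # slots run from -(n - 1) / 2 to +(n - 1) / 2, in units of `step`
        positions.append(year + (k - (n - 1) / 2) * step)
        placed[year] += 1
    return positions


years = [2016, 2017, 2016, 2015]
xs = centered_positions(years)
```

In the plotting loop, `x` could then be taken from this list instead of from the `offset` array.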
