
Given this data frame:

import numpy as np
import pandas as pd

xlabel = list('xxxxxxyyyyyyzzzzzz')
fill = list('abc' * 6)
val = np.random.rand(18)
df = pd.DataFrame({'xlabel': xlabel, 'fill': fill, 'val': val})

This is what I'm aiming at: http://matplotlib.org/mpl_examples/pylab_examples/barchart_demo.png

Applied to my example, Group would be x, y and z, Gender would be a, b and c, and Scores would be val.

I'm aware that pandas' plotting integration with matplotlib is still a work in progress, so is it possible to do this directly in matplotlib?

  • Not sure what you mean by work in progress: pandas.pydata.org/pandas-docs/stable/…. That said, I'm not sure how to do the quantile ticks (someone will though) :) Commented Sep 3, 2013 at 18:24
  • @Andy Hayden: I was under the impression that not all matplotlib functionalities were working in pandas yet, given the Note at the top of the page you mentioned Commented Sep 3, 2013 at 18:46
  • You didn't need the quantile ticks? Commented Sep 3, 2013 at 19:25
  • @Andy Hayden: I posted an answer including the error bars, although I'm sure there's a better way to do it. Commented Sep 5, 2013 at 18:01

2 Answers

Is this what you want?

df.groupby(['fill', 'xlabel']).mean().unstack().plot(kind='bar')

or

df.pivot_table(index='fill', columns='xlabel', values='val').plot(kind='bar')

You can break it apart and fiddle with the labels, columns and title, but I think this basically gives you the plot you wanted.

For the error bars, you currently have to use matplotlib directly.

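import matplotlib.pyplot as plt

# Mean and standard deviation of val for each (fill, xlabel) pair
mean_df = df.pivot_table(index='fill', columns='xlabel',
                         values='val', aggfunc='mean')
err_df = df.pivot_table(index='fill', columns='xlabel',
                        values='val', aggfunc='std')

rows = len(mean_df)
cols = len(mean_df.columns)
ind = np.arange(rows)
width = 0.8 / cols  # each group of bars spans 0.8 of a unit
colors = 'grb'

fig, ax = plt.subplots()
for i, col in enumerate(mean_df.columns):
    # align='edge' so that ind + i * width is the left edge of each bar
    ax.bar(ind + i * width, mean_df[col], width=width, align='edge',
           color=colors[i], yerr=err_df[col], label=col)

# Centre each tick under its group of bars
ax.set_xticks(ind + cols / 2.0 * width)
ax.set_xticklabels(mean_df.index)
ax.legend()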

An enhancement for this is planned, probably for pandas 0.13: see issue 3796.
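
For readers on a more recent pandas, here is a minimal sketch of the same plot without the manual loop. It assumes a pandas version whose DataFrame.plot accepts a yerr argument and whose pivot_table takes index= and columns=. The seeded random generator, the Agg backend and the capsize value are choices made for this illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import numpy as np
import pandas as pd

# Same layout as the data frame in the question, with a seeded generator
xlabel = list('xxxxxxyyyyyyzzzzzz')
fill = list('abc' * 6)
rng = np.random.default_rng(0)
df = pd.DataFrame({'xlabel': xlabel, 'fill': fill, 'val': rng.random(18)})

# Mean and standard deviation of val for each (fill, xlabel) pair
mean_df = df.pivot_table(index='fill', columns='xlabel',
                         values='val', aggfunc='mean')
err_df = df.pivot_table(index='fill', columns='xlabel',
                        values='val', aggfunc='std')

# Passing a DataFrame as yerr matches the error bars to the columns by name
ax = mean_df.plot(kind='bar', yerr=err_df, capsize=3, rot=0)
ax.set_ylabel('val')
```

Swapping index and columns in both pivot_table calls would put x, y and z on the axis and a, b and c in the legend instead.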

2 Comments

Is it possible to extend your code by including the error bars? I've posted an answer with a possible solution, but there must be a better way.
@user2635863 For the error bars you'll have to use the matplotlib library directly. I updated the answer.

This was the only solution I found for displaying the error bars:

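import matplotlib.pyplot as plt
from scipy import stats

# Mean of val for each (fill, xlabel) pair, one column per xlabel
means = df.groupby(['fill', 'xlabel']).mean().unstack()
x_mean, y_mean, z_mean = means.val.x, means.val.y, means.val.z

# Standard error of the mean, laid out the same way
sems = df.groupby(['fill', 'xlabel']).aggregate(stats.sem).unstack()
x_sem, y_sem, z_sem = sems.val.x, sems.val.y, sems.val.z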

ind = np.array([0,1.5,3])
fig, ax = plt.subplots()
width = 0.35
bar_x = ax.bar(ind, x_mean, width, color='r', yerr=x_sem, ecolor='r')
bar_y = ax.bar(ind+width, y_mean, width, color='g', yerr=y_sem, ecolor='g')
bar_z = ax.bar(ind+width*2, z_mean, width, color='b', yerr=z_sem, ecolor='b')

ax.legend((bar_x[0], bar_y[0], bar_z[0]), ('X','Y','Z'))

I'd be happy to see a neater approach to tackle the problem though, possibly as an extension of Viktor Kerkez's answer.
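
One possible way to tidy this up is to put the loop from the other answer into a small helper, so nothing is hard-coded per column. This is only a sketch: the helper name grouped_bar and its total_width parameter are made up for this example, the data is generated with a seeded generator, and the standard error comes from pandas' own groupby sem() rather than scipy.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

xlabel = list('xxxxxxyyyyyyzzzzzz')
fill = list('abc' * 6)
rng = np.random.default_rng(0)
df = pd.DataFrame({'xlabel': xlabel, 'fill': fill, 'val': rng.random(18)})

# Means and standard errors, one row per fill and one column per xlabel
grouped = df.groupby(['fill', 'xlabel'])['val']
means = grouped.mean().unstack()
sems = grouped.sem().unstack()


def grouped_bar(ax, means, errors, total_width=0.8):
    """Hypothetical helper: draw one bar per column, grouped by row."""
    n_groups, n_series = means.shape
    width = total_width / n_series
    ind = np.arange(n_groups)
    for i, col in enumerate(means.columns):
        # align='edge' makes ind + i * width the left edge of each bar
        ax.bar(ind + i * width, means[col], width=width, align='edge',
               yerr=errors[col], label=str(col))
    # Put each tick in the middle of its group of bars
    ax.set_xticks(ind + total_width / 2)
    ax.set_xticklabels(means.index)
    ax.legend()
    return ax


fig, ax = plt.subplots()
grouped_bar(ax, means, sems)
```

Colours are left to matplotlib's default cycle here, so the helper works for any number of columns.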
