Following up on my previous question, "Sorting datetime objects by hour to a pandas dataframe then visualize to histogram":

I need to plot 3 bars for each X-axis value, representing viewer counts. Right now the plot shows two bars per hour: viewers who watched for under one minute and viewers who watched for longer. I need a third bar showing the overall viewer count. I have the DataFrame, but I can't get the bars to look right. With just two bars I have no problem; the plot looks just the way I want it.

The relevant part of the code for this:

# Time and date stamp variables
allviews = int(df['time'].dt.hour.count())
date = str(df['date'][0].date())
hours = df_hist_short.index.tolist()
hours[:] = [str(x) + ':00' for x in hours]

The hours variable that I use for the X-axis may be the problem, since I convert it to strings so that the hours look like 23:00 instead of the bare pandas index output (23, etc.). I have seen examples where people add or subtract values from X to change the bars' positions.

fig, ax = plt.subplots(figsize=(20, 5))
short_viewers = ax.bar(hours, df_hist_short['time'], width=-0.35, align='edge')
long_viewers = ax.bar(hours, df_hist_long['time'], width=0.35, align='edge')

For now I set align='edge' and give the two bars the same width with opposite signs (-0.35 and 0.35), so they sit on either side of each tick. But I have no idea how to make this work with 3 bars. I didn't find any positioning arguments for the bars. I have also tried plt.hist(), but I couldn't get the same output as with plt.bar().

As a result, I would like a third bar on the left side of each group in the graph, a bit wider than the other two.

  • "I didn't find any positioning arguments for the bars." - this is because you have complete control over the positions of the bars through the first argument (your hours). This seems like a weird hassle compared with something like Excel, until you try to create a bar chart with uneven spacing and unequal bar widths in Excel :/ Commented May 10, 2019 at 13:05

2 Answers

In pure matplotlib, instead of using the width parameter to position the bars as you've done, you can adjust the x-values for your plot:

import numpy as np
import matplotlib.pyplot as plt

# Make some fake data:
n_series = 3
n_observations = 5
x = np.arange(n_observations)
data = np.random.random((n_observations,n_series))


# Plotting:

fig, ax = plt.subplots(figsize=(20,5))

# Determine bar widths
width_cluster = 0.7
width_bar = width_cluster/n_series

for n in range(n_series):
    x_positions = x+(width_bar*n)-width_cluster/2
    ax.bar(x_positions, data[:,n], width_bar, align='edge')

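The same idea can be adapted to the question's layout, where the third bar is wider and the ticks should read like 23:00. This is only a sketch: the viewer counts, the variable names and the two widths are made up for illustration, and the bars are positioned on numeric x-values while the hour strings are applied afterwards as tick labels.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical viewer counts per hour (illustrative numbers, not the asker's data)
hours = [20, 21, 22, 23]
short_views = np.array([5, 8, 12, 7])
long_views = np.array([3, 6, 9, 4])
all_views = short_views + long_views

x = np.arange(len(hours))            # numeric bar positions, one per hour
labels = [f"{h}:00" for h in hours]  # strings are used only as tick labels

width_all = 0.3   # assumed width of the overall bar (slightly wider)
width_sub = 0.2   # assumed width of the short and long bars

fig, ax = plt.subplots(figsize=(20, 5))

# Left edge of each cluster, chosen so the three bars are centred on the tick
left = x - (width_all + 2 * width_sub) / 2

bars_all = ax.bar(left, all_views, width_all, align='edge', label='all')
bars_short = ax.bar(left + width_all, short_views, width_sub,
                    align='edge', label='short')
bars_long = ax.bar(left + width_all + width_sub, long_views, width_sub,
                   align='edge', label='long')

# Put the ticks on the numeric positions, then relabel them as hours
ax.set_xticks(x)
ax.set_xticklabels(labels)
ax.legend()
```

Call plt.show() afterwards to display the figure. Keeping the positions numeric is what makes the offsets possible; adding a float to a string label such as '23:00' would fail.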

In your particular case, seaborn is probably a good option. You should (almost always) try to keep your data in long form: instead of three separate data frames for short, medium and long, it is much better practice to keep a single data frame and add a column that labels each row as short, medium or long. Use this new column as the hue parameter in seaborn's barplot.
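A minimal sketch of the long-form approach, assuming seaborn is installed. The column names (hour, duration, viewers), the category labels and the numbers are hypothetical and only stand in for the real data.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical long-form data: one row per (hour, duration category) pair
long_df = pd.DataFrame({
    "hour": ["22:00", "22:00", "22:00", "23:00", "23:00", "23:00"],
    "duration": ["all", "short", "long"] * 2,
    "viewers": [21, 12, 9, 11, 7, 4],
})

fig, ax = plt.subplots(figsize=(20, 5))

# hue splits each hour into one bar per duration category;
# seaborn works out the side-by-side positions itself
sns.barplot(data=long_df, x="hour", y="viewers", hue="duration", ax=ax)
```

With this layout, adding another category means adding rows rather than another ax.bar call with hand-computed offsets.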

2 Comments

Thanks Dan for following up on this question too, much appreciated! I just got into the habit of using pandas for literally everything, I'm working on picking up the syntax so I would come up with solutions in dataframes rather than pure python. :)
Pandas is great, just don't split your DataFrames unnecessarily. Seaborn works very well with pandas. Lastly, unless you want a lot of control over the look of the charts, there's probably no reason for you to stick with the pure matplotlib method.

pandas will do this alignment for you, if you make the bar plot in one step rather than two (or three). Consider this example (adapted from the docs to add a third bar for each animal).

import pandas as pd
import matplotlib.pyplot as plt

speed = [0.1, 17.5, 40, 48, 52, 69, 88]
lifespan = [2, 8, 70, 1.5, 25, 12, 28]
height = [1, 5, 20, 3, 30, 6, 10]
index = ['snail', 'pig', 'elephant',
         'rabbit', 'giraffe', 'coyote', 'horse']
df = pd.DataFrame({'speed': speed,
                   'lifespan': lifespan,
                   'height': height}, index=index)
ax = df.plot.bar(rot=0)

plt.show()

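Applied to the viewer counts from the question, the same one-step plot might look like the sketch below. The Series names and numbers are hypothetical stand-ins for the asker's separate frames; the point is only that the three columns end up in one DataFrame before plotting.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical per-hour counts standing in for the separate short/long frames
hours = [21, 22, 23]
short_views = pd.Series([8, 12, 7], index=hours)
long_views = pd.Series([6, 9, 4], index=hours)

# One DataFrame, one column per bar in each group
counts = pd.DataFrame({
    "all": short_views + long_views,
    "short": short_views,
    "long": long_views,
})

# Format the index as hour strings; pandas uses the index for the tick labels
counts.index = [f"{h}:00" for h in counts.index]

ax = counts.plot.bar(rot=0, figsize=(20, 5))
```

Note that DataFrame.plot.bar gives every column the same bar width, so a wider overall bar would still need the manual matplotlib positioning.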

1 Comment

Thanks for pointing me in the right direction! I had actually tried this example from the docs a few times yesterday, but now that you reassured me it was the way to go, I pushed a little harder. The problem was obvious, and this simple and pragmatic solution worked after I converted my dataframe slices to lists.
