13

I generate a barplot with geom_col() with two classes separated by color. Then I try to add a mean line for each class.

Here is what I'd like to get:

Desired output

But with the code below, the mean is computed for each bar independently, no matter what I pass to the group aesthetic.

Here is a reproducible example:

library(tidyverse)

df = data.frame(
  x = 1:10,
  y = runif(10),
  class = sample(c("a","b"), 10, replace = TRUE) %>% factor()
) %>% 
  # order the bars by class, then by decreasing y within each class
  mutate(x = factor(x, levels = x[order(class, -y)]))

# note: newer ggplot2 versions deprecate fun.y and ..y.. in favor of fun and after_stat(y)
ggplot(df, aes(x, y, fill = class)) +
  geom_col() +
  stat_summary(fun.y = mean, geom = "errorbar", 
               aes(ymax = ..y.., ymin = ..y.., group = class),
               width = 1, linetype = "solid")

What I get

Please tell me what I'm doing wrong, or whether there is another way (with ggplot) to achieve this.

4 Answers

17

I combined the solution from @bouncyball with my original approach using `geom_errorbar`.

Here is the code:

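# add the class mean to every row
df.mean = df %>% 
  group_by(class) %>% 
  mutate(ymean = mean(y))

ggplot(df, aes(x, y, fill = class)) +
  geom_col() +
  # zero-height error bars (ymin == ymax) draw a horizontal line across each bar
  # (ggplot2 >= 3.4.0 prefers linewidth over size for lines)
  geom_errorbar(data = df.mean, aes(x, ymax = ymean, ymin = ymean),
                size = 0.5, linetype = "longdash", inherit.aes = FALSE, width = 1)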


The only drawback is that, instead of a single line per class, this approach generates many separate line objects (one per bar), which is visible when editing the plot in, for example, Adobe Illustrator. But I can live with it.
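
The number of line objects can be checked by counting rows. This is a minimal sketch, assuming a fixed seed and fixed class labels (both are my additions, used only to keep the example deterministic): `mutate()` keeps one row per bar, so `geom_errorbar` gets one object per bar, whereas `summarise()` would keep one row per class.

```r
library(dplyr)

set.seed(42)  # assumption: seed added only for reproducibility
df <- data.frame(
  x = 1:10,
  y = runif(10),
  class = factor(rep(c("a", "b"), each = 5))  # assumption: fixed classes instead of sample()
)

# mutate() repeats the class mean on every row
per_bar <- df %>%
  group_by(class) %>%
  mutate(ymean = mean(y)) %>%
  ungroup()

# summarise() collapses each class to a single row
per_class <- df %>%
  group_by(class) %>%
  summarise(ymean = mean(y))

nrow(per_bar)    # one row per bar
nrow(per_class)  # one row per class
```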

UPDATE

Another solution, simpler and without the above problem, again based on the code from @bouncyball:

# one row per class: the mean plus the first and last bar of the class
df.mean = df %>% 
  group_by(class) %>% 
  summarise(ymean = mean(y), x1 = x[which.min(x)], x2 = x[which.max(x)]) %>% 
  ungroup()

ggplot(df) +
  geom_col(aes(x, y, fill = class)) +
  # extend the segment by 0.5 on each side so that it covers the outer bars
  geom_segment(data = df.mean,
               aes(x = as.integer(x1) - 0.5, xend = as.integer(x2) + 0.5,
                   y = ymean, yend = ymean),
               size = 1, linetype = "longdash", inherit.aes = FALSE)
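
The segment endpoints rely on `as.integer()` returning the position of a factor level, which is where ggplot places that level on a discrete axis, and not the number printed in its label. A small sketch with a made-up factor (the values are illustrative only) shows the difference:

```r
# a factor whose level order differs from the numeric order of its labels
x <- factor(c(3, 1, 2), levels = c(2, 3, 1))

pos <- as.integer(x)                # positions of the levels on a discrete axis
val <- as.integer(as.character(x))  # the labels converted back to numbers

pos
val
```

Here `pos` is what the `- 0.5` and `+ 0.5` offsets should be applied to; using `val` would put the segment ends at the wrong bars whenever the levels are reordered.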

3

Create a new data.frame (adding a group mean) and do some manipulations on it (using top_n and cbind), then use those to supply the necessary aesthetics to geom_segment:

# add group mean
df_m <- df %>%
  group_by(class) %>%
  mutate(my = mean(y)) %>%
  arrange(class) # added from comment by @Yuk

# select top and bottom x for each class group
# use cbind to keep one row per group
df_m2 <- df_m %>%
  top_n(1, x) %>%
  cbind(top_n(df_m, -1, x))

ggplot(df) +
  geom_col(aes(x, y, fill=class))+
  geom_segment(data = df_m2,
               aes(x = x, xend = x1,
                   y = my, yend = my1,
                   group = class))
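
A possible variant of the code above joins the two endpoint tables by class instead of binding them side by side, so the result does not depend on row order. This is only a sketch: it assumes dplyr >= 1.0.0 for `slice_min()` and `slice_max()`, and the seed, the fixed class labels, and the names `seg_start`, `seg_end`, `x_start` and `x_end` are my own.

```r
library(dplyr)
library(ggplot2)

set.seed(1)  # assumption: seed added only for reproducibility
df <- data.frame(
  x = 1:10,
  y = runif(10),
  class = factor(rep(c("a", "b"), times = c(4, 6)))  # assumption: fixed classes
) %>%
  mutate(x = factor(x, levels = x[order(class, -y)]))

# add group mean
df_m <- df %>%
  group_by(class) %>%
  mutate(my = mean(y))

# first and last bar of each class, by position on the axis
seg_start <- df_m %>%
  slice_min(as.integer(x), n = 1) %>%
  ungroup() %>%
  select(class, x_start = x, my)

seg_end <- df_m %>%
  slice_max(as.integer(x), n = 1) %>%
  ungroup() %>%
  select(class, x_end = x)

# match the endpoints by class rather than by row position
df_m2 <- inner_join(seg_start, seg_end, by = "class")

p <- ggplot(df) +
  geom_col(aes(x, y, fill = class)) +
  geom_segment(data = df_m2,
               aes(x = x_start, xend = x_end, y = my, yend = my))
# print(p) to render the plot
```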


2 Comments

Thank you! Nice solution! Two issues. First, top_n can pick up the classes in a different order and cause an incorrect result from cbind; you need to add dm %>% arrange(class) %>%. Second, is there a way to extend the lines so that they cover the whole bars on the left and right?

@yuk you're right about arrange, it would probably be safer to use inner_join instead of top_n...

1

With your existing ggplot, try this code:

+
geom_hline(data = [*name of data frame*], aes(yintercept = mean(*name of the variable*)), color = "red")
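
For comparison, here is a sketch of what `geom_hline` does when it is given one mean per class. The data, the seed, and the name `class_means` are assumptions made for illustration. Each line spans the full width of the plot rather than only the bars of its own class.

```r
library(dplyr)
library(ggplot2)

set.seed(1)  # assumption: seed added only for reproducibility
df <- data.frame(
  x = factor(1:10),
  y = runif(10),
  class = factor(rep(c("a", "b"), each = 5))  # assumption: fixed classes
)

# one mean per class
class_means <- df %>%
  group_by(class) %>%
  summarise(ymean = mean(y))

p <- ggplot(df, aes(x, y, fill = class)) +
  geom_col() +
  # one horizontal line per class, each drawn across the whole panel
  geom_hline(data = class_means,
             aes(yintercept = ymean, color = class),
             linetype = "longdash")
# print(p) to render the plot
```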

4 Comments

Have you actually tried this code? geom_hline draws the line for each class across the whole plot. I've tested it first. Sorry, it's not a solution.
@yuk, actually it worked for me. I have suggested edits in the code to make it more explicit.
@Tiny_hopper: I gave it another try, thinking ggplot2 might have updated its behavior. But no, it's the same. If it works for you, can you please post the whole code to reproduce it? Maybe as another answer with the resulting figure.

@yuk see the answer below.

0

I am adding this as an answer as the previous answer given by @Ryan seems to be a partial answer and does not contain the whole code chunk as requested by @yuk.

If df2 is your dataframe that contains site and spCount_site columns as used in the code below:

library(ggplot2)
# draw a black horizontal line at the mean of the `spCount_site` column
p <- ggplot(data = df2, aes(x = site, y = spCount_site)) +
  geom_bar(stat = "identity", fill = rainbow(nrow(df2))) +
  geom_hline(yintercept = mean(df2$spCount_site), color = "black")
p

I created the image below with the code above, using my own data:

enter image description here

2 Comments

See, this is exactly what I was saying - you can get only one horizontal line for whole dataset, not for individual groups.
@yuk This code chunk is meant for a single horizontal line; if you need multiple lines for multiple groups, then the first answer (at the top) should work. It is just a matter of adapting your code.
