5

I have a dataset, e.g.:

outcome <- c(rnorm(500, 45, 10), rnorm(250, 40, 12), rnorm(150, 38, 7), rnorm(1000, 35, 10), rnorm(100, 30, 7))
group <- c(rep("A", 500), rep("B", 250), rep("C", 150), rep("D", 1000), rep("E", 100))
reprex <- data.frame(outcome, group)

I can plot this as a "dynamite" plot with:

graph <- ggplot(reprex, aes(x=group, y=outcome, fill=..y..)) +
  stat_summary(geom = "bar", fun.y = mean) +
  stat_summary(geom = "errorbar", fun.data = mean_cl_normal, width = 0.1)

giving a bar chart of the group means with error bars.

I would also like to add a label beneath each column showing how many observations are in that group, but I can't work out how to do this. I tried:

graph + geom_label (aes(label=paste(..count.., "Obs.", sep=" ")), y=-0.75, size=3.5, color="black", fontface="bold")

which returns

Error in paste(count, "Obs.", sep = " ") : 
  cannot coerce type 'closure' to vector of type 'character'

I've also tried

  graph + stat_summary(aes(label=paste(..y.., "Obs.", sep=" ")), fun.y=count, geom="label")

but this returns:

Error: stat_summary requires the following missing aesthetics: y

I know I can do this by first building a data frame of summary statistics, but then I would have to create a new data frame every time I need a graph. Ideally, I'd like to plot this using stat_summary() on the original dataset.

Does anyone know how to do this?

1
  • graph + geom_label(aes(label=stat(y), group = group), stat = "summary", fun.y = mean) places the values on top of the bars, but it is a start. (Commented Feb 3, 2020 at 15:51)

2 Answers

5

Without creating a new data frame, you can compute the counts on the fly with dplyr by passing a pipeline to the data argument of geom_label(), as follows:

library(dplyr)
library(ggplot2)
ggplot(reprex, aes(x=group, y=outcome, fill=..y..)) +
  stat_summary(geom = "bar", fun.y = mean) +
  stat_summary(geom = "errorbar", fun.data = mean_cl_normal, width = 0.1)+
  geom_label(inherit.aes = FALSE, data = . %>% group_by(group) %>% count(), 
            aes(label = paste0(n, " Obs."), x = group), y = -0.5)
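To see what the `data = . %>% ...` argument is doing: a pipeline that starts with `.` is a magrittr functional sequence, i.e. an ordinary function, and ggplot2 applies a function supplied as a layer's `data` to the plot's own data. A minimal sketch that calls the same pipeline by hand, assuming the `reprex` data frame from the question (the name `count_by_group` is only illustrative):

```r
library(dplyr)

# Rebuild the example data from the question (group sizes are fixed, outcomes are random).
outcome <- c(rnorm(500, 45, 10), rnorm(250, 40, 12), rnorm(150, 38, 7),
             rnorm(1000, 35, 10), rnorm(100, 30, 7))
group <- c(rep("A", 500), rep("B", 250), rep("C", 150), rep("D", 1000), rep("E", 100))
reprex <- data.frame(outcome, group)

# A pipeline starting with `.` defines a function instead of running immediately.
count_by_group <- . %>% group_by(group) %>% count()

# Calling it directly yields the per-group counts (columns `group` and `n`)
# that the geom_label() layer receives as its data.
counts <- count_by_group(reprex)
print(counts)
```

Because the label layer gets only `group` and `n`, it sets `inherit.aes = FALSE`; otherwise it would look for the `outcome` column from the plot-level mapping.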

2

You cannot use stat="count" when a y variable is already declared. I would say the easiest way is to create a small data frame of counts:

library(dplyr)
label_df <- reprex %>% group_by(group) %>% summarise(outcome = mean(outcome), n = n())

Then plot using that:

ggplot(reprex, aes(x=group, y=outcome, fill=..y..)) +
  stat_summary(geom = "bar", fun.y = mean) +
  stat_summary(geom = "errorbar", fun.data = mean_cl_normal, width = 0.1)+
  geom_text(data=label_df,aes(label=paste(n, "Obs.", sep=" ")), size=3.5, color="black", fontface="bold",nudge_y =1)
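If building `label_df` for every plot is the sticking point, one variation (not taken from either answer) is to let `stat_summary()` compute the label itself through `fun.data`. A minimal sketch, assuming ggplot2 3.3.0 or later (where `fun` replaces `fun.y`); `n_label` is a hypothetical helper, and the `y = -0.75` position is borrowed from the question's own attempt:

```r
library(ggplot2)

# Rebuild the example data from the question.
outcome <- c(rnorm(500, 45, 10), rnorm(250, 40, 12), rnorm(150, 38, 7),
             rnorm(1000, 35, 10), rnorm(100, 30, 7))
group <- c(rep("A", 500), rep("B", 250), rep("C", 150), rep("D", 1000), rep("E", 100))
reprex <- data.frame(outcome, group)

# Hypothetical helper: stat_summary() calls it once per group with that group's
# y values. The returned columns supply the label's y position and its text.
n_label <- function(y) {
  data.frame(y = -0.75, label = paste(length(y), "Obs."))
}

p <- ggplot(reprex, aes(x = group, y = outcome)) +
  stat_summary(geom = "bar", fun = mean) +
  stat_summary(geom = "text", fun.data = n_label, size = 3.5, fontface = "bold")
p
```

The sketch leaves out the fill mapping and the error bar layer to stay short; the error bar layer from the question should be addable unchanged.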
