
I want to add annotations such as n=5, n=4 with the number of data points in each boxplot at the top edge of my geom_boxplot plot.

I am aware I can do this with geom_text by precomputing the counts, but it seems that ggplot2, with all its wonderful binning and summarizing functionality, ought to be able to do this itself?

Let's assume we have these data:

library(tidyverse)

dd = tribble(
    ~val, ~kind,
    1,    'A',
    3,    'A',
    5,    'A',
    5,    'A',
    6,    'A',
    3,    'B',
    4,    'B',
    4,    'B',
    5,    'B'
)

I have tried this:

> base = ggplot(dd, aes(x=kind, y=val)) + geom_boxplot()
> base + geom_text(y=6, label=..count.., stat='count')

Error in layer(data = data, mapping = mapping, stat = stat, geom = GeomText,  : 
  object '..count..' not found

Presumably, geom_text has simply ignored my stat parameter?

Next, I tried this:

> base + stat_count(aes(y=6, label=..count..), geom='text')

Error: stat_count() must not be used with a y aesthetic.

Shouldn't it be my own problem whether I can do anything useful with the resulting ..count.., "y aesthetic" or not?

Both of these attempts appear sensible to me.
Can anybody explain conceptually why ggplot2 does not accept these commands?
And whether there is any approach with ggplot2-supplied counting that will work?

  • You can get relatively close with your stat_count example if you don't inherit the y aesthetic. The text will then be plotted at the count value. You can move the labels via position_nudge or y = ..count.. - somevalue but, of course, that means the labels will not line up if the groups have different counts:

        stat_count(
            aes(x = kind, label = paste0("n = ", ..count..)),
            geom = "text", position = position_nudge(y = -2),
            inherit.aes = FALSE
        )

    Commented Dec 12, 2017 at 16:43

1 Answer


This is a design limitation of ggplot2. If Hadley rewrote it now he'd probably implement it differently. Conceptually, you'd want to have two separate mappings, one for the stat and one for the geom. However, ggplot2 doesn't work that way. It only has one set of mappings that for the most part is applied to both the stat and the geom. There's a bit of a workaround in that you can use ..variable.. to refer in the geom to variables calculated in the stat, but the mappings are still all thrown together.

There is no functionality currently that allows you to specify that the y aesthetic is only meant for geom_text and that stat_count should ignore it.
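One commonly used workaround, sketched here rather than taken from the answer itself, sidesteps stat_count() entirely: stat_summary() passes each group's y values to a user-supplied function, which can return both a fixed label position and the label text. The helper name count_label and the position 6.5 are assumptions chosen for the question's example data.

```r
library(ggplot2)

# Same data as in the question, built with base R.
dd <- data.frame(
  val  = c(1, 3, 5, 5, 6, 3, 4, 4, 5),
  kind = c("A", "A", "A", "A", "A", "B", "B", "B", "B")
)

# Hypothetical helper: receives the y values of one group and returns
# a one-row data frame holding the label position and the label text.
count_label <- function(y, at = 6.5) {
  data.frame(y = at, label = paste0("n=", length(y)))
}

p_counts <- ggplot(dd, aes(x = kind, y = val)) +
  geom_boxplot() +
  stat_summary(fun.data = count_label, geom = "text")
```

The counting is still done by user code (length()), so this is only partly "ggplot2-supplied", but the grouping and the placement are handled by the stat and no precomputed data frame is needed.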

Another scenario where this comes up all the time is vertical or horizontal versions of stats that otherwise are horizontal or vertical. There's an entire package for that, ggstance. Conceptually, needing a separate package for this doesn't make much sense. Why can't I calculate a density using stat_density(), and then map the "x" variable of the density curve (i.e., the variable the density is calculated over) to the y aesthetic and the "y" variable (i.e., the height of the density) to the x aesthetic? Instead, I need to use stat_xdensity(), which is identical to stat_density() except that it swaps x and y.

I've been thinking that it might be possible to extend ggplot2 without breaking it by adding a separate layer()-type function that takes two aesthetics arguments, one for the stat and one for the geom. I.e., something like:

layer2(aes_geom(y = ..x.., x = ..y..),
       aes_stat(x = variable),
       geom = "line", stat = "density")

(This would draw a vertical density line, similar to the outline of half a violin plot.)
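For this particular axis-swap case, more recent ggplot2 releases (3.3.0 and later, as far as I know) let many stats and geoms infer their orientation from the aesthetics supplied, so mapping the variable to y alone gives a vertical density without ggstance. A minimal sketch, assuming such a version is installed and using iris as stand-in data:

```r
library(ggplot2)

# Only y is mapped, so the density is computed over Sepal.Length
# and its height is drawn along the x axis.
p_vdens <- ggplot(iris, aes(y = Sepal.Length)) +
  geom_density()
```

This does not provide the separate stat and geom mappings proposed above; it only covers swapping the two axes.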

One other non-intuitive limitation we often run into is that calculations in the aes transformations don't respect the data grouping. For example, let's say we want to mark the median line of boxplots with a red dot. We might try:

ggplot(iris, aes(x = Species, y = Sepal.Length)) + 
  geom_boxplot() +
  geom_point(aes(x = Species, y = median(Sepal.Length)), size = 3, color = "red")

In the resulting plot, all the red dots sit at the same height: the median is calculated over the entire data column, not separately by species.
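One way to get per-group medians, offered as a sketch rather than as part of the original answer, is to let a stat do the summarizing, since stats operate on each group separately. The fun argument is assumed to be available (ggplot2 3.3.0 or later; older versions used fun.y instead):

```r
library(ggplot2)

# stat_summary() applies median() to the y values of each Species
# separately and draws one red point per group.
p_median <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot() +
  stat_summary(fun = median, geom = "point", size = 3, color = "red")
```

Each red dot now lands on the median line of its own boxplot.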
