library(ggplot2)

orderX <- c("A" = 1, "B" = 2, "C" = 3)
y <- rnorm(20)
x <- as.character(1:20)
group <- c(rep("A", 5), rep("B", 7), rep("C", 5), rep("A", 3))
df <- data.frame(x, y, group)
df$lvls <- as.numeric(orderX[df$group])

ggplot(data = df, aes(x=reorder(df$x, df$lvls), y=y)) + 
geom_point(aes(colour = group)) + 
geom_line(stat = "hline", yintercept = "mean", aes(colour = group))

I want to create a graph like this: [image: graph with averages for each group]

This works when I do not need to reorder the values of x. However, as soon as I use reorder, it no longer works.

  • I think your use of reorder is mistaken here, since it will just reorder X, not groups or Y. This will plot the wrong x with the wrong y! Commented Nov 22, 2010 at 11:41
  • Unless X doesn't mean anything but index, in which case, don't use it in the plot (use jitter instead?) Commented Nov 22, 2010 at 11:53
  • Then my use of reorder is mistaken. In my real data the values on x are labels for each individual measurement, which I do want to see. The ordering of these labels within the groups does not matter. Commented Nov 22, 2010 at 12:20
  • Maybe another reason why it does not work in my case is that my x-values are not numeric but character. Commented Nov 22, 2010 at 12:51
  • +1 for a concise question, with sample data and a picture. I'd give +1 for each of those if I could. Commented Nov 22, 2010 at 15:24

2 Answers

From your question, I don't think df$x is relevant to your data at all, especially if you can reorder it. How about just using group as x, and jittering the actual x position to separate the points:

ggplot(data=df, aes(x=group,y=y,color=group)) + geom_point() +
geom_jitter(position = position_jitter(width = 0.4)) +
geom_errorbar(stat = "hline", yintercept = "mean",
  width=0.8,aes(ymax=..y..,ymin=..y..))

I have used errorbar instead of hline (and collapsed ymax and ymin to y) since hline is complex. If anyone has a better solution to that part, I'd love to see it.
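For readers on a current ggplot2, here is a minimal sketch of the same idea without stat = "hline". It assumes ggplot2 3.3.0 or later (where stat_summary takes fun, fun.min and fun.max); the seed and the rebuilt df are only there to make the sketch self-contained and are not part of the original question.

```r
library(ggplot2)

# Rebuild the example data from the question (seed is an assumption,
# added only so the sketch is reproducible).
set.seed(1)
df <- data.frame(
  x = as.character(1:20),
  y = rnorm(20),
  group = c(rep("A", 5), rep("B", 7), rep("C", 5), rep("A", 3))
)

p <- ggplot(df, aes(x = group, y = y, colour = group)) +
  # Jitter horizontally only, so the y values stay exact.
  geom_jitter(width = 0.4, height = 0) +
  # An error bar whose top and bottom both sit at the group mean
  # collapses into a single horizontal line per group.
  stat_summary(fun = mean, fun.min = mean, fun.max = mean,
               geom = "errorbar", width = 0.8)

print(p)
```

Because x is the group here, stat_summary computes one mean per group, which is what gives one line per group rather than one per point.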

update

If you want to preserve the order of x, try this solution (with x converted to a factor):

df$x = factor(df$x)

ggplot(data = df, aes(x, y, group=group)) + 
facet_grid(.~group,space="free",scales="free_x") + 
geom_point() + 
geom_line(stat = "hline", yintercept = "mean")
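A sketch of the faceted variant that should also run on ggplot2 2.x and later, under the assumption that precomputing the group means is acceptable. The names means and average are introduced here for illustration, and the seed is only for reproducibility.

```r
library(ggplot2)

# Rebuild the example data from the question (seed is an assumption).
set.seed(1)
df <- data.frame(
  x = as.character(1:20),
  y = rnorm(20),
  group = c(rep("A", 5), rep("B", 7), rep("C", 5), rep("A", 3))
)
# Keep the labels in their original 1..20 order rather than alphabetical.
df$x <- factor(df$x, levels = as.character(1:20))

# One row per group holding that group's mean of y.
means <- aggregate(y ~ group, data = df, FUN = mean)
names(means)[names(means) == "y"] <- "average"

p <- ggplot(df, aes(x, y)) +
  facet_grid(. ~ group, space = "free", scales = "free_x") +
  geom_point() +
  # Because means has a group column, each line is drawn only in
  # the facet of its own group.
  geom_hline(data = means, aes(yintercept = average))

print(p)
```

The line spans the full width of each facet, so the begin and end labels of each group do not need to be computed.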


6 Comments

This is indeed almost what I want, however, I do want to be able to see the original x-values on the x-scale.
When you do the re-order above, your data gets mixed up. You should sort on the original data frame, not just the x values. Do you want the x values interleaved in your chart? If they are, where do you want to place the mean values?
where did you find the documentation on geom_line(stat="hline", yintercept="mean")? That's really cool and I haven't seen it before.
I actually can't remember, will look it up tomorrow on my machine at work. Must be somewhere in the browser history. :)
This is where I found that: learnr.wordpress.com/2009/07/02/…

As of ggplot2 2.x this approach is unfortunately broken.

The following code provides exactly what I wanted, with some extra calculations up front:

library(ggplot2)
library(data.table)

orderX <- c("A" = 1, "B" = 2, "C" = 3)
y <- rnorm(20)
x <- as.character(1:20)
group <- c(rep("A", 5), rep("B", 7), rep("C", 5), rep("A", 3))
dt <- data.table(x, y, group)
dt[, lvls := as.numeric(orderX[group])]
dt[, average := mean(y), by = group]
dt[, x := reorder(x, lvls)]
dt[, xbegin := names(which(attr(dt$x, "scores") == unique(lvls)))[1], by = group]
dt[, xend := names(which(attr(dt$x, "scores") == unique(lvls)))[length(x)], by = group]

ggplot(data = dt, aes(x=x, y=y)) + 
    geom_point(aes(colour = group)) +
    facet_grid(.~group,space="free",scales="free_x") + 
    geom_segment(aes(x = xbegin, xend = xend, y = average, yend = average, group = group, colour = group))

[Image: the resulting plot]

2 Comments

I'm not sure whether this will help in your exact situation, but the new solution I found with ggplot2 v2.1.0 for a similar problem is stat_summary(fun.y = "mean", fun.ymin = "mean", fun.ymax= "mean", size= 0.3, geom = "crossbar").
I tried that, but it creates a horizontal line per item on the x-axis, because the x-axis is discrete.
