I would like to add some marginal space between groups of box plots drawn with stat_summary.

Here is a small example of my problem:

library(ggplot2)
library(reshape2)
data1 <- lapply(letters[1:5], function(l1) {
  matrix(rt(5 * 3, 1), nrow = 5, ncol = 3,
         dimnames = list(cat2 = letters[6:10], cat3 = letters[11:13]))
})
names(data1) <- letters[1:5]
data2 <- melt(data1)

# Custom five-number summary: min, mean of the values below the mean,
# mean, mean of the values above the mean, max
customstats <- function(x) {
  m <- mean(x)
  c(ymin = min(x), lower = mean(x[x < m]), middle = m,
    upper = mean(x[x > m]), ymax = max(x))
}

ggplot(data2, aes(x = cat2, y = value, fill = cat3)) +
  stat_summary(fun.data = customstats, geom = "boxplot",
    alpha = 0.5, position = position_dodge(1))

The result is the following picture: [image: boxplots]

I would like to visually separate each "cat2" by adding a "space" between the groups of boxplots. I'm restricted to stat_summary because I have a custom statistic. How can I do it?

  • Probably the safest bet is to use geom_bar on manually precomputed data. Actually, this is the only possible method I see. Commented Jun 17, 2015 at 12:13
  • not ideal but you could use facets: + facet_grid(.~cat2, scales = "free_x") Commented Jun 17, 2015 at 12:29
  • That would work on one graph, yes. In my particular case (not in the example) I already have a facet_grid based on additional variables and thus cannot use your solution. Commented Jun 17, 2015 at 12:39
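For illustration only, the "precomputed data" idea from the first comment could be sketched as below. This is not from the thread: it swaps in geom_boxplot(stat = "identity") instead of geom_bar, and the names toy, stats, xpos, group_width and gap are hypothetical, as is the gap size. The summary statistics are computed up front, and every box gets a numeric x position that leaves room between clusters.

```r
library(ggplot2)

# Hypothetical stand-in for data2 from the question
set.seed(1)
toy <- expand.grid(rep = 1:5, cat2 = letters[6:10], cat3 = letters[11:13])
toy$value <- rnorm(nrow(toy))

customstats <- function(x) {
  m <- mean(x)
  c(ymin = min(x), lower = mean(x[x < m]), middle = m,
    upper = mean(x[x > m]), ymax = max(x))
}

# One row of precomputed statistics per cat2 x cat3 cell;
# aggregate() returns the five statistics as a matrix column, so unpack it
stats <- aggregate(value ~ cat2 + cat3, data = toy, FUN = customstats)
stats <- cbind(stats[c("cat2", "cat3")], as.data.frame(stats$value))

group_width <- 1    # width taken by one cat2 cluster (assumed)
gap         <- 0.5  # extra space between clusters (assumed)
n3    <- nlevels(stats$cat3)
box_w <- group_width / n3

# Cluster centre plus an offset for each cat3 box inside the cluster
centres    <- (seq_len(nlevels(stats$cat2)) - 1) * (group_width + gap)
stats$xpos <- centres[as.integer(stats$cat2)] +
  (as.integer(stats$cat3) - (n3 + 1) / 2) * box_w

p <- ggplot(stats, aes(x = xpos, fill = cat3,
                       group = interaction(cat2, cat3))) +
  geom_boxplot(aes(ymin = ymin, lower = lower, middle = middle,
                   upper = upper, ymax = ymax),
               stat = "identity", width = box_w * 0.9, alpha = 0.5) +
  scale_x_continuous(breaks = centres, labels = levels(stats$cat2),
                     name = "cat2")
# print(p) draws the plot
```

Because x is continuous here, the gap can be tuned freely, at the cost of no longer using stat_summary.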

1 Answer

I have fixed a similar problem in an ugly (but, for me, effective) way: create a data frame with the same plotting variables as the original data, but with x (or y) positioned or factored so that it falls between the two points I want to separate, and with missing values for y (or x). For your problem, I added the following code and got an image with spatial separation of the clusters.

library(plyr)

empties <- data.frame(cat2_orig=unique(data2$cat2)[-length(unique(data2$cat2))])
#no extra space needed between last cluster and edge of plot
empties$cat2 <- paste0(empties$cat2_orig,empties$cat2_orig)
empties$value <- NA


data2_space <- rbind.fill(data2,empties)

ggplot(data2_space, aes(x = cat2, y = value, fill = cat3)) +
  stat_summary(fun.data = customstats, geom = "boxplot",
               alpha = 0.5, position = position_dodge(1)) +
  # remove tick marks for the non-interesting (spacer) points on the x-axis
  scale_x_discrete(breaks = unique(data2$cat2))

Before & after: [images]
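The paste0 trick above depends on the doubled names ("ff", "gg", ...) sorting alphabetically between the real levels. As a hedged variation, not part of the original answer, the level order can be set explicitly so the spacers land in the right place whatever the labels are. The helper add_spacers, the "_gap" suffix and the toy data are hypothetical names made up for this sketch, which also avoids plyr.

```r
library(ggplot2)

# Hypothetical helper: append one all-NA spacer row after every level of
# `xvar` except the last, and fix the level order explicitly.
add_spacers <- function(df, xvar) {
  x  <- df[[xvar]]
  lv <- if (is.factor(x)) levels(x) else sort(unique(as.character(x)))
  n  <- length(lv)
  if (n < 2) return(list(data = df, breaks = lv))  # nothing to separate
  gaps   <- paste0(lv[-n], "_gap")
  all_lv <- c(rbind(lv[-n], gaps), lv[n])  # f, f_gap, g, g_gap, ..., j
  df[[xvar]] <- as.character(x)
  empties <- df[rep(NA_integer_, n - 1), , drop = FALSE]  # all-NA rows
  empties[[xvar]] <- gaps
  out <- rbind(df, empties)
  out[[xvar]] <- factor(out[[xvar]], levels = all_lv)
  rownames(out) <- NULL
  list(data = out, breaks = lv)
}

customstats <- function(x) {
  m <- mean(x)
  c(ymin = min(x), lower = mean(x[x < m]), middle = m,
    upper = mean(x[x > m]), ymax = max(x))
}

# Hypothetical stand-in for data2 from the question
set.seed(1)
toy <- expand.grid(rep = 1:5, cat2 = letters[6:10], cat3 = letters[11:13])
toy$value <- rt(nrow(toy), 1)

sp <- add_spacers(toy, "cat2")

p <- ggplot(sp$data, aes(x = cat2, y = value, fill = cat3)) +
  stat_summary(fun.data = customstats, geom = "boxplot", alpha = 0.5,
               position = position_dodge(1), na.rm = TRUE) +
  # label only the real levels; spacer levels stay as blank slots
  scale_x_discrete(breaks = sp$breaks)
# print(p) draws the plot
```

Setting na.rm = TRUE only silences the warning about the spacer rows, which carry no y value.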

4 Comments

Thanks, it seems to work, but I don't understand why. Unfortunately it does not work for my actual data, but I will try to adapt the technique.
You're welcome! It works because it creates extra factor levels and plots them, but there is nothing to plot at those levels, which leaves empty space. The scale_x_discrete call then adds ticks and labels only for the original factor levels. If you have an example of your actual data, I'm willing to have a look and adapt it.
Thanks again. The problem was that my plot consists of facets, and it showed empty NA x NA facets. The solution was to build the spacer rows with expand.grid(unique(facet1), unique(facet2), additional_empty_factors) and then rbind them to the spaced data. That worked.
Good to know! Are you happy with the spacing? I also have tricks for that, but it gets even uglier.
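The facet fix described in the comments might look roughly like the sketch below. The thread gives only the expand.grid call, so the rest is an assumption: facet1 and facet2 are the commenter's placeholder names, while toy, gaps, the "_gap" suffix and the facet values are hypothetical. The point is that each spacer row carries real facet values, so no NA x NA panel appears.

```r
library(ggplot2)

customstats <- function(x) {
  m <- mean(x)
  c(ymin = min(x), lower = mean(x[x < m]), middle = m,
    upper = mean(x[x > m]), ymax = max(x))
}

# Hypothetical faceted stand-in for the commenter's real data
set.seed(1)
toy <- expand.grid(rep = 1:5, cat2 = letters[6:10], cat3 = letters[11:13],
                   facet1 = c("A", "B"), facet2 = c("X", "Y"),
                   stringsAsFactors = FALSE)
toy$value <- rt(nrow(toy), 1)

x_levels <- sort(unique(toy$cat2))
n        <- length(x_levels)
gaps     <- paste0(x_levels[-n], "_gap")

# One spacer row per facet combination and gap level,
# so the facet variables are never NA
empties <- expand.grid(facet1 = unique(toy$facet1),
                       facet2 = unique(toy$facet2),
                       cat2   = gaps, stringsAsFactors = FALSE)
empties$rep   <- NA_integer_
empties$cat3  <- NA_character_
empties$value <- NA_real_

toy_space <- rbind(toy, empties[names(toy)])
# Interleave real and spacer levels: f, f_gap, g, g_gap, ..., j
toy_space$cat2 <- factor(toy_space$cat2,
                         levels = c(rbind(x_levels[-n], gaps), x_levels[n]))

p <- ggplot(toy_space, aes(x = cat2, y = value, fill = cat3)) +
  stat_summary(fun.data = customstats, geom = "boxplot", alpha = 0.5,
               position = position_dodge(1), na.rm = TRUE) +
  scale_x_discrete(breaks = x_levels) +
  facet_grid(facet1 ~ facet2)
# print(p) draws the plot
```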
