
I've created a histogram/density plot function where I want the y axis to be count rather than density, but am having problems parameterizing its binwidth.

I am using examples based on http://docs.ggplot2.org/current/geom_histogram.html to illustrate my attempts.

Here's the successful plotMovies1 function. I followed the URL referenced in the code comment to put ..count.. rather than ..density.. on the y axis. Note that it uses a hardcoded .5 binwidth in two places, which is what I want to parameterize:

# I want y axis as count, rather than density, and followed
# https://stat.ethz.ch/pipermail/r-help/2011-June/280588.html
plotMovies1 <- function() {
  m <- ggplot(movies, aes(x = rating))
  m <- m + geom_histogram(binwidth = .5)
  m <- m + geom_density(aes(y = .5 * ..count..))
}

[Figure: histogram/density with count as y axis and hardcoded binwidth]

My first, naive attempt, plotMovies2, tries to parameterize binwidth through a local variable bw and fails:

# Failed first attempt to parameterize binwidth
plotMovies2 <- function() {
  bw <- .5
  m <- ggplot(movies, aes(x = rating))
  m <- m + geom_histogram(binwidth = bw)
# Error in eval(expr, envir, enclos) : object 'bw' not found 
  m <- m + geom_density(aes(y = bw * ..count..))
}
> print(plotMovies2())
Error in eval(expr, envir, enclos) : object 'bw' not found

I see discussion about passing the local environment to aes in ggplot at https://github.com/hadley/ggplot2/issues/743, but plotMovies3 also fails in the same fashion, failing to find the bw object ...

# Failed second attempt to parameterize binwidth, even after establishing
# aes environment, per https://github.com/hadley/ggplot2/issues/743
plotMovies3 <- function() {
  bw <- .5
  m <- ggplot(movies, aes(x = rating), environment = environment())
  m <- m + geom_histogram(binwidth = bw)
# Error in eval(expr, envir, enclos) : object 'bw' not found 
  m <- m + geom_density(aes(y = bw * ..count..))
}
> print(plotMovies3())
Error in eval(expr, envir, enclos) : object 'bw' not found

Finally, I tried a global variable, but it still fails to find the object:

# Failed third attempt using global binwidth
global_bw <<- .5
plotMovies4 <- function() {
  m <- ggplot(movies, aes(x = rating), environment = environment())
  m <- m + geom_histogram(binwidth = global_bw)
# Error in eval(expr, envir, enclos) : object 'global_bw' not found 
  m <- m + geom_density(aes(y = global_bw * ..count..))
}
> print(plotMovies4())
Error in eval(expr, envir, enclos) : object 'global_bw' not found

Given plotMovies3 and plotMovies4, I am guessing it is not a straightforward environment issue. Can anyone shed any light on how I might resolve this? Again, my goal was to be able to create a histogram/density plot function where

  1. Its y axis is count rather than density, and
  2. Its binwidth could be parameterized (e.g., for manipulate)
  • Small note: running global_bw <<- 0.5 in no way creates a "global" variable. Using <- in this last example would have the same effect. <<- is simply a way of making a variable assignment in a different scope. If you had included that line inside your function you would have created an object in the global environment rather than the local one in your function. Commented Jun 10, 2015 at 16:28
  • Your function doesn't return any object. If you put return(m) at the end, it might make things run more smoothly. Commented Jun 10, 2015 at 16:47
  • a minimal example would be bw= 0.5; m <- ggplot(movies, aes(x = rating)); m + geom_density(aes(y = bw * ..count..)) Commented Jun 10, 2015 at 17:05
  • Of potential interest (committed yesterday): github.com/hadley/ggplot2/commit/… Commented Jun 10, 2015 at 17:06
  • @joran, thanks for the commit link which seems to address github.com/hadley/ggplot2/issues/743 Commented Jun 10, 2015 at 22:52
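
The scoping behaviour described in the first comment above can be checked in base R. This is a small sketch; the names counter, bump_local and bump_outer are made up for illustration and do not come from the question:

```r
# Ordinary top-level assignment; at top level `<-` and `<<-` behave the same here.
counter <- 0

bump_local <- function() {
  # `<-` creates a new variable local to this call; the outer counter is untouched.
  counter <- counter + 1
  counter
}

bump_outer <- function() {
  # `<<-` walks up the enclosing environments and rebinds the existing counter.
  counter <<- counter + 1
  counter
}

local_result <- bump_local()  # 1, computed from the outer value 0
after_local  <- counter       # still 0
bump_outer()
after_outer  <- counter       # now 1
```

So `<<-` matters only when it is used inside a function; it is not what makes a variable "global".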

3 Answers


This is by no means beautiful, but if you need a workaround you can use the regular density() function and scale its output to counts yourself:

plotMovies5 <- function(binw=0.5) {
  m <- ggplot(movies, aes(x = rating))
  m <- m + geom_histogram(binwidth = binw)
  wa <- density(x=movies$rating, bw = binw)
  wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
  m <- m + geom_point(data = wa, aes(x = xvals, y = yvals))
}
print(plotMovies5(binw=0.25))

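Note that you still have to do some tinkering with the variables, because the two density estimates are not exactly equal: the bw argument of density() is the kernel bandwidth (the standard deviation of the smoothing kernel), not the histogram binwidth, whereas geom_density() picks its bandwidth with the default "nrd0" rule. The following will show you the difference: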

binw = 0.5
m <- ggplot(movies, aes(x = rating))
m <- m + geom_density(aes(y = 0.5 * ..count..))
wa <- density(x=movies$rating, bw = binw)
wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
m <- m + geom_point(data = wa, aes(x = xvals, y = yvals))
m
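
The mismatch can also be inspected without ggplot2. The sketch below uses base R only and swaps in the built-in faithful data as a stand-in for movies$rating; the variable names are illustrative assumptions:

```r
# Stand-in data; any numeric vector would do.
x <- faithful$eruptions
bin_w <- 0.5

# Default call: the bandwidth is chosen by the "nrd0" rule of thumb.
d_default <- density(x)

# Workaround call: the kernel standard deviation is forced to bin_w.
d_fixed <- density(x, bw = bin_w)

# The two estimates use different bandwidths, so the curves differ.
bw_default <- d_default$bw
bw_fixed   <- d_fixed$bw

# Rescale the density to the count scale of a histogram with binwidth bin_w.
count_curve <- d_fixed$y * d_fixed$n * bin_w

# A Riemann sum of the rescaled curve should be close to n * bin_w,
# the total area of the histogram bars.
area <- sum(count_curve) * mean(diff(d_fixed$x))
```

Comparing bw_default with bw_fixed shows why the points and the geom_density curve above do not coincide; passing the same bandwidth to both would be one way to line them up.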

An alternative is the use of predefined bins with aes_string. Histograms then may be created by a loop with variable binwidths:

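bins <- list()
bins[["Variable1"]] <- 2
bins[["Variable2"]] <- 0.5
bins[["Variable3"]] <- 1
# Loop over the predefined binwidths, building the y aesthetic as a string
for (i in names(bins)) {
  print(ggplot(movies, aes(x = rating)) +
    geom_histogram(aes_string(x = "rating", y = paste("..density..*", bins[[i]], sep = "")),
                   na.rm = TRUE, position = 'dodge', binwidth = bins[[i]]))
}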


This is a follow-up on mts's answer, intended as a long comment. First, the dataset is obtained by loading library("ggplot2movies"). Second, it may be of interest to loop over several values of binw to produce a series of figures to be used together, e.g. for an animation. The code below simply puts mts's code into a loop for this purpose. A minor contribution indeed.

    ### Data
    library("ggplot2")        # needed for ggplot(), geom_histogram(), etc.
    library("ggplot2movies")

    ### Histograms
    ggplotMovieHistogram <- function(binw = 0.5) {
        require('ggplot2movies')
        p <- ggplot(movies, aes(x = rating)) + 
            geom_histogram(binwidth = binw)
        wa <- density(x = movies$rating, bw = binw)
        wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
        p <- p + geom_point(data = wa, aes(x = xvals, y = yvals))
        return(p)
    }

    ggsaveMovieHistogram <- function(binw = 0.5, file = 'test.pdf') {
        pdf(file, width = 8, height = 8)
            print(ggplotMovieHistogram(binw = binw))
        dev.off()
    }

    for(i in seq(0.2, 0.8, by = 0.2)) {
        ggsaveMovieHistogram(binw = i, 
                    file = paste0('ggplot-barchart-loop-histogram-', 
                                  format(i, decimal.mark = '-'), 
                                  '.pdf'))
    }


    ### Densities
    library("ggplot2movies")
    ggplotMovieDensity <- function(binw = 0.5) {
        require('ggplot2movies')
        p <- ggplot(movies, aes(x = rating)) + 
            geom_density(aes(y = 0.5 * ..count..))
        wa <- density(x = movies$rating, bw = binw)
        wa <- as.data.frame(cbind(xvals = wa$x, yvals = wa$y * wa$n * binw))
        p <- p + geom_point(data = wa, aes(x = xvals, y = yvals))
        return(p)
    }

    ggsaveMovieDensity <- function(binw = 0.5, file = 'test.pdf') {
        pdf(file, width = 8, height = 8)
            print(ggplotMovieDensity(binw = binw))
        dev.off()
    }

    for(i in seq(0.2, 0.8, by = 0.2)) {
        ggsaveMovieDensity(binw = i, 
                    file = paste0('ggplot-barchart-loop-density-', 
                                  format(i, decimal.mark = '-'), 
                                  '.pdf'))
    }
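
On recent ggplot2 releases the original goal can probably be met without the density() workaround. This is a minimal sketch, assuming ggplot2 3.3.0 or later, where after_stat() replaces the ..count.. notation and aes() captures the environment it was called in. The function name plot_hist_density, its arguments and the use of the built-in faithful data are illustrative choices, not part of the answers above:

```r
library(ggplot2)

# Hypothetical helper: histogram plus density curve, both on the count scale.
# `x` is a numeric vector, `bin_w` the histogram binwidth.
plot_hist_density <- function(x, bin_w = 0.5) {
  df <- data.frame(value = x)
  ggplot(df, aes(x = value)) +
    geom_histogram(binwidth = bin_w) +
    # `count` is computed by the density stat; `bin_w` is looked up in this
    # function's environment when the aesthetic is evaluated.
    geom_density(aes(y = after_stat(count * bin_w)))
}

p <- plot_hist_density(faithful$eruptions, bin_w = 0.25)
```

Calling print(p) draws the plot. Because bin_w is an ordinary function argument, the same helper could be driven from a loop or from manipulate.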
