I need to aggregate a data.table and create a table with counts, means, and other statistics for several variables. The format of the output table should always be the same, but I need to aggregate by different groupings. How can I define the output columns and aggregate statistics once and reuse them for different by= choices?

# Create data.table
library(data.table)
DT <- data.table(iris)

# This works, but it is long and needs to be updated in multiple
# places whenever I update the output format
DT[,list(theCount=.N,
        meanSepalWidth=mean(Sepal.Width),
        meanPetalWidth=mean(Petal.Width)), 
   by=Species]

# This does not work. How could I achieve what I'm trying to do here?
col.list <-  list(theCount=.N,
        meanSepalWidth=mean(Sepal.Width),
        meanPetalWidth=mean(Petal.Width))
DT[,col.list,  by=Species]
  • You can write col.list = quote(yada yada); DT[, eval(col.list), by=Species]. This was documented in an earlier version of the data.table FAQ (... not sure where it went). Commented Jun 19, 2017 at 14:14
  • Perhaps this answer is helpful; it picks up a suggestion of Matt Dowle's on how to deal with eval(). Commented Jun 19, 2017 at 14:24
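
For reference, the quote()/eval() suggestion from the first comment could look roughly like the sketch below. It assumes the column list is wrapped in quote(list(...)) so that .N and the column names are not looked up until data.table evaluates j within each group. The second grouping (longSepal, with its 5.8 cutoff) is a made-up example added only to show reuse; it is not from the question.

```r
library(data.table)
DT <- as.data.table(iris)

# Store the j expression unevaluated. quote() keeps .N and the column
# names as symbols instead of trying to look them up immediately.
col.list <- quote(list(theCount       = .N,
                       meanSepalWidth = mean(Sepal.Width),
                       meanPetalWidth = mean(Petal.Width)))

# eval() inside j evaluates the stored expression for each group.
bySpecies <- DT[, eval(col.list), by = Species]

# Same output columns, different grouping.
# longSepal is a hypothetical grouping used purely for illustration.
byLong <- DT[, eval(col.list), by = .(longSepal = Sepal.Length > 5.8)]

print(bySpecies)
print(byLong)
```

The output format then lives in one place (col.list), and only the by= argument changes between calls.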
