Apologies if this is a duplicate. I am very new to data.table, and have seen very similar questions asked here but none that exactly answered my question.
I would like to find a concise syntax to aggregate multiple columns of a data.table with the same aggregation function, with customized names of the resulting aggregated columns.
Setup:
library(data.table)
data(mtcars)
setDT(mtcars)
If I call
mtcars[, lapply(.SD, sum, na.rm = TRUE), by = .(am, gear), .SDcols = c('mpg','cyl')]
The result is
   am gear   mpg cyl
1:  1    4 210.2  36
2:  0    3 241.6 112
3:  0    4  84.2  20
4:  1    5 106.9  30
This is great, but I want the last two columns to have custom names that I define ahead of time.
I can achieve the desired result with
mtcars[, .(sum_of_mpg = sum(mpg, na.rm = TRUE), sum_of_cyl = sum(cyl, na.rm = TRUE)), by = .(am, gear)]
This results in
   am gear sum_of_mpg sum_of_cyl
1:  1    4      210.2         36
2:  0    3      241.6        112
3:  0    4       84.2         20
4:  1    5      106.9         30
But this approach does not generalize: the new names are hard-coded inside the j expression, so I cannot define them beforehand or vary the number of columns.
I've tried the code below and various variants of it, but nothing gives this result in one step.
custom_names <- c('sum_of_mpg','sum_of_cyl')
mtcars[, (custom_names) = lapply(.SD, sum, na.rm = TRUE), by = .(am, gear), .SDcols = c('mpg','cyl')]
Is there a way to do this concisely? I need this because the code may be embedded in a function and must work on an arbitrary number of columns.
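To make the requirement concrete, here is a rough sketch of the kind of function I have in mind. The helper name sum_by_group and its arguments are hypothetical, not part of data.table. Wrapping the list in setNames() is one variant that appears to produce the names I want, but I am not sure it is the idiomatic approach, and it assumes the names are supplied in the same order as the columns in .SDcols.

```r
library(data.table)

# Hypothetical helper: sums `cols` within groups defined by `by_cols`
# and labels the results with `new_names` (same order as `cols`).
sum_by_group <- function(dt, cols, new_names, by_cols) {
  stopifnot(length(cols) == length(new_names))
  # lapply(.SD, ...) returns a list named after `cols`;
  # setNames() relabels that list before data.table builds the result.
  dt[, setNames(lapply(.SD, sum, na.rm = TRUE), new_names),
     by = by_cols, .SDcols = cols]
}

dt <- as.data.table(mtcars)
custom_names <- c("sum_of_mpg", "sum_of_cyl")
res <- sum_by_group(dt, c("mpg", "cyl"), custom_names, c("am", "gear"))
print(res)
```

The length check is there because a mismatch between the columns and the names would otherwise leave some result columns unnamed.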