I wrote a small function to aggregate several columns by a discrete variable:
library(data.table)
onewayfn <- function(df, x, weight = NULL, displacement = NULL, by = NULL){
  .x <- deparse(substitute(x))
  .weight <- deparse(substitute(weight))
  .displacement <- deparse(substitute(displacement))
  .by <- deparse(substitute(by)) # Does not work with multiple variables!
  cols <- c(.weight, .displacement)
  cols <- cols[cols != "NULL"]
  .xby <- c(.x, .by)
  .xby <- .xby[.xby != "NULL"]
  data.table::data.table(df)[, lapply(.SD, sum, na.rm = TRUE), by = .xby, .SDcols = cols][]
}
The function returns the sums of wt and disp, grouped by cyl and am:
onewayfn(mtcars, cyl, weight = wt, displacement = disp, by = am)
#>    cyl am     wt   disp
#> 1:   6  1  8.265  465.0
#> 2:   4  1 16.338  748.9
#> 3:   6  0 13.555  818.2
#> 4:   8  0 49.249 4291.4
#> 5:   4  0  8.805  407.6
#> 6:   8  1  6.740  652.0
The following also returns the correct result:
onewayfn(mtcars, cyl, weight = wt, displacement = disp)
#>    cyl     wt   disp
#> 1:   6 21.820 1283.2
#> 2:   4 25.143 1156.5
#> 3:   8 55.989 4943.4
However, the function throws an error when I pass more than one variable to by:
onewayfn(mtcars, cyl, weight = wt, displacement = disp, by = list(am,vs))
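As far as I can tell, the failure comes from how by is captured: deparse(substitute(by)) turns list(am, vs) into the single string "list(am, vs)", which is not a column name. The sketch below isolates that step. capture_by() is a hypothetical helper I made up for illustration, not part of data.table. all.vars() is base R and might be one way to recover the individual names, but I have not checked that it covers every case.

```r
# Minimal reproduction of how `by` is captured inside onewayfn().
# capture_by() is a hypothetical helper used only for illustration.
capture_by <- function(by = NULL) {
  expr <- substitute(by)          # unevaluated expression passed as `by`
  list(
    deparsed = deparse(expr),     # what onewayfn() currently uses
    vars     = all.vars(expr)     # bare symbol names found in the expression
  )
}

single <- capture_by(am)
multi  <- capture_by(list(am, vs))

single$deparsed     # "am"
multi$deparsed      # "list(am, vs)": one string, not two column names
multi$vars          # "am" "vs"
capture_by()$vars   # character(0) when `by` is left at its NULL default
```
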
I would like to obtain the same result as above but now grouped by cyl, am, and vs. How can I rewrite onewayfn() to do this?