I am trying to learn data.table syntax. I have most of the basics of simple summarization down, but I am not getting how to use data.table to generate new columns from an existing column and then summarize.
Here's an MWE where I use dplyr and base tools to make multiple columns from one and then summarize by grouping variables:
Current Input
##    fact1 fact2 X0
## 1      b     2  9
## 2      a     2  6
## 3      b     1  7
## 4      c     2  3
## 5      a     1  8
## 6      a     1  4
## 7      a     1  5
## 8      a     1  1
## 9      b     1  2
## 10     b     2 10
Base + dplyr Code
set.seed(10)
dat <- data.frame(
    fact1 = factor(sample(c('a', 'b', 'c'), 10, TRUE)),
    fact2 = factor(sample(1:2, 10, TRUE)),
    X0 = sample(1:10, 10)
)
add <- function(x, y) x + y
z <- sample(1:10, 6, FALSE)
library(dplyr)
z %>%
    lapply(., add, dat[, 'X0']) %>%
    do.call(cbind, .) %>%
    cbind(dat, .) %>%
    data.frame() %>%
    group_by(fact1, fact2) %>%
    summarise_each(funs(sum))
Desired output
## Source: local data frame [5 x 9]
## Groups: fact1
##
##   fact1 fact2 X0 X1 X2 X3 X4 X5 X6
## 1     a     1 18 42 22 26 46 30 34
## 2     a     2  6 12  7  8 13  9 10
## 3     b     1  9 21 11 13 23 15 17
## 4     b     2 19 31 21 23 33 25 27
## 5     c     2  3  9  4  5 10  6  7
While I'm asking for a data.table-specific solution, I think seeing clever base, dplyr, etc. solutions may make this question appeal to a broader readership.
Regarding the add function: you might try magrittr, which includes it alongside similar functions.
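For what it's worth, here is a minimal sketch of one way the steps in the question might look in data.table. It is not the only idiom. Two assumptions: the input is typed in from the "Current Input" table rather than regenerated with set.seed(), since sample() can give different draws across R versions, and the values in z are inferred from the "Desired output" table. The names new_cols and res are just illustrative.

```r
library(data.table)

# Input copied from the "Current Input" table in the question
dat <- data.table(
    fact1 = factor(c("b", "a", "b", "c", "a", "a", "a", "a", "b", "b")),
    fact2 = factor(c(2, 2, 1, 2, 1, 1, 1, 1, 1, 2)),
    X0    = c(9L, 6L, 7L, 3L, 8L, 4L, 5L, 1L, 2L, 10L)
)

# Assumption: offsets inferred from the desired output, not drawn with sample()
z <- c(6L, 1L, 2L, 7L, 3L, 4L)

# Add one new column per offset (X1 ... X6), by reference
new_cols <- paste0("X", seq_along(z))
dat[, (new_cols) := lapply(z, function(k) X0 + k)]

# Sum every non-grouping column within each fact1/fact2 group;
# keyby also sorts the result by the grouping columns
res <- dat[, lapply(.SD, sum), keyby = .(fact1, fact2)]
print(res)
```

The `:=` step modifies dat in place, so no cbind() is needed, and `.SD` holds all columns other than the grouping ones, which is what stands in for summarise_each() here. With the inputs above, res should have the same five rows and nine columns as the desired output.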