I still find it difficult to reason about how to work with R data.table columns that are lists.
Here is an R data.table:
library(data.table)
dt = data.table(
  numericcol = rep(42, 8),
  listcol = list(c(1, 22, 3), 6, 1, 12, c(5, 6, 1123), 3, 42, 1)
)
> dt
   numericcol   listcol
1:         42   1,22, 3
2:         42         6
3:         42         1
4:         42        12
5:         42 5, 6,1123
6:         42         3
7:         42        42
8:         42         1
I would like to create a column holding the absolute differences between numericcol and each element of listcol, row by row:
> dt
   numericcol   listcol      absvals
1:         42   1,22, 3   41, 20, 39
2:         42         6           36
3:         42         1           41
4:         42        12           30
5:         42 5, 6,1123 37, 36, 1081
6:         42         3           39
7:         42        42            0
8:         42         1           41
My first thought was to use sapply() as follows:
dt[, absvals := sapply(listcol, function(x) abs(x-numericcol))]
This outputs the following:
> dt
   numericcol   listcol absvals
1:         42   1,22, 3      41
2:         42         6      20
3:         42         1      39
4:         42        12      41
5:         42 5, 6,1123      20
6:         42         3      39
7:         42        42      41
8:         42         1      20
So absvals is now an ordinary (unlisted) column with a single value in each row. It no longer has the list structure of listcol, and its values do not line up with the rows of the data.table.
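One possible reading of this result, sketched in base R: inside the anonymous function, numericcol refers to the whole length-8 column rather than the current row's value, so each list element is recycled against all eight values and sapply() simplifies the results into a matrix. The name res is only illustrative, and how data.table then fits that matrix into the column is an assumption that may vary by version.

```r
numericcol <- rep(42, 8)
listcol <- list(c(1, 22, 3), 6, 1, 12, c(5, 6, 1123), 3, 42, 1)

# Each x is subtracted from the full length-8 numericcol vector.
# Length-3 elements are recycled (R warns because 8 is not a multiple of 3).
res <- suppressWarnings(sapply(listcol, function(x) abs(x - numericcol)))

dim(res)    # 8 8: one column per list element, not one value per row
res[, 1]    # 41 20 39 41 20 39 41 20
```

The first column of res matches the absvals values shown above, which suggests the column was filled from the recycled result of the first list element only.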
(1) How would one create absvals to retain the list structure of listcol?
(2) In cases like these, if I am only interested in a plain vector of the values, how do R data.table users usually create it?
Maybe
vec = as.vector(dt[, absvals := sapply(listcol, function(x) abs(x-numericcol))])
?
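For reference, a minimal sketch of one way both questions might be approached, assuming the goal is an element-wise pairing of each listcol entry with the numericcol value from the same row. It uses base R's Map(), unlist() and lengths(); the helper argument names x and n and the variables vec and vec2 are illustrative only. Note that := returns the data.table itself (invisibly), so wrapping it in as.vector() is unlikely to produce a plain vector.

```r
library(data.table)

dt <- data.table(
  numericcol = rep(42, 8),
  listcol = list(c(1, 22, 3), 6, 1, 12, c(5, 6, 1123), 3, 42, 1)
)

# (1) Pair each list element with the numericcol value of the same row.
# Map() always returns a list, so absvals remains a list column.
dt[, absvals := Map(function(x, n) abs(x - n), listcol, numericcol)]
dt$absvals[[1]]   # 41 20 39

# (2) Flatten the list column into a plain numeric vector ...
vec <- unlist(dt$absvals)

# ... or compute the vector directly, repeating each numericcol value
# once per element of the corresponding list entry.
vec2 <- dt[, abs(unlist(listcol) - rep(numericcol, lengths(listcol)))]
```

Here vec and vec2 both have 12 elements, one per value stored in listcol, so they are longer than the 8-row table and cannot be assigned back as an ordinary column.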