When using a list column of data.tables in a nested data.table it is easy to apply a function over the column. Example:
library(data.table)
dt <- data.table(mtcars)[, list(dt.mtcars = list(.SD)), by = gear]
We can use:
dt[ ,list(length = nrow(dt.mtcars[[1]])), by = gear]
   gear length
1:    4     12
2:    3     15
3:    5      5
or
dt[, list( length = lapply(dt.mtcars, nrow)), by = gear]
   gear length
1:    4     12
2:    3     15
3:    5      5
I would like to do the same process and apply a modification by reference using the operator := to each data.table of the column.
Example:
modify_by_ref <- function(d) {
  d[, max_hp := max(hp)]
}
dt[, modify_by_ref(dt.mtcars[[1]]), by = gear]
That returns the error:
Error in `[.data.table`(d, , `:=`(max_hp, max(hp))) :
.SD is locked. Using := in .SD's j is reserved for possible future use; a tortuously flexible way to modify by group. Use := in j directly to modify by group by reference.
Following the tip in the error message does not work for me in any way. It seems to target a different case, but maybe I am missing something. Is there a recommended way, or a flexible workaround, to modify list columns by reference?
... := directly in your j-expression ... := directly in the j-expression is that it is possible only if the data.table is unnested first ... by= operations, which are optimized for max and other common summary functions...
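The unnesting route mentioned above could look roughly like the sketch below. It is only one possible workaround, not necessarily the recommended one: it assumes the nested table dt from the question, and the names flat and nested are made up for illustration. Note that re-nesting builds new nested data.tables rather than modifying the original ones in place.

```r
library(data.table)

# Nested table from the question: one data.table per gear in a list column.
dt <- data.table(mtcars)[, list(dt.mtcars = list(.SD)), by = gear]

# Unnest: returning each nested data.table from j stacks them back into
# one flat table, with gear as the grouping column.
flat <- dt[, dt.mtcars[[1]], by = gear]

# Now := can be used directly in j with by=, as the error message suggests.
flat[, max_hp := max(hp), by = gear]

# Optional: re-nest if the list-column layout is still needed.
# (Hypothetical name; this creates new nested tables.)
nested <- flat[, list(dt.mtcars = list(.SD)), by = gear]
```

After this, every nested table in nested$dt.mtcars carries a max_hp column holding the group maximum of hp.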