A common task of mine is filtering (subsetting) datasets stored as data.tables. I want to subset rows in i using several column-specific boolean conditions. Each new dataset I receive has the same columns, and I want to filter all of them in the same way.
To illustrate my task, let me first create an example data.table.
library(data.table)
dt <- data.table(a = 1:6, b = letters[1:6], c = rep(c(4, 3, 2), times = 2))
This yields
a b c
1: 1 a 4
2: 2 b 3
3: 3 c 2
4: 4 d 4
5: 5 e 3
6: 6 f 2
Suppose I want to apply the following filtering criteria to the columns:
dt[b != 'd'][c < 4][a < 6]
yielding
a b c
1: 2 b 3
2: 3 c 2
3: 5 e 3
Is there a way to store these filtering criteria in a variable so that I can simply append it to any data.table?
I tried
x <- [b != 'd'][c < 4][a < 6]
dt[x]
but this throws the error
Error: unexpected '[' in "x <- ["
This would be great because I could update the filtering strategy by changing only the variable x, and the same filter would then apply to every data.table.
i1 <- dt[, b != 'd' & c < 4 & a < 6]  # logical vector with one element per row of dt
dt[i1]  # keep only the rows where i1 is TRUE
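The logical vector above is computed from dt, so it has to be rebuilt for each new dataset. One possible alternative, sketched here rather than offered as the definitive answer, is to store the condition as an unevaluated expression with quote() and evaluate it inside i with eval(). The names flt, dt2 and filtered are made up for illustration, and dt2 is a hypothetical second dataset with the same columns.

```r
library(data.table)

dt <- data.table(a = 1:6, b = letters[1:6], c = rep(c(4, 3, 2), times = 2))

# Store the filter as an unevaluated expression; nothing is computed yet.
flt <- quote(b != "d" & c < 4 & a < 6)

# eval() inside i evaluates the stored expression against dt's columns.
res <- dt[eval(flt)]
print(res)

# Hypothetical second dataset with the same columns, to show reuse.
dt2 <- data.table(a = c(1, 7), b = c("x", "y"), c = c(1, 1))

# Apply the same stored filter to every dataset in a list.
filtered <- lapply(list(first = dt, second = dt2), function(d) d[eval(flt)])
```

Changing flt in one place then changes the filter for every dataset it is applied to, which is the behaviour the question asks for.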