
A common task of mine is filtering (subsetting) datasets stored as data.tables. I want to subset rows in i using several column-specific boolean conditions. Each new dataset I receive has the same columns, and I want to filter all of them in the same way.

To illustrate my task, let me first create an example data.table.

library(data.table)

dt <- data.table(a = seq(1,6), b = letters[seq(1,6)], c = rep(c(4,3,2), times = 2))

This yields

   a b c
1: 1 a 4
2: 2 b 3
3: 3 c 2
4: 4 d 4
5: 5 e 3
6: 6 f 2

Suppose I want to apply the following filtering criteria to the columns:

dt[b != 'd'][c < 4][a < 6]

yielding

   a b c
1: 2 b 3
2: 3 c 2
3: 5 e 3

Is there a way to store those filtering criteria in a variable so that I can just tag it onto the end of any data.table?

I tried

x <- [b != 'd'][c < 4][a < 6]
dt[x]

but this throws the error

Error: unexpected '[' in "x <- ["

This would be great because I could update the filtering strategy by changing just the variable x, and the same filter would then apply to all data.tables.

You need i1 <- dt[, b != 'd' & c < 4 & a < 6]; dt[i1] (Commented Apr 9, 2019 at 16:25)

1 Answer

If the filter is to be applied to different datasets, quote the expression and evaluate it on each dataset:

i1 <- quote(b != 'd' & c < 4 & a < 6)
dt[dt[, eval(i1)]]
#   a b c
#1: 2 b 3
#2: 3 c 2
#3: 5 e 3
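
To show how the quoted expression might be reused across datasets, here is a minimal sketch. It assumes every table has columns named a, b and c; the second table, dt2, and the list names first and second are made up for illustration and do not come from the question.

```r
library(data.table)

# Quoted filter: stored unevaluated, so it is not tied to any one table
i1 <- quote(b != 'd' & c < 4 & a < 6)

# The table from the question, plus a hypothetical second table
# with the same column names
dt  <- data.table(a = 1:6, b = letters[1:6], c = rep(c(4, 3, 2), times = 2))
dt2 <- data.table(a = c(1, 2, 7, 3),
                  b = c("d", "x", "y", "z"),
                  c = c(1, 2, 3, 5))

# Evaluate the same expression against each table's own columns
filtered <- lapply(list(first = dt, second = dt2), function(d) d[eval(i1)])

filtered$first   # rows of dt with a equal to 2, 3 and 5
filtered$second  # the single row of dt2 with a equal to 2
```

Changing the filtering strategy then only requires reassigning i1 and rerunning the lapply() call.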

1 Comment

and also: dt[eval(i1)]
