Combining vector and binary search in data.table

Question

Sometimes, I have a keyed data.table which I'd like to subset according to its key and an unkeyed column. What's the simplest/fastest way to do this?

What feels most natural is an error:

library(data.table)
dt <- data.table(id = 1:100, var = rnorm(100), key = "id")
dt[.(seq(1, 100, 2)) & var > 0, ]

The next cleanest thing is to chain:

dt[.(seq(1, 100, 2))][var > 0, ]

And of course we can ditch binary search altogether (I think this is clearly to be avoided):

dt[id %in% seq(1, 100, 2) & var > 0, ]

Is there an approach I'm missing? Also, any particular reason why the first is an error? The syntax seems clear enough to me.

I'm betting on the "clean" chain. If your second condition is an inequality, I doubt the current system of indexing can help. There is "auto indexing" on equality conditions now, but I'm not sure about the details. It's mentioned in the news: github.com/Rdatatable/data.table. If you need to do a by=.EACHI with your subset, you'll have to switch the chain around, I guess: dt[var>2][.(seq(1,100,2)),...do stuff...,by=.EACHI]
– Frank, Commented May 14, 2015 at 1:09
So it seems like the answer really depends on what I want to do in j. Is that safe to say?
– MichaelChirico, Commented May 14, 2015 at 4:26
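The reversed chain from the first comment can be sketched as follows. This is only an illustration under assumptions not in the thread: the filter is var > 0 (as in the question, rather than the var>2 of the comment), .N stands in for "...do stuff...", on = "id" is spelled out so the join does not depend on the key surviving the first subset, and the names odd_ids and counts are hypothetical.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the run is reproducible
dt <- data.table(id = 1:100, var = rnorm(100), key = "id")
odd_ids <- seq(1, 100, 2)  # hypothetical name for the lookup values

# Filter on the unkeyed column first, then join on id and evaluate j
# once per lookup value. .N counts the surviving rows for each id.
counts <- dt[var > 0][.(odd_ids), .N, by = .EACHI, on = "id"]

# One result row per lookup value; the total count should agree with
# the plain vector-scan subset from the question.
nrow(counts)
sum(counts$N) == nrow(dt[id %in% odd_ids & var > 0])
```

Since the filter runs before the join here, every lookup value still gets a row in the result, including ids whose var failed the condition.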

MichaelChirico · Accepted Answer · 2015-12-21 02:45:16Z

As of this writing, the native way to get what

dt[.(seq(1, 100, 2)) & var > 0, j] # some expression j

is meant to do is the following:

dt[.(seq(1, 100, 2)), .SD[var > 0, j]]

The more I work with data.table, the more natural this is, but it still looks a bit unintuitive. C'est la vie.
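To check that the .SD form selects the same rows as the chain and the vector scan from the question, a minimal sketch might look like this. The seed and the names odd_ids, via_sd, via_chain and via_scan are assumptions for illustration, and j is left out so that each call simply returns the matching rows.

```r
library(data.table)

set.seed(1)  # assumption: fixed seed so the comparison is reproducible
dt <- data.table(id = 1:100, var = rnorm(100), key = "id")
odd_ids <- seq(1, 100, 2)  # hypothetical name for the lookup values

# 1. Binary search on the key, then filter the matched rows inside j
via_sd <- dt[.(odd_ids), .SD[var > 0]]

# 2. Binary search on the key, then a chained vector-scan filter
via_chain <- dt[.(odd_ids)][var > 0]

# 3. Vector scan on both conditions, no binary search
via_scan <- dt[id %in% odd_ids & var > 0]

# All three should select the same rows in the same order
isTRUE(all.equal(via_sd$id, via_chain$id))
isTRUE(all.equal(via_sd$id, via_scan$id))
isTRUE(all.equal(via_sd$var, via_scan$var))
```

The three forms differ in how they reach the rows, not in which rows they return, so the choice mostly comes down to readability and to what needs to happen in j.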
