3

Suppose I have the following data and a vector of column names

dt <- data.table(id = letters[1:10], amount = 1:10, id2 = c(rep('a',5),rep('b',5)),test=rep('a',10))
cols <- c('id','id2','test')

what I'm trying to do is select rows where all columns in the vector have the same specific value like

dt[id=='a' & id2=='a' & test == 'a']

but using the vector cols. Is there a way to do it?

Note: I need to find a way to do it using data.table or base R without making comparisons between the columns like

dt[id==id2 & id==test & id2==test]
2
  • Does dt need to be a data.table or can we use dataframe as well? Commented Aug 3, 2021 at 13:19
  • @elielink need to be a data.table Commented Aug 3, 2021 at 13:19

5 Answers 5

3

You can take help of .SDcols -

library(data.table)

dt[dt[, rowSums(.SD == 'a') == length(cols), .SDcols = cols]]

#   id amount id2 test
#1:  a      1   a    a

This can also be written as -

dt[rowSums(dt[, ..cols] == 'a') == length(cols), ]
Sign up to request clarification or add additional context in comments.

Comments

2

With dplyr and purrr, you could do:

dt %>%
    filter(reduce(.x = across(all_of(cols), ~ . == "a"), .f = `&`))

   id amount id2 test
1:  a      1   a    a

Or the same idea using data.table:

dt[dt[, Reduce(`&`, lapply(.SD, function(x) x == "a")), .SDcols = cols]]

2 Comments

thanks for the answer, but I forgot to write that I'm looking for a solution using data.table or base Rto not mix packages
Updated it with the data.table equivalent :)
1

Here's a data.table-friendly solution that is not looking for the literal "a":

This helper-function simply takes the table of its arguments, and returns TRUE if the number of distinct elements is other than "1". (I test for 2L because I use useNA="always", which will always append an NA count even if 0.)

allsame <- function(x) do.call(mapply, c(list(FUN = function(...) length(table(unlist(list(...)), useNA = "always")) == 2L), x))

From here, you can use it externally or internally:

dt[allsame(dt[,..cols]),]
#        id amount    id2   test
#    <char>  <int> <char> <char>
# 1:      a      1      a      a

It's in fact not data.table-specific, it'll work with anything as long as its argument is a data.frame-like object with all of the comments you need compared.

mtcars[1:3,10:11]
#               gear carb
# Mazda RX4        4    4
# Mazda RX4 Wag    4    4
# Datsun 710       4    1

allsame(mtcars[1:3,10:11])
# [1]  TRUE  TRUE FALSE

Comments

1

Another option is using a join:

dt[as.list(rep('a', length(cols))), on=cols]

Comments

0

Using if_all

library(dplyr)
dt %>%
    filter(if_all(all_of(cols), ~ . == 'a'))
   id amount id2 test
1:  a      1   a    a

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.