Conditional filtering using tidyverse

Question

I want to filter my data frame based on a variable that may or may not exist. The expected output is either a filtered df (if the filter variable is present) or the original, unfiltered df (if the variable is missing).

Here is a minimal example:

library(tidyverse)
df1 <- tribble(
  ~a, ~b,
  1L, "a",
  0L, "a",
  0L, "b",
  1L, "b"
)
df2 <- select(df1, b)

Filtering on df1 returns the required result, a filtered tibble.

filter(df1, a == 1)
# A tibble: 2 x 2
      a     b
  <int> <chr>
1     1     a
2     1     b

But the second one throws an error, as expected, because the variable is not in the df.

filter(df2, a == 1)
Error in filter_impl(.data, quo) : 
  Evaluation error: object 'a' not found.

I tried filter_at, which seemed like the obvious choice, but it throws an error if no variable matches the predicate.

filter_at(df2, vars(matches("a")), any_vars(. == 1L))    
Error: `.predicate` has no matching columns

So, my question is: is there a way to create a conditional filtering that produces the expected outcome, preferably within the tidyverse?

I think this Q&A stackoverflow.com/questions/44001722/… should answer your question
– talat, Commented Sep 12, 2017 at 14:07
For example (as in the linked Q), you can do stuff like df2 %>% filter(if("a" %in% names(.)) a == 1 else TRUE) or df2 %>% {if("a" %in% names(.)) filter(., a == 1) else .}
– talat, Commented Sep 12, 2017 at 14:09
@JanvanderLaan OP wants the original data back if the variable doesn't exist ("the original, unfiltered df (if the variable is missing)").
– Spacedman, Commented Sep 12, 2017 at 14:19
Can't you just wrap it in a try or tryCatch then? Or do you want to be able to use it in pipe chains?
– moodymudskipper, Commented Sep 12, 2017 at 15:06
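
The try/tryCatch idea from the last comment could be sketched roughly as follows. This is only an illustration: filter_or_keep is a made-up helper name, and the tibbles are rebuilt here so the snippet runs on its own.

```r
library(dplyr)

df1 <- tibble(a = c(1L, 0L, 0L, 1L), b = c("a", "a", "b", "b"))
df2 <- select(df1, b)

# Hypothetical helper (not a dplyr function): attempt the filter and
# fall back to the unfiltered input if filter() raises any error.
filter_or_keep <- function(data) {
  tryCatch(
    filter(data, a == 1),
    error = function(e) data
  )
}

res1 <- filter_or_keep(df1)  # filter succeeds: rows where a == 1
res2 <- filter_or_keep(df2)  # filter errors: df2 is returned unchanged
```

Note that this swallows every error, not only the "object 'a' not found" one, and that it silently does something else if an object called a happens to exist in the calling environment. An explicit check on the column names avoids both problems.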

Tamas Nagy · Accepted Answer · 2017-09-13 10:16:02Z

14

As @docendo-discimus pointed out in the comments, the following solutions work. I used rlang::has_name instead of "a" %in% names(.).

This Q&A contains the original idea: Conditionally apply pipeline step depending on external value.

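# has_name() takes the data (.) as its first argument and the name as its second
df1 %>% 
   filter(if(rlang::has_name(., "a")) a == 1 else TRUE)
# A tibble: 2 x 2
      a     b
  <int> <chr>
1     1     a
2     1     b

df2 %>% 
   filter(if(rlang::has_name(., "a")) a == 1 else TRUE)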
# A tibble: 4 x 1
      b
  <chr>
1     a
2     a
3     b
4     b

Or alternatively, by using {}:

df1 %>%
  {if(rlang::has_name(., "a")) filter(., a == 1L) else .}
# A tibble: 2 x 2
      a     b
  <int> <chr>
1     1     a
2     1     b

df2 %>%
  {if(rlang::has_name(., "a")) filter(., a == 1L) else .}
# A tibble: 4 x 1
      b
  <chr>
1     a
2     a
3     b
4     b
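
If the same check is needed in several places, the pattern above can be wrapped in a small function. The following is a minimal sketch, not something from dplyr itself: filter_if_present and its arguments are hypothetical names, and it assumes a simple equality test on a single column.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): keep rows where `col == value`
# if `col` exists in `data`; otherwise return `data` unchanged.
filter_if_present <- function(data, col, value) {
  if (col %in% names(data)) {
    # .data[[col]] looks the column up by its string name;
    # !!value inlines the argument so it cannot clash with a column name
    filter(data, .data[[col]] == !!value)
  } else {
    data
  }
}

df1 <- tibble(a = c(1L, 0L, 0L, 1L), b = c("a", "a", "b", "b"))
df2 <- select(df1, b)

out1 <- df1 %>% filter_if_present("a", 1L)  # rows where a == 1
out2 <- df2 %>% filter_if_present("a", 1L)  # no column a: df2 unchanged
```

Because the data frame is the first argument, the helper drops into a pipe chain like any other dplyr verb.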

edited Sep 13, 2017 at 10:16

answered Sep 12, 2017 at 20:45


Roman · Answer · 2017-09-12 15:08:23Z

2

Something like this?

# function for expected output:
# filter x on column y == 1 if y exists, otherwise return x unchanged
foo <- function(x, y){
  if(y %in% colnames(x)){
    # .data[[y]] refers to the column whose name is stored in y
    filter(x, .data[[y]] == 1)
  }else{
    x
  }
}

# run the functions
foo(df1, "a")
foo(df2, "a")
# or

df1 %>% foo("a")
# A tibble: 2 x 2
      a     b
  <int> <chr>
1     1     a
2     1     b

df2 %>% foo("a")
# A tibble: 4 x 1
      b
  <chr>
1     a
2     a
3     b
4     b

answered Sep 12, 2017 at 15:08


1 Comment

Tamas Nagy Over a year ago

Thanks, it is a fine answer, but I was looking for a tidyverse solution.
