I'm looking for a solution in dplyr for the task of selecting columns of a dataframe based on multiple conditions. Say, we have this type of df:
X <- c("B", "C", "D", "E")
a1 <- c(1, 0, 3, 0)
a2 <- c(235, 270, 100, 1)
a3 <- c(3, 1000, 900, 2)
df1 <- data.frame(X, a1, a2, a3)
Let's further assume I want to select the column(s) that meet both of these conditions:
- (i) the column is numeric
- (ii) all of its values are smaller than 5
That is, in this case, the column we want to select is a1. How can this be done in dplyr? My understanding is that to select a column in dplyr you use select and, if the selection is governed by a condition, also where. But how do I combine two such where() conditions in a single select()? This, for example, is not the right way to do it, as it throws an error:
df1 %>%
select(where(is.numeric) & where(~ all(.) < 5))
Error: `where()` must be used with functions that return `TRUE` or `FALSE`.
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
In all(.) : coercing argument of type 'character' to logical
df1 %>% select(where(\(.) is.numeric(.) & all(.) < 5)) removes the error, but gives the wrong answer. :=( I was on the right line, but @benson23 beat me to the answer.
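One possible way to combine the two conditions is to put both tests inside a single predicate passed to where(). The sketch below is not necessarily the answer the thread settled on; it assumes dplyr 1.0.0 or later (for where()) and rebuilds df1 from the question. Two details matter: the comparison goes inside all(), because all(.) < 5 compares the single TRUE/FALSE result of all() with 5 rather than testing the values themselves, and && is used instead of & so that the second test is skipped for non-numeric columns.

```r
library(dplyr)

# Same data as in the question
df1 <- data.frame(
  X  = c("B", "C", "D", "E"),
  a1 = c(1, 0, 3, 0),
  a2 = c(235, 270, 100, 1),
  a3 = c(3, 1000, 900, 2)
)

# One predicate per column: it must return a single TRUE or FALSE.
# && short-circuits, so all(x < 5) is only evaluated for numeric columns.
result <- df1 %>%
  select(where(function(x) is.numeric(x) && all(x < 5)))

names(result)
```

With the data above, only a1 should remain in result, since a2 and a3 each contain values of 5 or more and X is not numeric.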