How can I filter multiple columns with dplyr using string matching for the column name?

Question

I'm curious about what would be a more succinct way to achieve this:

nyc_crashes %>% filter(
  `NUMBER OF PERSONS INJURED`     >= 1 |
  `NUMBER OF PERSONS KILLED`      >= 1 |
  `NUMBER OF PEDESTRIANS INJURED` >= 1 |
  `NUMBER OF PEDESTRIANS KILLED`  >= 1 |
  `NUMBER OF CYCLIST INJURED`     >= 1 |
  `NUMBER OF CYCLIST KILLED`      >= 1 |
  `NUMBER OF MOTORIST INJURED`    >= 1 |
  `NUMBER OF MOTORIST KILLED`     >= 1
)

Could I use string matching for "INJURED | KILLED" and not have to write a different condition for each column name?

These might help: stackoverflow.com/a/60270550/786542 and stackoverflow.com/a/60269597/786542
– Tung, commented May 17, 2020 at 16:56

Jonathan V. Solórzano · Accepted Answer · 2020-05-17 18:00:04Z

You could do that using filter_at with ends_with.

library(dplyr)

nyc_crashes %>%
  filter_at(
    # Select columns whose names end with KILLED or INJURED
    vars(c(ends_with("KILLED"), ends_with("INJURED"))),
    # Keep rows where any of these columns is >= 1
    any_vars(. >= 1)
  )
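
In dplyr 1.0.4 and later, the scoped verbs such as filter_at() are superseded, and the same filter can be written with if_any(). The sketch below is one possible equivalent, not part of the original answer. It uses matches() with the "INJURED|KILLED" pattern from the question, and a small made-up tibble (crashes, with a hypothetical BOROUGH column) stands in for nyc_crashes so that it runs on its own.

```r
library(dplyr)

# Hypothetical stand-in for nyc_crashes; only the column naming follows the question
crashes <- tibble(
  `NUMBER OF PERSONS INJURED` = c(0, 2, 0),
  `NUMBER OF PERSONS KILLED`  = c(0, 0, 0),
  `NUMBER OF CYCLIST INJURED` = c(0, 0, 1),
  BOROUGH                     = c("BRONX", "QUEENS", "BROOKLYN")
)

# Keep rows where any column whose name matches INJURED or KILLED is >= 1
result <- crashes %>%
  filter(if_any(matches("INJURED|KILLED"), ~ .x >= 1))
```

Because matches() takes a regular expression, a single pattern covers both suffixes and no separate ends_with() calls are needed.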

edited May 17, 2020 at 18:00

answered May 17, 2020 at 16:33


2 Comments

sdS Over a year ago

Is there a way to do the same thing but keep rows where a variable takes specific values? For instance, instead of just >= 1, rows where it equals 1, 3, or 99?

Jonathan V. Solórzano Over a year ago

You could use any_vars(. %in% c(1, 3, 99))
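
For context, here is a minimal sketch of how that predicate fits into a full filter_at() call. The crashes tibble and its id column are made up for illustration, and the column selection uses matches() with the pattern from the question.

```r
library(dplyr)

# Hypothetical data; only the column naming follows the question
crashes <- tibble(
  id                          = 1:3,
  `NUMBER OF PERSONS INJURED` = c(0, 3, 2),
  `NUMBER OF PERSONS KILLED`  = c(1, 0, 0)
)

# Keep rows where any INJURED/KILLED column equals 1, 3, or 99
result <- crashes %>%
  filter_at(
    vars(matches("INJURED|KILLED")),
    any_vars(. %in% c(1, 3, 99))
  )
```

The third row is dropped because neither of its values (2 and 0) is in the set.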
