0

I would like to filter rows based on two columns: one column is the ID, and the other is a string of IDs (collapsed with ",") that should be kept.

Example:


library(dplyr)

mtcars2 <- mtcars%>%
  mutate(carb_l=letters[carb], # This is the ID
         carb_list="c,f,h")%>% #IDs to keep
  select(-mpg,-cyl,-disp)# for clarity


head(mtcars2)
   hp drat    wt  qsec vs am gear carb carb_l carb_list
1 110 3.90 2.620 16.46  0  1    4    4      d     c,f,h
2 110 3.90 2.875 17.02  0  1    4    4      d     c,f,h
3  93 3.85 2.320 18.61  1  1    4    1      a     c,f,h
4 110 3.08 3.215 19.44  1  0    3    1      a     c,f,h
5 175 3.15 3.440 17.02  0  0    3    2      b     c,f,h
6 105 2.76 3.460 20.22  1  0    3    1      a     c,f,h


Expected output:

> mtcars2%>%filter((carb_l %in%c("c","f","h")))
   hp drat   wt qsec vs am gear carb carb_l carb_list
1 180 3.07 4.07 17.4  0  0    3    3      c     c,f,h
2 180 3.07 3.73 17.6  0  0    3    3      c     c,f,h
3 180 3.07 3.78 18.0  0  0    3    3      c     c,f,h
4 175 3.62 2.77 15.5  0  1    5    6      f     c,f,h
5 335 3.54 3.57 14.6  0  1    5    8      h     c,f,h
  • One way can be mtcars2 %>% rowwise() %>% filter(grepl(carb_l, carb_list)) Commented Nov 15, 2022 at 15:54
  • Use strsplit, as for example in mtcars2 %>% filter(carb_l %in% unlist(strsplit(carb_list, ','))). Commented Nov 15, 2022 at 15:55

2 Answers 2

3

You can use rowwise() and grepl(), i.e.

mtcars2 %>% 
 rowwise() %>% 
 filter(grepl(carb_l, carb_list))

# A tibble: 5 × 10
# Rowwise: 
     hp  drat    wt  qsec    vs    am  gear  carb carb_l carb_list
  <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>  <chr>    
1   180  3.07  4.07  17.4     0     0     3     3 c      c,f,h    
2   180  3.07  3.73  17.6     0     0     3     3 c      c,f,h    
3   180  3.07  3.78  18       0     0     3     3 c      c,f,h    
4   175  3.62  2.77  15.5     0     1     5     6 f      c,f,h    
5   335  3.54  3.57  14.6     0     1     5     8 h      c,f,h
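
One caveat with the approach above: grepl() treats carb_l as a regular expression and matches substrings, so it is only safe while no ID can appear inside another ID (true for the single letters here). Below is a minimal sketch of an exact-match alternative that splits each row's list and compares whole elements. The helper name id_in_list and the small data frame are made up for illustration and are not part of the original answers.

```r
# Hypothetical helper (not from the original answers): returns TRUE where `id`
# is an exact element of the comma-separated string `id_list`, row by row.
id_in_list <- function(id, id_list) {
  mapply(
    function(one_id, one_list) {
      one_id %in% strsplit(one_list, ",", fixed = TRUE)[[1]]
    },
    id, id_list,
    USE.NAMES = FALSE
  )
}

# Made-up data: the last row would be a false positive with grepl(),
# because "c" is a substring of "abc".
df <- data.frame(
  id      = c("c", "d", "c"),
  id_list = c("c,f,h", "c,f,h", "abc,f"),
  stringsAsFactors = FALSE
)

keep <- id_in_list(df$id, df$id_list)
result <- df[keep, ]
print(result)
```

Because the helper is vectorised over both columns, it should also work inside dplyr without rowwise(), e.g. mtcars2 %>% filter(id_in_list(carb_l, carb_list)).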

1 Comment

Thank you! This answer also works when carb_list differs between rows. Example: after setting mtcars2[1,10] <- "a,b,d", this method also keeps the first row.
2
# Assumes every row has the same carb_list: only the first row's value is split and used
mtcars2 %>% 
   filter(carb_l %in% strsplit(mtcars2$carb_list[1], ",")[[1]])
               hp drat   wt qsec vs am gear carb carb_l carb_list
Merc 450SE    180 3.07 4.07 17.4  0  0    3    3      c     c,f,h
Merc 450SL    180 3.07 3.73 17.6  0  0    3    3      c     c,f,h
Merc 450SLC   180 3.07 3.78 18.0  0  0    3    3      c     c,f,h
Ferrari Dino  175 3.62 2.77 15.5  0  1    5    6      f     c,f,h
Maserati Bora 335 3.54 3.57 14.6  0  1    5    8      h     c,f,h
