In my data.frame below, I want to subset all rows for which group == 0 and for which all three years (2017, 2018, and 2019) are present.

The desired output in the example below is rows 4, 5, and 6.

I have tried the following without success. Is there a quick fix in base R?

dat <- data.frame(group = c(0,0,1, 0,0,0, 1, 1, 1), 
       year = rep(2017:2019, 3))

subset(dat, group == 0 & year == 2017 & year == 2018 & year == 2019)

1 Answer

If the OP wants to treat each run of adjacent, identical 'group' values as a separate group:

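library(dplyr)
library(data.table)  # provides rleid()
dat %>%
   # one id per run of adjacent identical 'group' values
   group_by(grp = rleid(group)) %>%
   # keep runs that contain all three years and have group == 0
   filter(all(2017:2019 %in% year), group == 0) %>%
   ungroup() %>%
   select(-grp)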
# A tibble: 3 x 2
#  group  year
#  <dbl> <int>
#1     0  2017
#2     0  2018
#3     0  2019

Or in base R with rle:

grp <- with(rle(dat$group), rep(seq_along(values), lengths))
subset(dat, as.logical(ave(year,  grp, FUN = 
    function(x) all(2017:2019 %in% x)) ) & group == 0)
#  group year
#4     0 2017
#5     0 2018
#6     0 2019
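
For context, the attempt in the question returns zero rows because year == 2017 & year == 2018 & year == 2019 is tested row by row, and a single row can never hold three different years. The sketch below breaks the base R approach above into its intermediate steps; the names runs, has_all, and res are only illustrative and not part of the original answer.

```r
# Same example data as in the question
dat <- data.frame(group = c(0, 0, 1, 0, 0, 0, 1, 1, 1),
                  year = rep(2017:2019, 3))

# rle() collapses consecutive equal values into runs
runs <- rle(dat$group)
# runs$values is 0 1 0 1 and runs$lengths is 2 1 3 3

# One id per run, repeated so that it lines up with the rows of dat
grp <- rep(seq_along(runs$values), runs$lengths)
# grp is 1 1 2 3 3 3 4 4 4

# For each run, check whether it contains all three years;
# ave() broadcasts the per-run answer back to every row of that run
has_all <- as.logical(
  ave(dat$year, grp, FUN = function(x) all(2017:2019 %in% x))
)

# Keep rows from complete runs that also have group == 0
res <- dat[has_all & dat$group == 0, ]
res
```

Only the third run (rows 4 to 6) both has group == 0 and contains all three years, so those are the rows kept.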

3 Comments

@rnorouzian I was thinking that you wanted to reduce the number of times == is used.
@rnorouzian Do you need subset(dat, as.logical(ave(year, grp, FUN = function(x) all(c(2017,2019) %in% x)) ) & group == 0 & year %in% c(2017, 2019))?
@rnorouzian Or another option is to do the filter first, i.e. dat1 <- subset(dat, year %in% c(2017, 2019)), and then apply the code in the solution.
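
A minimal sketch of the "filter first" option from the last comment, assuming only the years 2017 and 2019 are of interest. The names wanted, grp1, keep, and res1 are hypothetical. Note that filtering first changes which rows are adjacent, so the runs are recomputed on the filtered data.

```r
# Same example data as in the question
dat <- data.frame(group = c(0, 0, 1, 0, 0, 0, 1, 1, 1),
                  year = rep(2017:2019, 3))

# Assumption: only these two years matter
wanted <- c(2017, 2019)

# Step 1: drop rows for other years
dat1 <- subset(dat, year %in% wanted)

# Step 2: recompute the run ids on the filtered data
grp1 <- with(rle(dat1$group), rep(seq_along(values), lengths))

# Step 3: keep group == 0 rows from runs that contain every wanted year
keep <- as.logical(
  ave(dat1$year, grp1, FUN = function(x) all(wanted %in% x))
) & dat1$group == 0

res1 <- dat1[keep, ]
res1
```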
