1

Hi I have something like:

df<-data.frame(group=c(1, 1, 1, 1, 2, 2, 2,3,3,3,4,4,4), 
               number=c(0,1,1,1,1,1,0,1,0,1,0,1,1))

I want my 'want' to count the times 'number'=1 repeats per 'group'. Then I want the max number of times it repeats (want2). I want the 'want' to reset to 1 when 'number'=0:

dfwant<-data.frame(group=c(1, 1, 1, 1, 2, 2, 2,3,3,3,4,4,4), 
               number=c(0,1,1,1,1,1,0,1,0,1,0,1,1),
               want=c(1,1,2,3,1,2,1,1,1,1,1,1,2),
               want2=c(3,3,3,3,2,2,2,1,1,1,2,2,2))

Thank you!

3 Answers 3

3

Since we want to reset when number = 0, we group_by group and create a new group whenever the number is 0. We then calculate want by taking cumsum of occurrence of 1 within each group and want2 by taking max in group.

library(dplyr)

df %>%
  group_by(group, group1 = cumsum(number == 0)) %>%
  mutate(want = cumsum(number == 1), 
         want = replace(want, number == 0, 1)) %>%
  group_by(group) %>%
  mutate(want1 = max(want)) %>%
  select(-group1)

#   group number  want want1
#   <dbl>  <dbl> <dbl> <dbl>
# 1     1      0     1     3
# 2     1      1     1     3
# 3     1      1     2     3
# 4     1      1     3     3
# 5     2      1     1     2
# 6     2      1     2     2
# 7     2      0     1     2
# 8     3      1     1     1
# 9     3      0     1     1
#10     3      1     1     1
#11     4      0     1     2
#12     4      1     1     2
#13     4      1     2     2
Sign up to request clarification or add additional context in comments.

1 Comment

You could create an column number ==0 and reuse it
2

We can do this easily with rleid and rowid from data.table

library(dplyr)
library(data.table)
df %>%
   group_by(group) %>%
   mutate(want = rowid(rleid(number)), want1 = max(want))
# A tibble: 13 x 4
# Groups:   group [4]
#   group number  want want1
#   <dbl>  <dbl> <int> <int>
# 1     1      0     1     3
# 2     1      1     1     3
# 3     1      1     2     3
# 4     1      1     3     3
# 5     2      1     1     2
# 6     2      1     2     2
# 7     2      0     1     2
# 8     3      1     1     1
# 9     3      0     1     1
#10     3      1     1     1
#11     4      0     1     2
#12     4      1     1     2
#13     4      1     2     2

Or using the data.table syntax

library(data.table)
setDT(df)[, want := rowid(rleid(number)),.(group)][, want1 := max(want), group][]

Comments

1

In base you can use ave for getting the cumsum per group like:

df$want  <- pmax(1, ave(df$number, df$group, cumsum(df$number==0), FUN=cumsum))
df$want2 <- ave(df$want, df$group, FUN=max)
identical(df, dfwant)
#[1] TRUE

To reset when df$number==0 you can use cumsum(df$number==0) as an additional grouping, as already shown by @Ronak-Shah, for ave.

In case df$number holds also other values than 0 and 1 you have to use:

df$want  <- pmax(1, ave(df$number==1, df$group, cumsum(df$number==0)
    , FUN=cumsum))

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.