I have a simple dataset with an id variable and date variable, and would like to create a counter variable (counter) that increments whenever date changes within the id variable. Assume the data is sorted by id and date, and that a specific date may appear any number of times within an id. This is very easily done in other languages (SAS with retain or Stata with by: and _n/_N), but I haven't found a very efficient way in R.
2 Answers
We can try a grouped cumsum with dplyr:
library(dplyr)
df1 %>%
  group_by(id) %>%
  # flag the first row of each id and every row whose date differs from the previous row
  mutate(counter = cumsum(c(TRUE, date[-1] != date[-n()])))
#      id  date counter
#   (dbl) (chr)   (int)
# 1     1     a       1
# 2     1     a       1
# 3     1     b       2
# 4     1     b       2
# 5     2     a       1
# 6     2     a       1
# 7     2     b       2
Data:
df1 <- data.frame(id = rep(c(1, 2), c(4, 3)),
                  date = c('a', 'a', 'b', 'b', 'a', 'a', 'b'),
                  stringsAsFactors = FALSE)
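To see why the cumsum expression works, here is a small base R walk-through on the dates of id 1 alone. The names d, changed and counter are only illustrative, and length(d) stands in for dplyr's n():

```r
# Dates for id 1 from the example data (the name `d` is illustrative)
d <- c('a', 'a', 'b', 'b')

# Compare each element with the one before it;
# the first row has no predecessor, so it is always flagged
changed <- c(TRUE, d[-1] != d[-length(d)])
changed
# TRUE FALSE TRUE FALSE

# The running sum of the flags is the counter
counter <- cumsum(changed)
counter
# 1 1 2 2
```

Inside group_by(id), the same computation is simply repeated for each id, so the counter restarts at 1 for every new id.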
1 Comment
C. Johnson
Exactly what I needed. Thanks!

How about as.numeric(factor(df1$date, unique(df1$date))), applied by id?
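A minimal base R sketch of that idea, assuming the data is sorted by id and date as stated in the question (so each date forms one consecutive run within an id). It uses ave() over row indices rather than over the character column, because ave() keeps the type of its first argument and would otherwise return the counter as character:

```r
df1 <- data.frame(id = rep(c(1, 2), c(4, 3)),
                  date = c('a', 'a', 'b', 'b', 'a', 'a', 'b'),
                  stringsAsFactors = FALSE)

# For each id, number the dates in order of first appearance.
# ave() is given integer row indices so the result stays integer.
df1$counter <- ave(seq_len(nrow(df1)), df1$id, FUN = function(i) {
  d <- df1$date[i]
  as.integer(factor(d, levels = unique(d)))
})

df1$counter
# 1 1 2 2 1 1 2
```

Unlike the cumsum version, this would give a repeated date the same number if it reappeared later within an id, so the two only agree when the sort assumption holds.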