Creating sequence of dates in R by group, dependent on another variable

Question

Right now, my dataset is in wide format, meaning I have one row per person, but I want a long dataset, with multiple rows per person. I have two date variables, ADATE and DDATE, that I want to use as my start and end points, respectively. For example, if someone's ADATE is 02/04/10 and DDATE is 02/07/10, I want 4 rows:

Have:

ID ADATE     DDATE     
1  02/04/10  02/07/10

Want:

ID ADATE     DDATE     NEW_DATE
1  02/04/10  02/07/10  02/04/10
1  02/04/10  02/07/10  02/05/10
1  02/04/10  02/07/10  02/06/10
1  02/04/10  02/07/10  02/07/10

I have multiple datasets that I want to do this for, and I have written code that works for every dataset except one, and I'm not sure why. This is my attempt and the error I get:

jan15_long <- chf_jan15 %>%
  mutate(NEW_DATE = as.Date(ADATE)) %>%
  group_by(ID) %>%
  complete(NEW_DATE = seq.Date(as.Date(ADATE), as.Date(DDATE), by = "day")) %>%
  fill(vars) %>%
  ungroup()

Error in seq.Date(as.Date(ADATE), as.Date(DDATE), by = "day") : 
  'from' must be of length 1

The above code gives me what I want and runs perfectly for every other dataset I have (10 out of 11).

Is there a better way to do this? dplyr makes the most sense to me, so hopefully there's a solution to this.

akrun · Accepted Answer · 2020-01-31 19:26:46Z


If there is more than one row, seq has to be called once per row, because it accepts only a single 'from' and a single 'to'. We can use map2 to loop over the rows. Also, based on the format of the 'DATE' columns, as.Date needs a format argument, i.e. as.Date(ADATE, "%m/%d/%y") (assuming month/day/year format).

library(dplyr)
library(purrr)
library(tidyr)      # for unnest()
library(lubridate)  # for mdy()
chf_jan15 %>%
    mutate_at(vars(ends_with("DATE")), mdy) %>%
    mutate(random_date = map2(ADATE, DDATE, seq, by = "day")) %>%
    unnest(c(random_date))
# A tibble: 4 x 4
#     ID ADATE      DDATE      random_date
#  <int> <date>     <date>     <date>     
#1     1 2010-02-04 2010-02-07 2010-02-04 
#2     1 2010-02-04 2010-02-07 2010-02-05 
#3     1 2010-02-04 2010-02-07 2010-02-06 
#4     1 2010-02-04 2010-02-07 2010-02-07
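
To see why the loop is needed, here is a minimal base R sketch. The two-row vectors `adate` and `ddate` are made up for illustration and are not the asker's data. It checks that seq errors when given whole columns, then uses Map as a base R stand-in for map2.

```r
# Hypothetical two-row example (not the asker's data)
adate <- as.Date(c("02/04/10", "03/01/10"), format = "%m/%d/%y")
ddate <- as.Date(c("02/07/10", "03/02/10"), format = "%m/%d/%y")

# seq() takes a single start and a single end, so passing whole columns errors
# (the question shows the message: 'from' must be of length 1)
vectorised_ok <- tryCatch({
  seq(adate, ddate, by = "day")
  TRUE
}, error = function(e) FALSE)

# Map() calls seq() once per (start, end) pair, much like map2() does
date_list <- Map(function(from, to) seq(from, to, by = "day"), adate, ddate)

lengths(date_list)  # number of days generated for each row
```

Each list element is a Date vector for one row, which is what unnest then spreads into separate rows.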

If there is only a single row, complete should work once the columns are converted to Date class:

library(tidyr)
chf_jan15 %>%
   mutate_at(vars(ends_with("DATE")), as.Date, format = "%m/%d/%y") %>%
   mutate(NEW_DATE = ADATE) %>%      
   complete(NEW_DATE = seq(ADATE, DDATE, by = 'day')) %>%
   fill(c(ID, ADATE, DDATE))
# A tibble: 4 x 4
#  NEW_DATE      ID ADATE      DDATE     
#  <date>     <int> <date>     <date>    
#1 2010-02-04     1 2010-02-04 2010-02-07
#2 2010-02-05     1 2010-02-04 2010-02-07
#3 2010-02-06     1 2010-02-04 2010-02-07
#4 2010-02-07     1 2010-02-04 2010-02-07

If there is a single row for each 'ID', then we can group_split by 'ID' and use complete on each piece:

chf_jan15 %>%
    mutate_at(vars(ends_with("DATE")), as.Date, format = "%m/%d/%y") %>%
    mutate(NEW_DATE = ADATE) %>%
    group_split(ID) %>%
    map_dfr(~ .x %>%
                complete(NEW_DATE = seq(ADATE, DDATE, by = 'day')) %>%
                fill(c(ID, ADATE, DDATE)))
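
If you prefer to stay within dplyr and tidyr, one possible alternative is rowwise(), which also evaluates seq one row at a time. This is only a sketch: it assumes dplyr 1.0.0 or later (for across and rowwise list-columns), and the two-row `visits` data frame is hypothetical, not the asker's data.

```r
library(dplyr)
library(tidyr)

# Hypothetical two-row data frame with the same column layout as the question
visits <- data.frame(
  ID    = c(1L, 2L),
  ADATE = c("02/04/10", "03/01/10"),
  DDATE = c("02/07/10", "03/02/10")
)

visits_long <- visits %>%
  # parse the month/day/year strings into Date class
  mutate(across(ends_with("DATE"), ~ as.Date(.x, format = "%m/%d/%y"))) %>%
  # evaluate the next mutate() one row at a time
  rowwise() %>%
  mutate(NEW_DATE = list(seq(ADATE, DDATE, by = "day"))) %>%
  ungroup() %>%
  # one output row per generated date
  unnest(NEW_DATE)
```

Here ID 1 should expand to four rows and ID 2 to two, with ID, ADATE and DDATE repeated on each row, so no fill step is needed.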

data

chf_jan15 <- structure(list(ID = 1L, ADATE = "02/04/10", 
    DDATE = "02/07/10"), class = "data.frame", row.names = c(NA, 
-1L))
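
The date columns in this sample are character strings, which is why the format matters. A small base R check, using the same string as ADATE above (the variable names are only for illustration):

```r
# Same string as ADATE in the sample data
x <- "02/04/10"

# Explicit month/day/2-digit-year format
with_format <- as.Date(x, format = "%m/%d/%y")
format(with_format, "%Y-%m-%d")

# Without a format, as.Date() tries year-first layouts such as "%Y/%m/%d",
# so the leading "02" is not treated as the month; depending on the input
# this gives a different date or an error (caught here as NA)
without_format <- tryCatch(as.Date(x), error = function(e) NA)

identical(with_format, without_format)
```

The two results should differ, so it is safer to pass the format explicitly (or use lubridate's mdy, as in the first chunk).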

edited Jan 31, 2020 at 19:26

answered Jan 31, 2020 at 18:58

akrun


10 Comments

user122514 Over a year ago

Hi akrun. I have to do this within each group. Should I add a group_by before mutate_at? Thanks!

akrun Over a year ago

@user122514 It is not needed because map2 is looping over each row

user122514 Over a year ago

Hi akrun. The first chunk of code works, but the second one doesn't. I'm still getting an error saying that 'from' must be of length 1.

akrun Over a year ago

@user122514 Sorry, it needs a group_by before complete. I updated the code, please check.

user122514 Over a year ago

Thanks! I will try to read up on map2 as I don't really use the purrr package.
