I have the following R data frame:

ID     Completed       Days
001    Yes             65
002    No              NA
003    Yes             120
004    Yes             22

I would like to create the following data set:

ID     Month           Success          DaysAtSuccess
001    1               No                 NA
002    1               No                 NA
003    1               No                 NA 
004    1               Yes                22
001    2               No                 NA
002    2               No                 NA
003    2               No                 NA 
004    2               Yes                22
001    3               Yes                65
002    3               No                 NA
003    3               No                 NA 
004    3               Yes                22
001    4               Yes                65
002    4               No                 NA
003    4               Yes               120
004    4               Yes                22

The idea is for the 'Month' column to count 30-day periods: Month = 1 covers days 0-30, Month = 2 covers days 31-60, and so on. Every ID should get one row per month. Success should be 'Yes', and DaysAtSuccess should equal the value in the Days column, from the month containing that day onward (that is, once Days is no greater than the month's upper bound); before that, Success is 'No' and DaysAtSuccess is NA. I have been trying to build this data set with mutate (dplyr) and ifelse, but so far no luck. Any insight would be appreciated.

Edit:

Using the following code, I have been able to generate a 'Month' column:

df$Month <- ceiling(df$Days / 30)

Which generates the following data set:

   ID  Completed Days Month
  001        Yes   65     3
  002         No   NA    NA
  003        Yes  120     4
  004        Yes   22     1
  • What is .? Is it NA? Commented Jun 7, 2021 at 17:40
  • I don't know if I get what you want, but maybe df$DaysAtSuccess <- df$Days*(df$Success == "Yes")*(ceiling(df$Days/30) == df$month)? Commented Jun 7, 2021 at 17:55
  • @PedroAlencar Didn't seem to work, I believe the code won't work as the column df$month is not in the original data frame. Commented Jun 7, 2021 at 18:40
  • Hi @statsguyz, could you create the month column in the original dataframe with df$month <- ceiling(df$Days/30)? Commented Jun 7, 2021 at 18:46
  • @PedroAlencar Yes, that worked to generate a 'Month' column. I will add this to the original question. Commented Jun 7, 2021 at 18:59

1 Answer

Using your data

tibble::tribble(
  ~ID, ~Completed, ~Days,
  "001", "Yes",      65,
  "002", "No",       NA,
  "003", "Yes",      120,
  "004", "Yes",      22
) -> your_data

I would do

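library(tidyverse)

your_data %>%
  # for each row (ID), build one tibble per 30-day month boundary
  pmap(\(ID, Completed, Days) {
    # `.` is your_data here; boundaries run 30, 60, ... up to the largest Days
    map(seq(30, max(.$Days, na.rm = TRUE), 30),
        ~ tibble(ID = ID,
                 Month = .x / 30,
                 # success once the completion day is at or below the boundary
                 Success = ifelse(!is.na(Days) & Days <= .x, "Yes", "No"),
                 DaysAtSuccess = ifelse(Days <= .x, Days, NA)))
  }) %>%
  bind_rows() %>%
  arrange(Month)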

returning

# A tibble: 16 x 4
   ID    Month Success DaysAtSuccess
   <chr> <dbl> <chr>           <dbl>
 1 001       1 No                 NA
 2 002       1 No                 NA
 3 003       1 No                 NA
 4 004       1 Yes                22
 5 001       2 No                 NA
 6 002       2 No                 NA
 7 003       2 No                 NA
 8 004       2 Yes                22
 9 001       3 Yes                65
10 002       3 No                 NA
11 003       3 No                 NA
12 004       3 Yes                22
13 001       4 Yes                65
14 002       4 No                 NA
15 003       4 Yes               120
16 004       4 Yes                22
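
As an alternative, the same table can probably be built without nested map calls by crossing every row with the month numbers and then using vectorised comparisons. This is a minimal sketch and not the code that produced the output above: it swaps in tidyr::crossing, and the names n_months, reached and result are made up for illustration. It also rounds the number of months up with ceiling, because seq(30, max, 30) stops at the last multiple of 30 at or below max and would drop the final month if the largest Days value were, say, 125.

```r
library(dplyr)
library(tidyr)

your_data <- tibble::tribble(
  ~ID,   ~Completed, ~Days,
  "001", "Yes",      65,
  "002", "No",       NA,
  "003", "Yes",      120,
  "004", "Yes",      22
)

# Number of 30-day months needed to cover the largest Days value
# (n_months is an illustrative name, not part of the original answer)
n_months <- ceiling(max(your_data$Days, na.rm = TRUE) / 30)

result <- your_data %>%
  # one row per ID/Month combination
  crossing(Month = seq_len(n_months)) %>%
  mutate(
    # TRUE once the completion day is at or below the month's upper bound
    reached = !is.na(Days) & Days <= Month * 30,
    Success = if_else(reached, "Yes", "No"),
    DaysAtSuccess = if_else(reached, Days, NA_real_)
  ) %>%
  select(ID, Month, Success, DaysAtSuccess) %>%
  arrange(Month, ID)

print(result)
```

Computing the reached flag once keeps Success and DaysAtSuccess consistent with each other, and if_else with NA_real_ keeps DaysAtSuccess a numeric column.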