I have this df with 5 columns: Name, screen_date, enroll_date, screen2enroll_days, and enrollment_type

Name  screen_date  enroll_date  screen2enroll_days  enrollment_type
John  2020-08-20   2020-08-01   14                  TypeX
Mike  2020-08-20   2020-08-01   14                  TypeY
Sam   2020-10-20   2020-08-05   65                  TypeY
Dan   2020-11-05   2020-08-05   90                  TypeX

df <-
  data.frame(
    "Name" = c("John", "Mike", "Sam", "Dan"),
    "screen_date" = c("2020-08-01", "2020-08-20", "2020-10-20", "2020-11-05"),
    "enroll_date" = c("2020-08-01", "2020-08-01", "2020-08-05", "2020-08-05"),
    "screen2enroll_days" = c(14, 14, 65, 90),
    "enrollment_type" = c("TypeX", "TypeY", "TypeY", "TypeX")
  )

I want to write a function that takes one or more of my columns and creates a new column called Action, which uses the column screen2enroll_days to identify whether a client needs a screening test, but I ran into errors. This is the output I am after, followed by my attempt:

Name  screen_date  enroll_date  screen2enroll_days  Action (new_col)
John  2020-08-14   2020-08-01   14                  Up-to-date
Sam   2020-10-20   2020-08-05   65                  Requires Screening
Dan   2020-11-05   2020-08-05   90                  No Screening Required
Mike  2020-08-20   2020-08-01   14                  No Screening Required

mutate_function <- function(df, new_col, my_col, my_col2, value1, value2, value3) {
    
    df %>% mutate(new_col = case_when(
             my_col <= 14 ~ "value1",
             my_col <= 14 & my_col2 != "TypeX" ~ "value2",
             (my_col > 14 & my_col <= 65) ~ "value3",
             TRUE ~ "value2")
    )}

mutate_function(df, Action, mycol = screen2enroll_days, my_col2 = enrollment_type, "Up-to-date", "Requires Screening", "No Screening Required")
  • What is your question? Commented Jul 20, 2021 at 17:53
  • and please share some data; use dput(yourdata) and paste the output in your edited question Commented Jul 20, 2021 at 18:46
  • @BillO'Brien: I added the info you requested. It was posted prematurely. Thanks! Commented Jul 20, 2021 at 20:33

1 Answer

I think there are a number of things to address to make this functional:

  • Inside a function, my_col and my_col2 are looked up literally, so mutate() searches df for columns named my_col and my_col2, which don't exist. To refer to the column the caller supplied, wrap the argument in double curly braces ({{...}}), which expects a bare (unquoted) column name. Alternatively, if the column name is passed as a string, you can use the bang-bang operator with sym: !!sym(). Or, try .data[[variable]] to reference the variable from the pipe.

  • If you want to set the new column values based on the value1, value2, or value3 arguments, leave off the quotes in your case_when statement. With the quotes, every row gets the literal text "value1", "value2", or "value3".

  • To set the name of the new column dynamically, use the := operator instead of =, with {{new_col}} on the left-hand side.

  • When calling the function, double check your argument names (such as mycol vs. my_col - note underscore).

  • Finally, you may want to double check your case_when logic. I believe the second line will never be reached, because case_when uses the first condition that matches, so every row where my_col is <= 14 is already assigned value1.
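
The last point can be checked in isolation. The following is a minimal sketch on plain vectors that mirror the example data (days and type are illustrative names, not part of the original code). The reordered version assumes the narrower condition was meant to take precedence, which the question does not confirm.

```r
library(dplyr)

# Illustrative vectors mirroring screen2enroll_days and enrollment_type
days <- c(14, 14, 65, 90)
type <- c("TypeX", "TypeY", "TypeY", "TypeX")

# Same order as in the question: the broad test comes first,
# so the second condition can never be the first match
original_order <- case_when(
  days <= 14 ~ "value1",
  days <= 14 & type != "TypeX" ~ "value2",
  days > 14 & days <= 65 ~ "value3",
  TRUE ~ "value2"
)

# Assumed intent: test the narrower condition before the broad one
narrow_first <- case_when(
  days <= 14 & type != "TypeX" ~ "value2",
  days <= 14 ~ "value1",
  days > 14 & days <= 65 ~ "value3",
  TRUE ~ "value2"
)

# TRUE only where the two orderings disagree
original_order != narrow_first
```

Only the row with 14 days and TypeY differs between the two orderings, because case_when() returns the value for the first condition that is TRUE.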


library(dplyr)

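mutate_function <- function(df, new_col, my_col, my_col2, value1, value2, value3) {
  # {{ }} forwards the columns supplied by the caller; := names the new column
  df %>% mutate({{new_col}} := case_when(
    {{my_col}} <= 14 ~ value1,
    # Never reached as written: rows with my_col <= 14 already matched above
    {{my_col}} <= 14 & {{my_col2}} != "TypeX" ~ value2,
    ({{my_col}} > 14 & {{my_col}} <= 65) ~ value3,
    TRUE ~ value2)
  )
}

# Pass the columns as bare names: a quoted "screen2enroll_days" would be
# compared with 14 as a plain string rather than looked up in df
mutate_function(df,
                new_col = Action,
                my_col = screen2enroll_days,
                my_col2 = enrollment_type,
                value1 = "Up-to-date",
                value2 = "Requires Screening",
                value3 = "No Screening Required")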