I want to create a new column in my data frame based on the values of another column that contains a set of strings. I want to change some of the strings and keep the others as is.
To keep things short, I want to do this using a vector of strings that specifies which strings I want to change and a vector of strings that I want to change the matches into.
I usually do this using mutate() and case_when() from the dplyr package. In the following code, I want to change Paul and Barbara to Anna and Fred respectively, while keeping the other names.
library(dplyr)
library(tibble)
a <- rep(c("Paul", "Barbara", "Joey", "Iris"), 3)
test <- enframe(a)
mutate(test,
  name2 = case_when(
    value == "Paul" ~ "Anna",
    value == "Barbara" ~ "Fred",
    TRUE ~ value
  )
)
Given that the real dataset is much longer, I would like to use vectors of strings as specified earlier. Using %in% b works to find the matching cells but using vector d to replace the hits throws an error:
b <- c("Paul", "Barbara")  # only Paul and Barbara need to change
d <- c("Anna", "Fred")     # they need to change to Anna and Fred
mutate(test,
  name2 = case_when(
    value %in% b ~ d,
    TRUE ~ value
  )
)
Error in `mutate()`:
! Problem while computing `name2 = case_when(value %in% b ~ d, TRUE ~ value)`.
Caused by error in `case_when()`:
! `value %in% b ~ d` must be length 12 or one, not 2.
Run `rlang::last_error()` to see where the error occurred.
I was hoping that when a value matched the second element of b, the second element of d would be used. Clearly it does not work that way, since value %in% b returns a vector of 12 TRUE/FALSE values rather than positions in b, but is there any way to work with vectors of strings like this?
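To make the idea concrete, here is a minimal sketch of the positional lookup I have in mind. It assumes match() is the right tool for finding each value's position in b, and that coalesce() can fall back to the original value where there is no match. The names idx and result are only for illustration, and I have not checked how this behaves on the real dataset.

```r
library(dplyr)
library(tibble)

a <- rep(c("Paul", "Barbara", "Joey", "Iris"), 3)
test <- enframe(a)

b <- c("Paul", "Barbara")  # names to change
d <- c("Anna", "Fred")     # replacements, in the same order as b

# match() returns, for each value, its position in b (NA if it is not in b),
# unlike %in%, which only returns TRUE/FALSE
idx <- match(test$value, b)

# d[match(value, b)] picks the replacement at the same position;
# coalesce() keeps the original value wherever the lookup gave NA
result <- mutate(test, name2 = coalesce(d[match(value, b)], value))
```

This relies on b and d being the same length and in the same order, so that position i in b corresponds to position i in d.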