I have a large data frame and want to create a new variable which depends on two other variables.
Here is a short example:
v1 <- rep(c(1:5),each=3)
v2 <- c('X','A','Y','X','Y','B','X','Y','C','X','Y','C','X','Y','A')
dat <- data.frame(v1,v2)
# create a new variable that holds whichever of A, B, or C appears in v2 within each v1 group
#desired output
v3 <- rep(c('A','B','C','C','A'),each=3)
data.frame(v1,v2,v3)
Any ideas on how to do this with short code?
I tried this, but it's far from the solution: it leaves too many missing values. :(
dat$v3[dat$v2 %in% c('A','B','C')] <- dat$v2[dat$v2 %in% c('A','B','C')]
with(dat, ave(v2, v1, FUN = function(i) tail(i, 1))) should do it, given that the order is always as shown.

v3 only depends on one variable. The rule appears to be 1 -> A, 2 -> B and so on.

dplyr's case_when() or ifelse() statements fit perfectly here. However, since you did not provide the exact conditions, it's hard to write an example.
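In the sample data the A/B/C value is not always the last element of its group (in group 1 it sits in the second row), so a variant of the ave() idea that does not rely on position may be safer. The following is only a sketch, assuming each v1 group contains exactly one of 'A', 'B', or 'C'; the helper vector keys is a name introduced here for illustration, not something from the question.

```r
# Rebuild the example data from the question
v1 <- rep(1:5, each = 3)
v2 <- c('X','A','Y','X','Y','B','X','Y','C','X','Y','C','X','Y','A')
dat <- data.frame(v1, v2, stringsAsFactors = FALSE)

# Hypothetical helper: the values that are allowed to end up in v3
keys <- c('A', 'B', 'C')

# Within each v1 group, take the first v2 value found in keys and
# spread it over every row of that group.
# Assumption: each group has exactly one such value; a group with
# none would get NA.
dat$v3 <- ave(dat$v2, dat$v1, FUN = function(x) x[x %in% keys][1])

dat
```

Because ave() returns a vector the same length as its input, the result can be assigned straight back to a column without a merge step.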