How to use case_when with loop and regex?

Question

I want to create a new variable, new_p, that takes its value in each row from one of the p_ columns: the column whose name is paste0("p_", d1) for that row. My approach below creates the variable but does not fill it correctly. Only one level of lev gets a value; the other levels return NA. Thanks for any help.

My data:

tempDF <- structure(list(d1 = c("A", "B", "C"), d2 = c(40L, 50L, 20L), 
    d3 = c(20L, 40L, 50L), d4 = c(60L, 30L, 30L), p_A = c(1L, 
    3L, 2L), p_B = c(3L, 4L, 3L), p_C = c(2L, 1L, 1L), p4 = c(5L, 
    5L, 4L)), class = "data.frame", row.names = c(NA, -3L))

lev<-levels(as.factor(tempDF$d1))

View(tempDF)

My approach:

library(dplyr)  # provides %>%, mutate() and case_when()

for (i in seq_along(lev)) {

  func <- function(tempDF, i, lev) {
    newDT <- tempDF %>%
      mutate(.,
             new_p = case_when(
               d1 == paste0(lev[i]) ~ .[, paste0("p", "_", lev[i])]
             )) %>%
      as.data.frame(.)
  }

  newDT <- func(tempDF, i, lev) %>%
    as.data.frame(.)

}

View(newDT)

newDT
  d1 d2 d3 d4 p_A p_B p_C p4 new_p
   A 40 20 60   1   3   2  5 NA
   B 50 40 30   3   4   1  5 NA
   C 20 50 30   2   3   1  4 1

Expected output:

newDT
  d1 d2 d3 d4 p_A p_B p_C p4 new_p
   A 40 20 60   1   3   2  5 1
   B 50 40 30   3   4   1  5 4
   C 20 50 30   2   3   1  4 1

Edit: Barradas's function applied to a larger data set:

tempDF <- structure(list(d1 = c("A", "B", "C", "A", "C"), d2 = c(40L, 50L, 20L, 50L, 20L), 
    d3 = c(20L, 40L, 50L, 40L, 50L), d4 = c(60L, 30L, 30L,60L, 30L), p_A = c(1L, 
    3L, 2L, 3L, 2L), p_B = c(3L, 4L, 3L, 3L, 4L), p_C = c(2L, 1L, 1L,2L, 1L), p4 = c(5L, 
    5L, 4L, 5L, 4L)), class = "data.frame", row.names = c(NA, -5L))

View(tempDF)    

lev<-levels(as.factor(tempDF$d1))

func <- function(tempDF, lev){
  i <- match(tempDF$d1, lev)
  j <- match(paste0("p", "_", lev), names(tempDF))
  tempDF$new_p <- tempDF[cbind(i, j)]
  tempDF
}

newDT <- func(tempDF, lev)

Warning message:
In cbind(i, j) :
  number of rows of result is not a multiple of vector length (arg 2)

View(newDT)

newDT
  d1 d2 d3 d4 p_A p_B p_C p4 new_p
   A 40 20 60   1   3   2  5     1
   B 50 40 30   3   4   1  5     4
   C 20 50 30   2   3   1  4     1
   A 50 40 60   3   3   2  5     1  //wrong, new_p should be 3, not 1
   C 20 50 30   2   4   1  4     3  //wrong, new_p should be 1, not 3

Rui Barradas · Accepted Answer · 2019-10-19 17:22:37Z

2

You need neither loops nor pipes to do what the question asks for; match and plain data frame extraction can solve the problem.

func <- function(tempDF, lev){
  i <- match(tempDF$d1, lev)
  j <- match(paste0("p", "_", lev), names(tempDF))
  tempDF$new_p <- tempDF[cbind(i, j)]
  tempDF
}

newDT <- func(tempDF, lev)

newDT
#  d1 d2 d3 d4 p_A p_B p_C p4 new_p
#1  A 40 20 60   1   3   2  5     1
#2  B 50 40 30   3   4   1  5     4
#3  C 20 50 30   2   3   1  4     1
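
To see why this first version depends on the rows being in level order, here is a minimal sketch. It assumes a trimmed copy of the larger data from the question's edit (only d1 and the p_ columns are kept, so the column positions differ from the full data). In it, i holds level positions rather than row numbers, and j has one entry per level rather than one per row, so cbind() recycles j and the pairs stop meaning "this row, this row's column".

```r
# Trimmed copy of the larger data: only the columns this sketch needs
tempDF <- data.frame(
  d1  = c("A", "B", "C", "A", "C"),
  p_A = c(1L, 3L, 2L, 3L, 2L),
  p_B = c(3L, 4L, 3L, 3L, 4L),
  p_C = c(2L, 1L, 1L, 2L, 1L)
)
lev <- levels(as.factor(tempDF$d1))

i <- match(tempDF$d1, lev)                     # 1 2 3 1 3: level positions, used as row numbers
j <- match(paste0("p_", lev), names(tempDF))   # 2 3 4: one column per level, not per row

# cbind() recycles j to 2 3 4 2 3 (with the warning shown in the question)
idx <- suppressWarnings(cbind(i, j))

# Rows 4 and 5 are never read: the pairs point at rows 1 and 3 again
wrong <- as.integer(tempDF[idx])
wrong   # 1 4 1 1 3, the incorrect result reported in the question
```

With the three-row data the level positions happen to equal the row numbers, which is why the output above looks right.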

Edit.

The following function returns the correct output with both the original data and the larger one.

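func <- function(DF, levs){
  # Row numbers belonging to each level. lapply always returns a list,
  # whereas sapply can simplify to a matrix when every level has the
  # same number of rows, which would break lengths(i) below.
  i <- lapply(levs, function(l) which(DF$d1 == l))
  # Repeat each level's p_ column index once per matching row
  j <- rep(match(paste0("p", "_", levs), names(DF)), lengths(i))
  o <- cbind(unlist(i), j)
  # Put the (row, column) pairs back in row order
  o <- o[order(o[, 1]), , drop = FALSE]
  DF$new_p <- DF[o]
  DF
}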
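
A shorter variant of the same idea, offered as a sketch rather than as part of the original answer: pair every row number with the column selected by that row's own d1 value, so nothing needs regrouping or reordering. The names p_names, p_mat, rows and cols are made up for illustration, and the data is a trimmed copy of the larger data (only d1 and the p_ columns).

```r
# Trimmed copy of the larger data: only the columns this sketch needs
tempDF <- data.frame(
  d1  = c("A", "B", "C", "A", "C"),
  p_A = c(1L, 3L, 2L, 3L, 2L),
  p_B = c(3L, 4L, 3L, 3L, 4L),
  p_C = c(2L, 1L, 1L, 2L, 1L)
)

p_names <- paste0("p_", unique(tempDF$d1))          # "p_A" "p_B" "p_C"
p_mat   <- as.matrix(tempDF[p_names])               # integer matrix of the p_ columns only
rows    <- seq_len(nrow(tempDF))                    # 1 2 3 4 5: one entry per row
cols    <- match(paste0("p_", tempDF$d1), p_names)  # 1 2 3 1 3: column picked by each row's d1

# One (row, column) pair per row, so there is no recycling
tempDF$new_p <- p_mat[cbind(rows, cols)]
tempDF$new_p   # 1 4 1 3 1
```

Indexing a matrix built from the p_ columns alone keeps new_p as integer. Indexing the whole data frame with a matrix goes through as.matrix(), which turns the values into character when a character column such as d1 is present.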

edited Oct 19, 2019 at 17:22

answered Oct 19, 2019 at 15:09

Rui Barradas


4 Comments

Krantz Over a year ago

Thanks, @Rui_Barradas. This is great and very helpful!

Krantz Over a year ago

Hi, Barradas. Thanks again for this. But when I apply the function to a larger data set, it gives the wrong output. For an example, see my edited question.

Krantz Over a year ago

@Rui_Barradas, any thoughts?

Krantz Over a year ago

@Rui_Barradas, Great. Thanks again for your effort to help me. This indeed works. Well done.
