
Create a dataset and the function I want to use

library(data.table)
DT <- data.table(V1=c(rep("A",5),rep("B",5)),
                 V2=rep(1:5,2),
                 V3=c(10,10,0,0,0,5,10,0,0,0),
                 V4=c(0,0,0,2,2,0,0,0,4,4))
testFunction<-function(x,transformation){
  l<-length(x)
  out <- rep(0, l)
  out[1] <- x[1]
  # seq_len(l)[-1] is empty when l == 1, whereas 2:l would count down to 1
  for (i in seq_len(l)[-1]) {
    #out[i] <- x[i] + (1 - transformation) * x[i - 1] #EDIT: Function was wrong
    out[i] <- x[i] + (1 - transformation) * out[i - 1]
  }
  return(out)
}

Now what I want to do is create a new dataset, newDT, using the information in the application data.frame below:

application<-data.frame(var=c("V3","V3","V4"),
                        transform=c(0.5,0.9,0.5))

The code I want to end up with is as follows: it creates new variables from the variable names and transformations in application, grouped by column V1.

newDT<-DT[,':='(V3_0.5=testFunction(V3,0.5),
         V3_0.9=testFunction(V3,0.9),
         V4_0.5=testFunction(V4,0.5)),
   by="V1"]

It is simple enough to code this up as text using a couple of paste functions, and then passing this to eval(parse(text=....)):

application$code<-paste(application$var,"_",application$transform,"=testFunction(",application$var,",",application$transform,")",sep="")
code<-paste("newDT<-DT[,':='(",paste(application$code,collapse=","),"),by='V1']")
eval(parse(text=code))

However, that runs into an issue once the string passes 4076 characters. I have (a) no idea why, and (b) eval(parse(text=...)) is advised against all over the Runiverse anyway.

The question: How do I go about this?

Happy to look at alternative solutions such as dplyr if speed isn't affected.

EDIT: The output table should look as follows:

     V1 V2 V3 V4  V3_0.5  V3_0.9 V4_0.5
 1:  A  1 10  0 10.0000 10.0000      0
 2:  A  2 10  0 15.0000 11.0000      0
 3:  A  3  0  0  7.5000  1.1000      0
 4:  A  4  0  2  3.7500  0.1100      2
 5:  A  5  0  2  1.8750  0.0110      3
 6:  B  1  5  0  5.0000  5.0000      0
 7:  B  2 10  0 12.5000 10.5000      0
 8:  B  3  0  0  6.2500  1.0500      0
 9:  B  4  0  4  3.1250  0.1050      4
10:  B  5  0  4  1.5625  0.0105      6
  • Use testFunction<-function(x,transformation){x+(1-transformation)*shift(x, fill=0)} Commented Oct 25, 2016 at 15:31
  • Sorry, downvoting because it is a bad idea to do this (iterating unnecessarily and writing code in a string to evaluate). Commented Oct 25, 2016 at 15:33
  • @ExperimenteR I doubt that will work. There probably needs to be a cumulative sum or cumulative product somewhere to get around the iteration. Commented Oct 25, 2016 at 15:34
  • @Frank, IMHO there is no reason to downvote this Q as it shows substantial effort of the OP to find a solution for his problem. And, the Q is about how to do it better. Commented Oct 25, 2016 at 15:58
  • @UweBlock I'm also VTC-ing as too broad. If it's cut down to a single problem (and that testFunction is a pretty big problem on its own), that would help. I'm using my DV as a signpost that "this is a bad idea" to those who stumble on it later. It's a valid albeit subjective reason. If you hover over the downvote arrow, I'm referring to "not useful". Commented Oct 25, 2016 at 16:07
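
On the iteration point raised in the comments: the loop in testFunction is a first-order recursion, out[i] = x[i] + (1 - transformation) * out[i - 1], so it can probably be written without an explicit for loop. A minimal sketch in base R; the names testFunctionReduce and testFunctionFilter are hypothetical and do not come from the original post:

```r
# Hypothetical loop-free variants of testFunction (names are illustrative only).

# Variant 1: Reduce() carries the previous output forward; with
# accumulate = TRUE it returns every intermediate value, not just the last.
testFunctionReduce <- function(x, transformation) {
  Reduce(function(prev, cur) cur + (1 - transformation) * prev,
         x, accumulate = TRUE)
}

# Variant 2: stats::filter() with method = "recursive" computes
# y[i] = x[i] + f * y[i - 1]; as.numeric() drops the time-series class.
testFunctionFilter <- function(x, transformation) {
  as.numeric(stats::filter(x, filter = 1 - transformation,
                           method = "recursive"))
}

# Group A of column V3 from the question
x <- c(10, 10, 0, 0, 0)
res_reduce <- testFunctionReduce(x, 0.5)
res_filter <- testFunctionFilter(x, 0.5)
print(res_reduce)
print(res_filter)
```

Both variants should reproduce the V3_0.5 values for group A in the expected output table. Whether either is faster than the loop on real data is something to benchmark rather than assume.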

2 Answers


Down to the core of your issue, you can pass a vector of parameters into lapply, and then create new columns by reference like this:

library(data.table)

DT <- data.table(col = 1:5)
expon <-  function(y,x){x ^ y}
params <- c(1,5,3)

DT[, (paste0("col_", params)) := lapply(params, expon, col)]

This gives you:

   col col_1 col_5 col_3
1:   1     1     1     1
2:   2     2    32     8
3:   3     3   243    27
4:   4     4  1024    64
5:   5     5  3125   125
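
To connect this pattern back to the question, here is a minimal sketch that applies it to a single column of the question's data, with several transformations and grouping by V1. It assumes testFunction behaves as defined in the question (restated here so the snippet runs on its own); the params vector is an illustrative choice:

```r
library(data.table)

# Data from the question
DT <- data.table(V1 = c(rep("A", 5), rep("B", 5)),
                 V2 = rep(1:5, 2),
                 V3 = c(10, 10, 0, 0, 0, 5, 10, 0, 0, 0),
                 V4 = c(0, 0, 0, 2, 2, 0, 0, 0, 4, 4))

# Same recursion as the question's testFunction, restated for self-containment
testFunction <- function(x, transformation) {
  out <- x
  for (i in seq_along(x)[-1]) {
    out[i] <- x[i] + (1 - transformation) * out[i - 1]
  }
  out
}

# One new column per transformation, computed within each V1 group
params <- c(0.5, 0.9)
DT[, (paste0("V3_", params)) := lapply(params, function(p) testFunction(V3, p)),
   by = V1]
print(DT)
```

This covers only one source column (V3); the column name is hard-coded inside the anonymous function, which is the limitation the comment below points out.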

1 Comment

Makes sense. The only thing this is missing is the ability to apply the function to two different columns

Thanks to Chris for providing me with a step in the right direction, with a solution that will work with a single column.

To expand to multiple columns:

#Turn each row of application into a list element (a named character vector)
applic_list<-unlist(apply(application, 1, list), recursive = FALSE)
#lapply through this list, using .SD to call the column in question
#testFunction expects the data first and the transformation second
DT[,(paste(application$var,application$transform,sep="_")) :=
    lapply(applic_list,function(y)      {
      testFunction(.SD[[y[["var"]]]],as.numeric(y[["transform"]]))
    }),by="V1"]

returns

    V1 V2 V3 V4  V3_0.5  V3_0.9 V4_0.5
 1:  A  1 10  0 10.0000 10.0000      0
 2:  A  2 10  0 15.0000 11.0000      0
 3:  A  3  0  0  7.5000  1.1000      0
 4:  A  4  0  2  3.7500  0.1100      2
 5:  A  5  0  2  1.8750  0.0110      3
 6:  B  1  5  0  5.0000  5.0000      0
 7:  B  2 10  0 12.5000 10.5000      0
 8:  B  3  0  0  6.2500  1.0500      0
 9:  B  4  0  4  3.1250  0.1050      4
10:  B  5  0  4  1.5625  0.0105      6
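
A possible variation on the same idea, as a sketch rather than a definitive version: Map() can walk application$var and application$transform in parallel, which avoids converting the rows to character vectors and back. The names new_cols, v and tr are hypothetical, and testFunction is restated so the snippet runs on its own:

```r
library(data.table)

# Data, function and application table from the question
DT <- data.table(V1 = c(rep("A", 5), rep("B", 5)),
                 V2 = rep(1:5, 2),
                 V3 = c(10, 10, 0, 0, 0, 5, 10, 0, 0, 0),
                 V4 = c(0, 0, 0, 2, 2, 0, 0, 0, 4, 4))

testFunction <- function(x, transformation) {
  out <- x
  for (i in seq_along(x)[-1]) {
    out[i] <- x[i] + (1 - transformation) * out[i - 1]
  }
  out
}

application <- data.frame(var = c("V3", "V3", "V4"),
                          transform = c(0.5, 0.9, 0.5),
                          stringsAsFactors = FALSE)

# Target column names, e.g. "V3_0.5"
new_cols <- paste(application$var, application$transform, sep = "_")

# Map() pairs each column name with its transformation; .SD[[v]] looks the
# column up within the current V1 group. unname() drops the names Map() adds.
DT[, (new_cols) := unname(Map(function(v, tr) testFunction(.SD[[v]], tr),
                              application$var, application$transform)),
   by = V1, .SDcols = unique(application$var)]
print(DT)
```

Because transform stays numeric throughout, no as.numeric() round trip is needed. The result should match the table above.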
