1

I am struggling with the syntax of map when I want to have a flexible number of columns and arguments...it's hard to code this dynamically.

A concrete example...

Suppose I have the following matrix, for arbitrary J:

set.seed(1) 
J=2 
n = 100
BB = data.table(r=1:n) 
for(i in seq(J)) set(x = BB, j = paste0('a',i), value = rnorm(n, 1, 7)) 
for(i in seq(J^2)) set(x = BB, j = paste0('b',i), value = rnorm(n, 1, 7)) 
for(i in seq(J)) set(x = BB, j = paste0('p',i), value = rnorm(n, 1, 7))

So the output is...

> head(BB)
   r        a1        a2         b1        b2        b3        b4          p1         p2
1: 1 -3.385177 -3.342567   3.865813  7.255716  8.521087  1.541122 -1.38746886 -3.9529776
2: 2  2.285503  1.294811  12.822113 -6.331087 14.269583 -1.078080 11.51697174 14.8010041
3: 3 -4.849400 -5.376452  12.106119 14.799362 -3.220981 -7.282696  4.69815399  0.3700092

I want to create a function that creates new columns from the existing columns in the following way..

(since J=2):

Lambda1 = exp(a1 + b1p1 + b2p2) 
Lambda2 = exp(a2 + b3p1 + b4p2) 

If J=1 it would be:

Lambda1 = exp(a1 + b1p1) 

If J=3…

Lambda1 = exp(a1 + b1p1 + b2p2 + b3p3) 
Lambda2 = exp(a2 + b4p1 + b5p2 + b6p3)
Lambda3 = exp(a3 + b7p1 + b8p2 + b9p3)

The output should be:

> head(BB)
   r        a1        a2         b1        b2        b3        b4          p1         p2       lambda1      lambda2
1: 1 -3.385177 -3.342567   3.865813  7.255716  8.521087  1.541122 -1.38746886 -3.9529776  5.547749e-17 5.862180e-10
2: 2  2.285503  1.294811  12.822113 -6.331087 14.269583 -1.078080 11.51697174 14.8010041  2.688353e+24 1.012574e+65
3: 3 -4.849400 -5.376452  12.106119 14.799362 -3.220981 -7.282696  4.69815399  0.3700092  9.401501e+24 8.370005e-11

My attempted solution:

I think it should look something like this, though the J^2 p parts are throwing it off. This solution ignores it.

BB[,
      (paste0("lambda",seq(J))) := Map(
        function(a,b,p) exp(a + b * p),
        mget(paste0("a", seq(J))),
        mget(paste0("b", seq(J))),
        mget(paste0("p", seq(J)))
      )
      ]

2 Answers 2

1

I am not familiar with the data.table terminology but here a solution

# Find the relevant columns
colA <- which(names(BB) %in% paste0("a",seq(J)))
colB <- which(names(BB) %in% paste0("b",seq(J^2)))
colP <- which(names(BB) %in% paste0("p",seq(J)))
# Extract the a's, b's & p's
a <- BB[ ,colA, with = FALSE]
b <- BB[, colB, with = FALSE]
p <- BB[, colP, with = FALSE]
# Multiply the b's and p's - expand the p's before multiplication
bp <- b * do.call("cbind.data.frame", replicate(J, p, simplify = FALSE))
# Loop through the columns to add
for (k in 1:J){ 
  tmpLambda <- exp(rowSums(bp[,((k-1)*J+1):(k*J)]) + a[, k, with = FALSE])
  BB$tmpLambda <- tmpLambda
  names(BB)[ncol(BB)] <- paste0("Lambda",k)
}
# Result
>   head(BB)
   r        a1        a2         b1        b2        b3        b4          p1         p2       Lambda1      Lambda2
1: 1 -3.385177 -3.342567   3.865813  7.255716  8.521087  1.541122 -1.38746886 -3.9529776  5.547749e-17 5.862180e-10
2: 2  2.285503  1.294811  12.822113 -6.331087 14.269583 -1.078080 11.51697174 14.8010041  2.688353e+24 1.012574e+65
3: 3 -4.849400 -5.376452  12.106119 14.799362 -3.220981 -7.282696  4.69815399  0.3700092  9.401501e+24 8.370005e-11

This can wrapped into a function and/or also optimized. Here is a test run with J=3:

> str(BB)
Classes ‘data.table’ and 'data.frame':  100 obs. of  19 variables:
 $ r      : int  1 2 3 4 5 6 7 8 9 10 ...
...
 $ Lambda1: num  6.21e-13 2.93e+23 7.46e-69 8.01e+18 1.45e+13 ...
 $ Lambda2: num  5.61e-36 1.05e+127 7.63e-32 4.36e-33 1.19e-33 ...
 $ Lambda3: num  5.84e+70 3.75e+52 1.60e-02 4.01e+33 2.51e+12 ...
 - attr(*, ".internal.selfref")=<externalptr> 

Hope that helps.

Sign up to request clarification or add additional context in comments.

Comments

1

Another possibility is to use matrix multiplication:

BB[, (paste0("lambda",seq(J))) := lapply(
        as.list(matrix(unlist(mget(paste0("a", seq(J)))), nrow=1L) +
            matrix(unlist(mget(paste0("p", seq(J)))), nrow=1L) %*%
                matrix(unlist(mget(paste0("b", seq(J^2)))), ncol=J))
        , exp),
    by=1:BB[,.N]]

The con is that speed might not be optimal as it is going through each row of the data.table

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.