0

Is there any way to make the following algorithm faster, such as using an apply function?

    set.seed(1)
    n=1000

    y=rnorm(n)
    x1=rnorm(n)
    x2=rnorm(n)

    lm.ft=function(y,x1,x2)
      return(lm(y~x1+x2)$coef)

    res=list();
    for(i in 1:n){
      x1.bar=x1-x1[i]
      x2.bar=x2-x2[i]
      res[[i]]=lm.ft(y,x1.bar,x2.bar)
    }

1 Answer 1

4

Use lm.fit instead of lm:

fun1 <- function() {
  res=list();
  for(i in 1:n){
    x1.bar=x1-x1[i]
    x2.bar=x2-x2[i]
    res[[i]]=lm.ft(y,x1.bar,x2.bar)
  }
  res
}

lm.ft2 <- function(y,x1,x2) lm.fit(cbind(1,x1,x2), y)$coef

fun2 <- function() {
  res2 <- sapply(seq_along(y), function(i, x1, x2, y) {
    x1.bar=x1-x1[i]
    x2.bar=x2-x2[i]
    lm.ft2(y,x1.bar,x2.bar)
  }, x1=x1, x2=x2, y=y)
  res2
}

library(microbenchmark)
microbenchmark(res <- fun1(), res2 <- fun2(), times=10)

#Unit: milliseconds
#         expr        min         lq    median        uq       max neval
#res <- fun1() 147.776069 149.580443 152.64378 159.53053 166.06834    10
#res <- fun2()   8.760102   9.004934  10.34582  10.58757  13.86649    10


all.equal(
  unname(t(res2)),
  unname(do.call(rbind,res))
)
#[1] TRUE
Sign up to request clarification or add additional context in comments.

3 Comments

It seems most of the speed improvement is from using lm.fit. Is there any other way to reduce the speed from the FOR loop? In fact, I need to use double for loops.
I doubt that you need a double for loop. However, for loops are actually pretty fast in R (and lapply and friends are not faster). What you do inside the loop usually is the time consuming part. The best speed-up you can achieve by using vectorized functions.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.