I have some code that takes a huge amount of time to run. Could someone help me speed it up? Thanks in advance!

all_dist=c()
ddim=dim(b)[1]
ddimi=ddim-1

for (k in 1:dim(b)[2]){
    for (i in seq(1,ddimi,1)){
        for (j in seq(i+1,ddim,1)){
        ze=(b[i,k])-(b[j,k])*(b[i,k])-(b[j,k])
        all_dist=c(all_dist,ze)
}}}

Note that:

str(b)

 num [1:5, 1:30007] -0.000292 -0.001384 -0.001412 -0.002603 -0.002848 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:30007] "V1" "V2" "V3" "V4" ...

2 Answers

For-loops and growing your vector with c() are slowing you down. It's best to try to take advantage of vectorization, and to use *apply (or map) functions as much as possible. Here's something that does a little of both by iterating over the columns with sapply, creating combinations and computing the products and differences of those combinations:

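# as.data.frame() makes sapply iterate over columns even if b is a matrix, as in the question
mat <- sapply(as.data.frame(b), function(x) {y <- combn(x, 2); y[1,] - y[2,] * y[1,] - y[2,]})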

It should be fast – maybe not quite as fast as user10488504's very efficient solution, but still pretty fast. It also has very tight syntax, and you might also find it useful that the output is a matrix, with each column corresponding to a column from b.

Data:

set.seed(12345)
b <- as.data.frame(matrix(runif(5*30007, -.001, -.0003), byrow = T, nrow = 5))
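
To illustrate why growing a vector with c() hurts, here is a minimal sketch on a small made-up matrix. The names `b_small`, `grown` and `filled` are invented for this example and are not from the question. It keeps the three loops but allocates the result once and fills it by position, so nothing is copied on each append. The expression is the one from the question, unchanged.

```r
# Small stand-in for b (hypothetical data; the real b is 5 x 30007)
set.seed(1)
b_small <- matrix(runif(5 * 200), nrow = 5)

n_row   <- nrow(b_small)
n_pairs <- n_row * (n_row - 1) / 2   # number of (i, j) pairs with i < j per column

# Growing version: every c() call copies the whole result so far
grown <- c()
for (k in seq_len(ncol(b_small))) {
  for (i in seq_len(n_row - 1)) {
    for (j in (i + 1):n_row) {
      grown <- c(grown, b_small[i, k] - b_small[j, k] * b_small[i, k] - b_small[j, k])
    }
  }
}

# Preallocated version: allocate once, then write each value into its slot
filled <- numeric(n_pairs * ncol(b_small))
pos <- 0
for (k in seq_len(ncol(b_small))) {
  for (i in seq_len(n_row - 1)) {
    for (j in (i + 1):n_row) {
      pos <- pos + 1
      filled[pos] <- b_small[i, k] - b_small[j, k] * b_small[i, k] - b_small[j, k]
    }
  }
}

identical(grown, filled)   # both versions produce the same values in the same order
```

Preallocation alone removes the repeated copying; the vectorized approaches are still likely to be faster because they also avoid the interpreted loops.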

set.seed(0)
b <- matrix(rnorm(5*30007), nrow=5)

all_dist=c()
ddim=dim(b)[1]
ddimi=ddim-1

system.time(
#With for-loop
for (k in 1:dim(b)[2]){
    for (i in seq(1,ddimi,1)){
        for (j in seq(i+1,ddim,1)){
        ze=(b[i,k])-(b[j,k])*(b[i,k])-(b[j,k])
        all_dist=c(all_dist,ze)
        }}}
)
#       user      system     elapsed 
#    104.568       3.636     108.206 


#Vectorized with matrix indices
system.time({
K <- 1:dim(b)[2]     #for (k in 1:dim(b)[2]){... creates this vector
I <- seq(1,ddimi,1)  #for (i in seq(1,ddimi,1)){... creates this vector
J <- unlist(lapply(I+1, function(x) seq(x,ddim,1)))  #for (j in seq(i+1,ddim,1)){... creates this vector

IK <- as.matrix(expand.grid(I, K))  #Get all combinations between I and K as you will have with the nested for loops of k and i
IK <- IK[rep(seq_len(nrow(IK)), rep((ddim-1):1,length.out=nrow(IK))),]  #IK-rows need to be repeated, as it is used repeatedly in the "for (j in seq(i+1,ddim,1)){" loop
JK <- as.matrix(expand.grid(j=J, k=K)) #Get all combinations between J and K as you will have with the nested for loops of k and j

#Now you have all the indexes of your for loop as vectors and can make the calculations
tt <- b[IK] - b[JK] * b[IK] - b[JK]
})
#      user      system     elapsed 
#      0.056       0.000       0.097 


identical(all_dist, tt)
#[1] TRUE
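
In case the `b[IK]` step looks unfamiliar: indexing a matrix with a two-column matrix picks one element per index row, reading each row as a (row, column) pair. A tiny sketch with made-up names (`m` and `idx` are only for illustration):

```r
m <- matrix(1:6, nrow = 2)   # 2 x 3 matrix, filled column-wise
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# Each row of idx is one (row, column) pair
idx <- cbind(c(1, 2, 1), c(1, 2, 3))

picked <- m[idx]             # elements m[1,1], m[2,2], m[1,3]
picked
# [1] 1 4 5
```

IK and JK above are index matrices of exactly this shape, with one row for every (i, k) or (j, k) combination the loops would visit.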

Because k only selects the column and does not interact with the other two loops, you can partly vectorize by dropping the k loop and the column index, so that each step works on whole rows:

system.time({
tt=c()
for (i in seq(1,ddimi,1)){
  for (j in seq(i+1,ddim,1)){
    tt=c(tt, (b[i,])-(b[j,])*(b[i,])-(b[j,]))
  }}
dim(tt)  <- c(30007, 10)
tt <- as.vector(t(tt))
})
#       user      system     elapsed 
#      0.017       0.000       0.017 
identical(all_dist, tt)
#[1] TRUE

Or you can replace the two inner for loops with index vectors and use apply over the columns in place of the for loop over k:

system.time({
I <- seq(1,ddimi,1)
J <- unlist(lapply(I+1, function(x) seq(x,ddim,1)))
I <- I[rep(seq_along(I), rep((ddim-1):1,length.out=length(I)))]
tt  <- as.vector(apply(b, 2, function(x) {x[I] - x[J] * x[I] - x[J]}))
})
#       user      system     elapsed 
#      0.085       0.000       0.085 
identical(all_dist, tt)
#[1] TRUE
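
The two ideas above can probably be combined: build the index vectors once and apply them to whole rows, so neither a loop nor apply is needed. This is only a sketch on a small made-up matrix (`b_small`, `pairs`, `tt_small` and `ref` are names invented here), and it assumes combn() is used to generate the (i, j) pairs, which it returns in the same order as the nested loops.

```r
set.seed(0)
b_small <- matrix(rnorm(5 * 100), nrow = 5)   # hypothetical small stand-in for b

# All row pairs with i < j: row 1 of `pairs` holds i, row 2 holds j
pairs <- combn(nrow(b_small), 2)
I <- pairs[1, ]
J <- pairs[2, ]

# Same expression as in the question, applied to whole rows at once.
# The result is a (pairs x columns) matrix; as.vector() reads it column by
# column, which gives the order "for each k, all (i, j) pairs".
tt_small <- as.vector(b_small[I, ] - b_small[J, ] * b_small[I, ] - b_small[J, ])

# Reference: the original triple loop on the same small matrix
ref <- c()
for (k in seq_len(ncol(b_small))) {
  for (i in seq_len(nrow(b_small) - 1)) {
    for (j in (i + 1):nrow(b_small)) {
      ref <- c(ref, b_small[i, k] - b_small[j, k] * b_small[i, k] - b_small[j, k])
    }
  }
}

identical(ref, tt_small)
```

I have not timed this variant against the ones above, so treat it as an alternative way to write the computation, not as a benchmark result.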

Timing of the nice solution from gersht:

system.time({
mat <- as.vector(sapply(as.data.frame(b), function(x) {y <- combn(x, 2); y[1,] - y[2,] * y[1,] - y[2,]}))
})
#       user      system     elapsed 
#      1.083       0.000       1.082 
identical(all_dist, mat)
#[1] TRUE
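
One thing worth double-checking, since the result is called all_dist: in R, `*` binds tighter than `-`, so the expression from the question is evaluated as a - (b * a) - b and not as (a - b) * (a - b). Whether a squared difference was intended is only a guess on my part; all the timings above reproduce the expression exactly as posted. A small sketch with made-up values:

```r
x <- 3
y <- 1

as_written <- x - y * x - y      # parsed as x - (y * x) - y
squared    <- (x - y) * (x - y)  # squared difference, if that was the intent

as_written
# [1] -1
squared
# [1] 4
```

If the squared difference is what you need, wrap both differences in parentheses in whichever version you use.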
