How can I run a for loop in parallel (so I can use all the processors on my Windows machine), with the result being a three-dimensional array? The code I have now takes about an hour to run and looks something like this:

guad = array(NA, c(1680, 170, 15))
for (r in 1:15)
{
  name = paste("P:/......", r, ".csv", sep="")
  pp = read.table(name, sep=",", header=TRUE)
  # lots of stuff to calculate x (which is a matrix)
  guad[,,r] = x
}

I have been looking at related questions and thought I could use foreach but I couldn't find a way to combine the matrices into an array.

I am new to parallel programming so any help will be very much appreciated!

  • Look at the .combine parameter of foreach. You don't need to pre-allocate guad with foreach and each iteration should only return a matrix. All matrices can then be combined to an array by foreach. Study the help pages and package vignette. Commented Jul 10, 2013 at 12:49
  • I did look at .combine, but it only allows c, cbind, rbind or a function, so how can I combine the matrices into an array? Commented Jul 10, 2013 at 13:25
  • By passing a function to .combine that binds matrices into an array? Commented Jul 10, 2013 at 13:35

1 Answer

You can do that with foreach by using the abind function in the combine step. Here's an example that uses the doParallel package as the parallel backend, which is fairly portable:

library(doParallel)
library(abind)
cl <- makePSOCKcluster(3)
registerDoParallel(cl)
acomb <- function(...) abind(..., along=3)
guad <- foreach(r=1:4, .combine='acomb', .multicombine=TRUE) %dopar% {
  x <- matrix(rnorm(16), 4)  # compute x somehow
  x  # return x as the task result
}

This uses a combine function called acomb, which calls the abind function from the abind package to combine the matrices generated by the cluster workers into a three-dimensional array.
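
To see what acomb produces without starting a cluster, here is a minimal sequential sketch. The matrices m1 to m3 and their values are made up for illustration. The second half assumes, as I read the abind documentation, that abind accepts a mix of 3-d arrays and matrices when along=3. That matters because foreach may hand the combine function a partially combined array together with new matrices when there are more results than fit in one call (see the .maxcombine argument).

```r
library(abind)

acomb <- function(...) abind(..., along = 3)

# Three small stand-in matrices (arbitrary values, for illustration only)
m1 <- matrix(1, nrow = 2, ncol = 3)
m2 <- matrix(2, nrow = 2, ncol = 3)
m3 <- matrix(3, nrow = 2, ncol = 3)

# Matrices are stacked as slices along a new third dimension
stacked <- acomb(m1, m2, m3)
print(dim(stacked))

# A partially combined array plus one more matrix gives the same shape
partial <- acomb(m1, m2)
extended <- acomb(partial, m3)
print(dim(extended))
```

Slice r of the result (stacked[, , r]) holds the matrix returned by task r, which is the same layout the original loop built with guad[,,r] = x.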

In this case, you can also combine the results using cbind and then set the dim attribute afterwards to convert the resulting matrix into a three-dimensional array:

guad <- foreach(r=1:4, .combine='cbind') %dopar% {
  x <- matrix(rnorm(16), 4)  # compute x somehow
  x  # return x as the task result
}
dim(guad) <- c(4,4,4)
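
The cbind approach works because R stores matrices and arrays in column-major order, so the columns of consecutive task results are already laid out the way the slices of a 3-d array are. A small base R sketch of that reasoning, with no cluster involved (the list mats and its 2 x 4 shape are stand-ins chosen for illustration):

```r
# Stand-in task results: element r is a 2 x 4 matrix filled with r
mats <- lapply(1:3, function(r) matrix(r, nrow = 2, ncol = 4))

# Equivalent of .combine='cbind': one wide 2 x 12 matrix
flat <- do.call(cbind, mats)

# Reinterpret the same column-major data as rows x cols x tasks
arr <- flat
dim(arr) <- c(2, 4, 3)

# Each slice matches the matrix it came from
stopifnot(identical(arr[, , 2], mats[[2]]))
```

In general the new dim should be c(rows, columns, number of tasks). The same trick would not work with rbind, because the rows of different results end up interleaved in memory.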

abind is the more flexible option, since it can combine matrices and arrays in a variety of ways. Also, be aware that resetting the dim attribute may cause the matrix to be duplicated, which could be a problem for large arrays.

Note that it's a good idea to shut down the cluster at the end of the script using stopCluster(cl).
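
One way to make sure the cluster is stopped even if a task fails is to wrap the loop in tryCatch with a finally clause. This is only a sketch of that idea, not something the original script requires; the worker count of 2 and the placeholder matrix are arbitrary choices for illustration.

```r
library(doParallel)  # also attaches foreach and parallel

cl <- makePSOCKcluster(2)  # 2 workers is an arbitrary choice for this sketch
registerDoParallel(cl)

guad <- tryCatch(
  foreach(r = 1:4, .combine = "cbind") %dopar% {
    matrix(r, nrow = 2, ncol = 2)  # stand-in for the real computation
  },
  finally = stopCluster(cl)  # runs whether the loop succeeds or fails
)

print(dim(guad))
```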

1 Comment

Thanks for the help! I used your first example and it seems to be working :)
