I'm trying to compute model estimators using the BLB (Bag of Little Bootstraps) and would like to do so in parallel. My code works fine when I don't run it in parallel. The problem is that, when I compute in parallel, the results I get back from each core contain NA values. I don't understand how I can get NA values when the iris data set itself contains no NAs at all. Here is the code I'm using:
library(doParallel)
library(itertools)
num_of_cores <- detectCores()
cl <- makePSOCKcluster(num_of_cores)
registerDoParallel(cl)
attach(iris)
data <- iris
coeftmp <- data.frame()
system.time(
r <- foreach(dat = isplitRows(data, chunks=num_of_cores),
.combine = cbind) %dopar% {
BLBsize = round(nrow(dat)^0.6)
for (i in 1:400){
set.seed(i)
# sampling B(n) data points from the original data set without replacement
sample_BOFN <- dat[sample(nrow(dat), size = BLBsize, replace = FALSE), ]
# sampling from the subsample with replacement
sample_bootstrap <- sample_BOFN[sample(nrow(sample_BOFN), size = nrow(sample_BOFN), replace = TRUE), ]
bootstrapModel <- glm(sample_bootstrap$Petal.Width ~ Petal.Length + Sepal.Length + Sepal.Width, data = sample_bootstrap)
coeftmp <- rbind(coeftmp, bootstrapModel$coefficients)
}
#calculating the estimators of the model with mean
colMeans(coeftmp)
})
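One possible source of the NAs, offered as a guess rather than a confirmed diagnosis: once the data is split across workers, each chunk is small, so BLBsize is small too (with 4 workers, for example, a chunk of about 38 rows gives round(38^0.6) = 9). Resampling so few rows with replacement can leave fewer distinct rows than there are coefficients, in which case glm reports the aliased coefficients as NA and colMeans propagates them. The sketch below reproduces that effect in base R without any parallel code; the names chunk, resample and coef_matrix are made up for illustration and the row indices are chosen by hand to force the situation.

```r
# Hypothetical illustration, not the original parallel code.
# Take a small subsample, as one worker might see it.
chunk <- iris[1:9, ]

# Hand-picked "bootstrap" resample with only 3 distinct rows,
# fewer than the 4 coefficients (intercept + 3 predictors).
resample <- chunk[c(1, 1, 1, 2, 2, 2, 3, 3, 3), ]

fit <- glm(Petal.Width ~ Petal.Length + Sepal.Length + Sepal.Width,
           data = resample)
coefs <- coef(fit)

# The design matrix is rank deficient, so at least one coefficient is NA.
print(coefs)
print(fit$rank < length(coefs))

# Averaging over replicates keeps the NA unless it is handled explicitly.
coef_matrix <- rbind(coefs, coefs)
print(colMeans(coef_matrix))
```

If this is what is happening, checking fit$rank (or anyNA(coef(fit))) inside the loop on each worker should show which replicates are affected.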
sample_BOFN if you're bootstrapping. But it also doesn't appear that you're using sample_BOFN, so you may wish to remove this from the (example) code.
Do you get NAs if you only use 1 core?