I'm running a simulation using the future and future.apply packages in R where I need to execute multiple iterations of a function in parallel and bind the results together. When I use more than one iteration (n_iter), I encounter a warning message indicating that row names found from a short variable are being discarded.

In data.frame(..., check.names = FALSE) : row names were found from a short variable and have been discarded

Here's my minimal example that reproduces the warning:


# install.packages("future")
library(future)
# install.packages("future.apply")
library(future.apply)
# future provides plan(); future.apply provides future_lapply()

# parallel::detectCores()
# check how many cores you have

options(parallelly.fork.enable = TRUE)
# You have to set this every time you start a new R session


# data generating function
data_generating_function <- function(n, mean, sd) {
  x <- rnorm(n, mean, sd)
  y <- rnorm(n, mean, sd) + 1 * x
  
  return(data.frame(x, y))
}

# data_generating_function(10, 0, 1)


# one simulation
one_simulation <- function(n, mean, sd) {
  data <- data_generating_function(n, mean, sd)
  
  model <- lm(y ~ x, data = data)
  
  p_value <- summary(model)$coefficients[2, 4]
  
  return(p_value)
}

# one_simulation(10, 0, 1)

# grid with simulation parameters
sim_grid <- expand.grid(
  n = c(10, 100, 1000),
  mean = c(0, 1, 2),
  sd = c(1, 2, 3)
)

# number of iterations
# n_iter <- 1 # no problems when running only one iteration
n_iter <- 10

plan(multicore, workers = 3)
# choose the number of workers here

#this part produces the warnings:
res_simulation <- do.call("rbind", lapply(seq_len(nrow(sim_grid)), function(rowindex)
{
  print(rowindex)
  cbind(sim_grid[rowindex, ], do.call(
    "rbind",
    future_lapply(seq_len(n_iter), function(iter)
      
    {
      
      one_simulation(
        n = sim_grid$n[rowindex],
        mean = sim_grid$mean[rowindex],
        sd = sim_grid$sd[rowindex]
      )
      
    }, future.seed = 12457854 + rowindex # set a new seed for each row, otherwise every row gives the same results
    )
  ))
}))

When I run just one iteration (setting n_iter to 1), the warning message does not appear, and the results are as expected. However, when I increase n_iter to 10 for multiple iterations, then the warning arises.

I suspect this has something to do with the rbind function. I believe I have to drop the row names at some point, but I can't figure out where. Any ideas?

  • FWIW, {future.apply} is designed to mimic base R apply functions. In this case, you get the same warning if you use lapply(), so when in doubt you can always troubleshoot with just that. Commented Jul 8 at 21:27

1 Answer

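The warning comes from cbind(), specifically from the call cbind(sim_grid[rowindex, ], do.call("rbind", ...)). sim_grid[rowindex, ] is a one-row data frame that carries its own row name, while the rbind-ed result has n_iter rows. When cbind() recycles the single row to match, that row name cannot be reused, so it is discarded with a warning. With n_iter = 1 both pieces have one row, nothing is recycled, and no warning appears. When several things happen in one expression, it's best to break it up into small parts when troubleshooting. See below, where I run just rowindex 1: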

## sanity check
  rowindex <- 1
  print(rowindex)
 
  cbind( # warning when trying to combine the two lines below occurs here
    sim_grid[rowindex, ],  # running this line only, no warning
    do.call("rbind", future_lapply(seq_len(n_iter), function(iter) # running this line only, no warning
    {
      one_simulation(
        n = sim_grid$n[rowindex],
        mean = sim_grid$mean[rowindex],
        sd = sim_grid$sd[rowindex]
      )
    }, future.seed = 12457854 + rowindex # set a new seed for each row, otherwise every row gives the same results
    )
    )
  )
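
The same behaviour can be seen without any parallel code. The following is a minimal sketch in base R only; the smaller grid, the random stand-in matrices, and the helper warning_from() are illustrative assumptions, not part of the original code. It combines one row of a grid with a ten-row matrix and then with a one-row matrix, and records whether cbind() warns in each case.

```r
# Smaller stand-in for the simulation grid (illustrative values)
sim_grid <- expand.grid(n = c(10, 100), mean = c(0, 1))
one_row <- sim_grid[2, ]   # a single row that keeps its row name "2"

# Hypothetical helper: returns the warning text, or NA if no warning was raised
warning_from <- function(expr) {
  tryCatch({ expr; NA_character_ }, warning = function(w) conditionMessage(w))
}

# Ten result rows: the single grid row has to be recycled
msg_many <- warning_from(cbind(one_row, matrix(runif(10), ncol = 1)))

# One result row: nothing is recycled
msg_one <- warning_from(cbind(one_row, matrix(runif(1), ncol = 1)))

print(msg_many)
print(msg_one)
```

Only the ten-row case should report a warning, which matches the n_iter = 1 versus n_iter = 10 observation in the question.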

To get rid of this warning (it is a warning, not an error), add row.names = NULL to the cbind() call, like so:

 ## sanity check
  rowindex <- 1
  print(rowindex)
  
  cbind(
    sim_grid[rowindex, ],
    do.call("rbind", future_lapply(seq_len(n_iter), function(iter)
    {
      one_simulation(
        n = sim_grid$n[rowindex],
        mean = sim_grid$mean[rowindex],
        sd = sim_grid$sd[rowindex]
      )
    }, future.seed = 12457854 + rowindex # set a new seed for each row, otherwise every row gives the same results
    )
    ),
    row.names = NULL
  )
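
Applied to the whole grid, the same idea might look like the sketch below. It is an illustration under some assumptions rather than a drop-in replacement: base lapply() stands in for future_lapply() (the two behave alike here, per the comment on the question), the grid is smaller, the simulation is condensed into one function, and data.frame() with a named p_value column is used instead of cbind() so the result column gets a readable name.

```r
set.seed(1)  # stand-in for future.seed, since this sketch runs sequentially

sim_grid <- expand.grid(n = c(10, 100), mean = c(0, 1), sd = c(1, 2))
n_iter <- 5

# Same simulation as in the question, condensed into one function
one_simulation <- function(n, mean, sd) {
  x <- rnorm(n, mean, sd)
  y <- rnorm(n, mean, sd) + x
  summary(lm(y ~ x))$coefficients[2, 4]
}

res_simulation <- do.call("rbind", lapply(seq_len(nrow(sim_grid)), function(rowindex) {
  # lapply() is used here in place of future_lapply(..., future.seed = ...)
  p_values <- unlist(lapply(seq_len(n_iter), function(iter) {
    one_simulation(
      n = sim_grid$n[rowindex],
      mean = sim_grid$mean[rowindex],
      sd = sim_grid$sd[rowindex]
    )
  }))
  # row.names = NULL drops the single row name before the grid row is recycled
  data.frame(sim_grid[rowindex, ], p_value = p_values, row.names = NULL)
}))

str(res_simulation)
```

Each grid row is repeated n_iter times next to its p-values, so the result has nrow(sim_grid) * n_iter rows.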

See more about the warning in this related question:

cbind warnings : row names were found from a short variable and have been discarded
