I have run into the following issue using R on both Linux and Windows. In its simplest form, I have a 3- or 4-dimensional array, which I gradually fill using smaller arrays.

A <- array(NA, dim = c(500, 1000, 1000))
B <- array(rnorm(1e4), dim = c(1000, 1000))
for (i in 1:500) A[i, , ] <- B

The interesting thing is that even though A has already been allocated, memory usage shoots up as soon as the loop starts, to the point where the workstation becomes unusable. For context, executing the third line can rapidly fill 24 GB of RAM when A is 2000x2000x400.

Does anyone know why this happens, and whether there are ways to circumvent the issue?

1 Answer

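Assuming A and B are the only objects defined in your workspace, I would expect memory usage to approximately double. This is because you initialize A as a logical array (NA is logical by default), and the first subset-assignment converts it to numeric. A logical element takes 4 bytes and a numeric (double) element takes 8, so the conversion allocates a new array twice the size of the original.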

> A <- array(NA, dim=c(500, 1000,1000))
> str(A)
 logi [1:500, 1:1000, 1:1000] NA NA NA NA NA NA ...
> A[1,,] <- B
> str(A)
 num [1:500, 1:1000, 1:1000] -1.21 NA NA NA NA NA NA NA NA NA ...

Try this instead:

A <- array(NA_real_, dim = c(500, 1000, 1000))
B <- array(rnorm(1e4), dim = c(1000, 1000))
gc()
#             used   (Mb) gc trigger   (Mb)  max used   (Mb)
# Ncells    185801   10.0     407500   21.8    350000   18.7
# Vcells 501281866 3824.5  551897808 4210.7 501612188 3827.0
for (i in 1:500) A[i, , ] <- B
gc()
#             used   (Mb) gc trigger   (Mb)  max used   (Mb)
# Ncells    185809   10.0     407500   21.8    350000   18.7
# Vcells 501281867 3824.5  579572698 4421.8 502108245 3830.8

You can see that the maximum memory used barely increased (from 3827.0 Mb to 3830.8 Mb).
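The type conversion described above can be checked on a much smaller array. The following is only a sketch: the dimensions and the names A_logical, A_double, and size_ratio are made up for illustration, and the exact byte counts depend on your R build.

```r
# Small illustrative arrays (hypothetical sizes, chosen so this runs quickly).
A_logical <- array(NA, dim = c(5, 100, 100))        # NA is logical by default
A_double  <- array(NA_real_, dim = c(5, 100, 100))  # explicitly numeric NA

typeof(A_logical)  # "logical"
typeof(A_double)   # "double"

# The double array needs roughly twice the storage of the logical one.
size_ratio <- as.numeric(object.size(A_double)) /
  as.numeric(object.size(A_logical))
size_ratio

# The first numeric assignment silently converts the logical array.
A_logical[1, , ] <- matrix(rnorm(100 * 100), nrow = 100, ncol = 100)
typeof(A_logical)  # now "double"
```

Starting from NA_real_ means that conversion never has to happen, which is why the second gc() output above looks almost the same as the first.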

1 Comment

Thanks. Your example works on my computer. However, I found out that promoting logical to double is not the cause of the memory explosion. The problem is that R allocates extra memory at each assignment A[i,,] <- B. After the final garbage collection all is good, but the repeated intermediate allocations can cripple execution if the array is too big (mine is actually 2000x2000x500). If gc() is called inside the loop, the problem is solved.
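For anyone who wants to try the workaround from this comment, here is a minimal sketch. The small dimensions and the choice to collect every 10 iterations are assumptions made for illustration, not something taken from the thread. Whether it helps at all is likely to depend on your R version and array size.

```r
# Hypothetical small sizes so the sketch runs quickly.
n <- 50
A <- array(NA_real_, dim = c(n, 100, 100))
B <- array(rnorm(100 * 100), dim = c(100, 100))

for (i in seq_len(n)) {
  A[i, , ] <- B
  # Force a garbage collection now and then so temporary copies are freed
  # before the next assignment. The interval of 10 is an arbitrary choice.
  if (i %% 10 == 0) invisible(gc())
}
```

gc() is not free, so calling it on every iteration can slow the loop noticeably. Calling it every few iterations is one way to trade speed against peak memory.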
