I am trying to remove a memory bottleneck in my program. Here is the interesting part:

print_mem_info()
print("creating array")
arr = np.empty(vol_to_write.get_shape(), dtype=np.float16)
for v_tmp, a_tmp in zip(v_list, a_list):
    s = to_basis(v_tmp, vol_to_write).get_slices()
    arr[s[0][0]:s[0][1],s[1][0]:s[1][1],s[2][0]:s[2][1]] = copy.deepcopy(a_tmp)
print_mem_info()
print("deleting array")
del arr
print_mem_info()

Here is the output:

Used RAM:  4217.71875 MB
creating array
Used RAM:  4229.68359375 MB
deleting array
Used RAM:  4229.2890625 MB

For print_mem_info I am just using the psutil library:

def print_mem_info():
    mem = psutil.virtual_memory()
    swap = psutil.swap_memory()
    used_ram = (mem.total - mem.available) /1024 /1024
    used_swap = swap.used /1024 /1024 
    print("Used RAM: ", used_ram, "MB")
    # print("Used swap: ", used_swap, "MB")

I am just creating a NumPy array, filling it, and then deleting it (in the real program I delete it later, but for debugging purposes I put the del here). What I cannot understand is why del does not remove the array from RAM, since there are no other references to it. I tried gc.collect() and it did nothing.

I have read a lot of other posts on Stack Overflow but could not figure it out. I know that gc.collect() is not supposed to be used, and I read somewhere that using del is not recommended, but I am manipulating very big NumPy arrays, so I cannot just leave them in RAM.


[edit]:

I tried creating a minimal example here:

import numpy as np
import psutil, os

def print_mem_info():
    process = psutil.Process(os.getpid())
    print(process.memory_info().vms // 1024 // 1024)

if __name__ == "__main__":
    print("program starts")
    print_mem_info()

    print("creating samples...")
    a_list = list()
    for i in range(4):
        a_list.append(np.random.rand(100,100,100))
    print_mem_info()

    print("creating array...")
    arr = np.empty((400,100,100))
    print_mem_info()

    print("filling the array...")
    for i, a_tmp in enumerate(a_list):
        arr[i*100:(i+1)*100,:,:] = a_tmp
        del a_tmp
    print_mem_info()

    print("deleting the array...")
    del arr
    print_mem_info()
  • Unfortunately it is more complicated than that: it is an experiment I am doing for a research thesis, and I am running several scripts one after the other. You would need to configure file paths, clone some of my projects, etc. Sorry about that, I wish I could. Commented Jul 16, 2020 at 20:04
  • My bad it is "psutil" I put the code in the question (psutil.readthedocs.io/en/latest) Commented Jul 16, 2020 at 20:08
  • First, del doesn't delete objects. del arr unbinds the arr variable. Second, freeing an object doesn't necessarily return memory to the OS. Commented Jul 16, 2020 at 20:10
  • There's the problem. In an empty program, with no code except print_mem_info(), it measures 11 GB used on my machine. You are getting system-wide information, but you should look at per-process information. Commented Jul 16, 2020 at 20:11
  • @ThomasWeller are you sure that you are not using any RAM at all? That's weird... For me it seems to work: if I don't run anything else, psutil tells me the same as htop about my memory consumption. Commented Jul 16, 2020 at 20:15
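
To illustrate the comment about del: a minimal sketch, assuming CPython's reference counting. `Payload` is a hypothetical stand-in for a large array and is not part of the original program. `del` only removes a name; the object is freed once no strong reference to it is left.

```python
import weakref

class Payload:
    """Hypothetical stand-in for a large object such as a NumPy array."""
    def __init__(self, size):
        self.data = bytearray(size)

a = Payload(1_000_000)
b = a                    # a second name bound to the same object
alive = weakref.ref(a)   # weak reference: does not keep the object alive

del a                    # unbinds the name 'a' only; 'b' still refers to the object
alive_after_first_del = alive() is not None

del b                    # last strong reference gone; CPython frees the object here
alive_after_second_del = alive() is not None

print(alive_after_first_del, alive_after_second_del)
```

In a larger program the "second name" is often less obvious: a list entry, a view of the array, or a variable in a still-running frame can all keep the object alive.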

1 Answer

You are measuring memory at the system level, not at the process level. You don't know what all the other processes on your machine are doing.

Be careful with example code for measuring the memory of a process: many examples mix up virtual memory and physical memory.

RSS (Linux term) and Working Set (Windows term) are not well suited to discussing your problem, because they count only the part of the process's memory that is currently in physical RAM. That depends heavily on how much physical RAM the machine has, so the numbers vary between machines and are not comparable.

VMS (Linux term) or Private Bytes (Windows term) are much more reliable, since they also count memory that is in use but has been swapped out to disk because physical RAM is short.
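
As a rough sketch of the distinction, psutil reports both figures on the same `memory_info()` result. The helper name `mem_mib` is made up for this illustration, and the exact meaning of `vms` differs between platforms.

```python
import os
import psutil

def mem_mib():
    """Return (rss, vms) of the current process in MiB."""
    info = psutil.Process(os.getpid()).memory_info()
    return info.rss / 2**20, info.vms / 2**20

rss, vms = mem_mib()
print(f"RSS: {rss:.1f} MiB, VMS: {vms:.1f} MiB")
```

Printing both side by side before and after an allocation makes it easier to see which of the two numbers actually reacts.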

The following code should help you get things started:

import numpy as np
import psutil
import os

def print_mem_info():
    process = psutil.Process(os.getpid())
    print(process.memory_info().vms // 1024 // 1024)

print_mem_info()
arr = np.empty((100000,100000))
print_mem_info()
del arr
print_mem_info()

On my machine, it prints

261
76705
262

The roughly 76 GB is plausible for an array of 100,000 × 100,000 items at 8 bytes each.
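
The estimate can be checked with a few lines of arithmetic. This sketch assumes the array uses float64, NumPy's default dtype, and converts to MiB to match the `// 1024 // 1024` in `print_mem_info`.

```python
n_items = 100_000 * 100_000       # elements in the (100000, 100000) array
bytes_per_item = 8                # float64, NumPy's default dtype
total_bytes = n_items * bytes_per_item
total_mib = total_bytes / 2**20   # same unit as the printed values

print(total_bytes, total_mib)
```

The result is a little under the measured difference of 76705 - 261 MiB, which leaves a small remainder for other allocations in the process.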

With RSS, the effect is not visible:

47
47
47
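
One likely reason the RSS numbers stay flat is that `np.empty` reserves memory without writing to it, so the operating system has not yet backed those pages with physical RAM. The answer does not state this; it is an assumption. The sketch below uses an arbitrary 64 MiB array and a hypothetical helper `rss_mib`. The exact numbers depend on the platform and allocator, so no expected output is given.

```python
import os
import numpy as np
import psutil

def rss_mib():
    """Resident set size of the current process in MiB."""
    return psutil.Process(os.getpid()).memory_info().rss / 2**20

rss_start = rss_mib()
arr = np.empty(64 * 2**20 // 8)   # 64 MiB of float64, allocated but never written
rss_empty = rss_mib()
arr.fill(1.0)                     # writing touches every page of the buffer
rss_filled = rss_mib()
del arr                           # drop the only reference to the array
rss_deleted = rss_mib()

print(f"start {rss_start:.0f}, empty {rss_empty:.0f}, "
      f"filled {rss_filled:.0f}, deleted {rss_deleted:.0f} MiB")
```

If RSS grows only after `fill`, that is consistent with the pages being committed lazily. Whether it drops again after `del` depends on whether the allocator hands the block back to the OS.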

1 Comment

Thank you for this information, it is very interesting, I will use this to monitor my program now. Unfortunately I still cannot see that my array is removed though (I think there is a problem with copying data from a_tmp)
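
On the suspicion about copying from a_tmp: NumPy slice assignment already copies the values into the destination's own buffer, as the small sketch below shows with made-up array shapes. The extra copy.deepcopy in the question would therefore only create a short-lived temporary. Whether that explains the memory behaviour of the original program is not established here.

```python
import numpy as np

src = np.ones((4, 4, 4))
dst = np.zeros((8, 4, 4))

dst[0:4] = src                        # copies the values into dst's own buffer
shares = np.shares_memory(dst, src)   # checks whether the two arrays overlap in memory

src[:] = 5.0                          # later changes to src do not reach dst
unchanged = bool((dst[0:4] == 1.0).all())

print(shares, unchanged)
```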
