Is there a way to force a maximum value for the amount of GPU memory that I want to be available for a particular Pytorch instance? For example, my GPU may have 12Gb available, but I'd like to assign 4Gb max to a particular process.
I think PyTorch will use as much memory as it needs, probably for the model and the loaded images. I do not think it greedily consumes all the memory available. – Manuel Lagunas, Mar 28, 2018 at 8:28
@ManuelLagunas Does it? Even if that is the case, I'd like to know if there is a way to set a maximum limit on the memory that a PyTorch instance sees as available. – Giorgos Sfikas, Mar 28, 2018 at 9:01
3 Answers
Update (04-MAR-2021): this is now available in the stable 1.8.0 release of PyTorch; see set_per_process_memory_fraction in the docs.
Original answer follows.
This feature request has been merged into the PyTorch master branch, but it has not yet appeared in a stable release.
It was introduced as set_per_process_memory_fraction:
Set memory fraction for a process. The fraction is used to limit the caching allocator's memory on a CUDA device. The allowed amount equals the total visible memory multiplied by the fraction. Attempting to allocate more than that in the process raises an out-of-memory error in the allocator.
You can check the tests as usage examples.
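As a minimal sketch (the fraction and device id here are arbitrary examples, not values from the feature request):

```python
import torch

# Hypothetical values: cap this process's caching allocator at one third
# of GPU 0's total memory (e.g. ~4 GiB on a 12 GiB card).
if torch.cuda.is_available():
    torch.cuda.set_per_process_memory_fraction(1 / 3, device=0)
```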
1 Comment
Update PyTorch to 1.8.0 (pip install --upgrade torch==1.8.0)
function: torch.cuda.set_per_process_memory_fraction(fraction, device=None)
params:
fraction (float) – Range: 0~1. Allowed memory equals total_memory * fraction.
device (torch.device or int, optional) – selected device. If it is None the default CUDA device is used.
eg:
import torch
torch.cuda.set_per_process_memory_fraction(0.5, 0)
torch.cuda.empty_cache()
total_memory = torch.cuda.get_device_properties(0).total_memory
# allocating slightly less than half of total memory is OK:
tmp_tensor = torch.empty(int(total_memory * 0.499), dtype=torch.int8, device='cuda')
del tmp_tensor
torch.cuda.empty_cache()
# this allocation will raise an OOM error:
torch.empty(total_memory // 2, dtype=torch.int8, device='cuda')
"""
It raises an error as follows:
RuntimeError: CUDA out of memory. Tried to allocate 5.59 GiB (GPU 0; 11.17 GiB total capacity; 0 bytes already allocated; 10.91 GiB free; 5.59 GiB allowed; 0 bytes reserved in total by PyTorch)
"""
Comments
In contrast to TensorFlow, which by default reserves all of the GPU's memory, PyTorch only uses as much as it needs. However, you could:
- Reduce the batch size
- Use CUDA_VISIBLE_DEVICES=<GPU id> (a comma-separated list for multiple GPUs) to limit which GPUs the process can access.
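From a shell, that restriction can look like this (the script name is a placeholder):

```shell
# Expose only GPU 0 to the process; "0,1" would expose two GPUs
CUDA_VISIBLE_DEVICES=0 python train.py
```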
To set this from within the program, try:
import os
# must be set before the first CUDA call, i.e. before CUDA is initialized
os.environ["CUDA_VISIBLE_DEVICES"] = "0"