Sometimes when I get an OOM error, the LLM's parameters have already been loaded onto the GPU and are not cleared automatically.
So I tried this:
torch.cuda.empty_cache()
but it didn't work, so every time I have to restart my GPU to clear the cache.
Use with torch.no_grad(): during inference so no autograd graph (and its intermediate activations) is stored, and use mixed precision (torch.cuda.amp / torch.autocast) to cut activation memory roughly in half.
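A minimal sketch of both tips; the small Linear model is a stand-in for a real LLM, and CPU autocast with bfloat16 is used here only so the example runs on any machine (on a GPU you would pass device_type="cuda"):

```python
import torch

model = torch.nn.Linear(128, 10)  # placeholder for a real model
model.eval()
x = torch.randn(4, 128)

# no_grad(): no autograd graph is built, so intermediate
# activations are freed as soon as the forward pass finishes
with torch.no_grad():
    logits = model(x)

# autocast: runs matmuls in half precision to reduce activation memory
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    logits_amp = model(x)

print(logits.requires_grad)  # False: nothing is kept for backward
print(logits_amp.dtype)      # torch.bfloat16
```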
torch.cuda.empty_cache() cannot free memory that is still referenced by live Python objects; it only releases cached blocks that are already unoccupied. To truly free GPU memory: del the variables that reference the tensors (including the model itself), call gc.collect() to break reference cycles, and only then call torch.cuda.empty_cache(). Note that after an OOM inside a try/except, the exception traceback can also keep tensors alive until it goes out of scope.
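A sketch of that cleanup order, assuming a hypothetical model variable standing in for the loaded LLM (the CUDA calls are guarded so the snippet also runs on CPU-only machines):

```python
import gc
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(1024, 1024).to(device)  # placeholder for an LLM

# 1) drop every Python reference first; empty_cache() alone
#    cannot release memory that live tensors still occupy
del model

# 2) collect reference cycles that may still hold tensors alive
gc.collect()

# 3) now return the cached, unoccupied blocks to the driver
if torch.cuda.is_available():
    torch.cuda.empty_cache()
    # should be 0 (or near it) if nothing else is resident
    print(torch.cuda.memory_allocated())
```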