
Sometimes I get an OOM error, but the LLM's parameters have already been loaded onto the GPU, and that memory is not cleared automatically.

So I tried this:

torch.cuda.empty_cache()

but it didn't work, so every time I have to restart the GPU to clear the cache.

Comments:
  • torch.cuda.empty_cache() only releases unreferenced memory from PyTorch's caching allocator back to CUDA. If your model or tensors are still in scope, the memory won't be freed. Jupyter/IPython can also keep hidden references, so memory lingers even after del. – Sep 6 at 9:51
  • This question seems to address what you are trying to do: stackoverflow.com/questions/57858433/… – Sep 11 at 15:44

1 Answer


Use with torch.no_grad(): during inference to avoid storing gradients, and use mixed precision (torch.cuda.amp) to cut memory usage.
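A minimal sketch of that advice; the run_inference wrapper and the toy model below are hypothetical, and the CUDA branch is guarded so the sketch also runs on CPU:

```python
import torch

def run_inference(model, inputs):
    # Pick the autocast device and a low-precision dtype for it.
    device_type = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device_type == "cuda" else torch.bfloat16
    # no_grad() skips building the autograd graph (no activations kept
    # for backward); autocast runs matmuls in half precision, which
    # roughly halves activation memory.
    with torch.no_grad(), torch.autocast(device_type=device_type, dtype=dtype):
        return model(inputs)

# Toy stand-in for the real model:
model = torch.nn.Linear(8, 4)
out = run_inference(model, torch.randn(2, 8))
```

Because the forward pass runs under no_grad(), the returned tensor has requires_grad=False and holds no graph, so it can be freed immediately once dereferenced.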

torch.cuda.empty_cache() does not free memory that is still referenced by live objects. To actually release GPU memory: del the unused variables, call gc.collect(), and then call torch.cuda.empty_cache().
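Put together, the sequence looks like this; the Linear layer stands in for the real LLM (hypothetical), and the CUDA calls are guarded so the sketch also runs without a GPU:

```python
import gc
import torch

# Stand-in for the model whose weights occupy GPU memory (hypothetical).
model = torch.nn.Linear(1024, 1024)
if torch.cuda.is_available():
    model = model.cuda()

# 1. Drop every Python reference to the model and its tensors.
#    (In a notebook, also watch for hidden refs such as Out[n] and _.)
del model
# 2. Run a garbage-collection pass so even cyclically-referenced
#    tensors become truly unreferenced.
gc.collect()
# 3. Only now can empty_cache() return the freed blocks from PyTorch's
#    caching allocator back to the CUDA driver.
if torch.cuda.is_available():
    torch.cuda.empty_cache()
    # torch.cuda.memory_allocated() should now drop back toward 0.
```

The order matters: empty_cache() can only release blocks the allocator already considers free, which is why the del and gc.collect() steps must come first.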

