
I interrupted the training of a PyTorch model on CUDA, which I run on Windows Subsystem for Linux 2 (WSL2). The dedicated GPU memory of my NVIDIA GeForce RTX 3080 Ti was not flushed.

[screenshot: dedicated GPU memory still allocated]

What I have tried:

  • gc.collect() and torch.cuda.empty_cache() do not resolve the problem (reference); a minimal sketch of this attempt follows the list

  • When running numba.cuda.select_device(0) in order to then call cuda.close(), the notebook hangs (reference)

  • After running nvidia-smi in an attempt to reset the GPU (reference), the command prompt hangs

  • Win + Ctrl + Shift + B to reset the graphics stack in Windows does not help (reference)

  • Restarting the notebook kernel as well as restarting the notebook server does not help

  • A physical reset of the machine is not available
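
For context, here is a minimal sketch of the in-process cleanup attempted above, assuming a working PyTorch CUDA install (the dummy tensor is only an illustrative stand-in for the model and batches). This can only release memory owned by the current, still-running process, which is why it cannot reclaim memory left behind by an interrupted run:

    import gc
    import torch

    # Illustrative stand-in for the model and batches that were on the GPU.
    x = torch.randn(1024, 1024, device="cuda")

    del x                         # drop Python references to the CUDA tensors
    gc.collect()                  # collect any objects that are now unreachable
    torch.cuda.empty_cache()      # hand cached, unused blocks back to the CUDA driver
    # memory_allocated() should drop; memory_reserved() shows what the caching allocator still holds.
    print(torch.cuda.memory_allocated(), torch.cuda.memory_reserved())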

UPDATE:

Running nvidia-smi in the command prompt on Windows (not on WSL2) yields the following output: [screenshot: nvidia-smi output]

1 Answer


I don't know your actual environment, but suppose you are using an Anaconda virtual environment on Windows.

  1. In cmd, run nvidia-smi; it shows the processes currently using the GPU.

[screenshot: nvidia-smi output in cmd]

  2. Check the PID of the Python process (e.g. envs\psychopy\python.exe).
  3. In cmd, run taskkill /f /PID xxxx.

This might help, though you probably don't want to be doing it this way regularly.

If it gets too annoying, you can run the script from a command prompt instead of a notebook; the GPU memory is then flushed automatically when the process exits. A scripted sketch of the lookup-and-kill steps follows.
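
A minimal sketch of scripting those two steps from Python (my own illustration, not part of the answer itself), assuming nvidia-smi and taskkill are on PATH on Windows; gpu_compute_processes and the python.exe filter are illustrative names of my own, so adjust them to your setup:

    import subprocess

    def gpu_compute_processes():
        """Return (pid, process_name) pairs that nvidia-smi reports as compute apps."""
        out = subprocess.run(
            ["nvidia-smi", "--query-compute-apps=pid,process_name", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
        procs = []
        for line in out.strip().splitlines():
            pid, name = [field.strip() for field in line.split(",", 1)]
            if pid.isdigit():                    # skip "[N/A]" entries sometimes reported under WSL2
                procs.append((int(pid), name))
        return procs

    for pid, name in gpu_compute_processes():
        print(pid, name)
        if name.lower().endswith("python.exe"):  # adjust this filter to the process you want to kill
            # equivalent of running: taskkill /f /PID <pid>
            subprocess.run(["taskkill", "/f", "/PID", str(pid)], check=False)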


5 Comments

I ran nvidia-smi in the Windows cmd; a screenshot is in the updated question. I don't find \psychopy\python.exe there, though, nor any similar process that it would make sense to kill. Do you see any?
I only use Windows, and I always see .\python.exe, but it is not showing in your environment. What about running the command in the WSL environment?
I think you are not actually using the GPU from PyTorch; a similar problem is discussed here: stackoverflow.com/questions/15197286/… First check which process is holding your GPU memory.
As I mentioned in the description, when running nvidia-smi on WSL2 the command prompt freezes, so I don't get any info from it. Nevertheless, I have managed to physically reboot the machine, and the problem is solved. Thanks.
Sorry you are having a hard time. Try 'nvidia-smi -r' or 'nvidia-smi --gpu-reset'; maybe that is the last line of defense.
