
Problem: when I run the following command

python -c "import tensorflow as tf; tf.test.is_gpu_available(); print('version :' + tf.__version__)"

Error:

RuntimeError: CUDA runtime implicit initialization on GPU:0 failed. Status: all CUDA-capable devices are busy or unavailable

Details:

WARNING:tensorflow:From <string>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.config.list_physical_devices('GPU') instead.
2021-04-18 21:02:51.839069: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2021-04-18 21:02:51.846775: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2500000000 Hz
2021-04-18 21:02:51.847076: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fc3bc000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-04-18 21:02:51.847104: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2021-04-18 21:02:51.849876: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-04-18 21:02:51.911161: W tensorflow/compiler/xla/service/platform_util.cc:210] unable to create StreamExecutor for CUDA:0: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_UNKNOWN: unknown error
2021-04-18 21:02:51.911285: I tensorflow/compiler/jit/xla_gpu_device.cc:161] Ignoring visible XLA_GPU_JIT device. Device number is 0, reason: Internal: no supported devices found for platform CUDA
2021-04-18 21:02:51.911546: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-18 21:02:51.912210: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:00:07.0 name: GRID T4-4Q computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 3.97GiB deviceMemoryBandwidth: 298.08GiB/s
2021-04-18 21:02:51.912446: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-04-18 21:02:51.914362: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-04-18 21:02:51.916358: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-04-18 21:02:51.916679: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-04-18 21:02:51.918787: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-04-18 21:02:51.919993: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-04-18 21:02:51.924652: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-04-18 21:02:51.924792: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-18 21:02:51.925488: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-18 21:02:51.926100: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2021-04-18 21:02:51.926146: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/miniconda3/envs/py37/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 324, in new_func
    return func(*args, **kwargs)
  File "/home/miniconda3/envs/py37/lib/python3.7/site-packages/tensorflow/python/framework/test_util.py", line 1496, in is_gpu_available
    for local_device in device_lib.list_local_devices():
  File "/home/miniconda3/envs/py37/lib/python3.7/site-packages/tensorflow/python/client/device_lib.py", line 43, in list_local_devices
    _convert(s) for s in _pywrap_device_lib.list_devices(serialized_config)
RuntimeError: CUDA runtime implicit initialization on GPU:0 failed. Status: all CUDA-capable devices are busy or unavailable
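
The deprecation warning above already points at the replacement API. For reference, a minimal sketch of the same check with the non-deprecated call (assuming TF 2.x; this only lists devices and does not by itself prove that a CUDA context can be created):

import tensorflow as tf

# Non-deprecated variant of the one-liner above: list the GPUs TensorFlow can see.
print("version:", tf.__version__)
gpus = tf.config.list_physical_devices('GPU')
print("GPUs visible to TensorFlow:", gpus)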

System information:

OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: cloud server
TensorFlow installed from (source or binary): source
TensorFlow version: 2.2.0
Python version: 3.7.7
Installed using virtualenv? pip? conda?: pip & conda
Bazel version (if compiling from source): 2.0.0
GCC/Compiler version (if compiling from source): 7.5
CUDA/cuDNN version: CUDA 10.1 & cuDNN 7.6.5
GPU model and memory:
00:07.0 VGA compatible controller: NVIDIA Corporation Device 1eb8 (rev a1) (prog-if 00 [VGA controller])
Subsystem: NVIDIA Corporation Device 130e
Physical Slot: 7
Flags: bus master, fast devsel, latency 0, IRQ 37
Memory at fc000000 (32-bit, non-prefetchable) [size=16M]
Memory at e0000000 (64-bit, prefetchable) [size=256M]
Memory at fa000000 (64-bit, non-prefetchable) [size=32M]
I/O ports at c500 [size=128]
Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia


I looked for solutions to this problem, but none of the following solved it:

https://forums.developer.nvidia.com/t/all-cuda-capable-devices-are-busy-or-unavailable-what-is-wrong/112858

https://github.com/tensorflow/tensorflow/issues/41990

Tensorflow-GPU Error: "RuntimeError: CUDA runtime implicit initialization on GPU:0 failed. Status: all CUDA-capable devices are busy or unavailable"

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#recommended-post

https://github.com/tensorflow/tensorflow/issues/48558

https://programmersought.com/article/94034772029/

Comments
  • try running a CUDA sample code like vectorAdd that actually uses the GPU. What result do you get? Commented Apr 18, 2021 at 15:19
  • Ok, I tried running vectorAdd according to this tutorial: olcf.ornl.gov/tutorials/cuda-vector-addition, and I got module: command not found and aprun: command not found. I also tried sudo apt-get install environment-modules but it did not solve the problem. Commented Apr 19, 2021 at 2:28
  • I have run vecAdd.out with ./vecAdd.out and the output was final result: 0.000000 Commented Apr 19, 2021 at 3:43
  • Yes, that splendid tutorial has no error checking. Nearly useless. Run the CUDA sample code that is called vectorAdd (Just like you ran the CUDA sample code called deviceQuery). In any event, the tutorial indicates that the final result should be 1.000, so that code is not working. Basically, you cannot run CUDA codes properly on that machine. It may be a broken CUDA install, a GRID licensing issue, or perhaps something else. Commented Apr 19, 2021 at 4:10
  • 1
    ok, you are absolutely right. I went to /usr/local/cuda-10.1/samples/0_Simple/vectorAdd and ran ./vectorAdd, then I got [Vector addition of 50000 elements] Failed to allocate device vector A (error code all CUDA-capable devices are busy or unavailable)! How can I know whether it's due to a broken CUDA install or a GRID licensing issue or something else? Commented Apr 19, 2021 at 5:44
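
Following up on the comments above, here is a minimal sketch of an equivalent allocation-and-kernel-launch test that stays inside the TensorFlow stack already installed (assumes TF 2.2 as in the question; the exact exception class raised on failure can differ between versions):

import tensorflow as tf

# Try a real allocation and kernel launch on GPU:0, in the spirit of the
# vectorAdd sample test discussed in the comments.
print("GPUs visible:", tf.config.list_physical_devices('GPU'))
try:
    with tf.device('/GPU:0'):
        a = tf.random.uniform((1024, 1024))
        b = tf.random.uniform((1024, 1024))
        c = tf.matmul(a, b)  # forces an actual CUDA kernel launch
    print("GPU matmul OK, checksum:", float(tf.reduce_sum(c)))
except Exception as e:  # deliberately broad: TF raises different error classes here
    print("GPU allocation/launch failed:", e)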

2 Answers


I can confirm the case mentioned in a comment.

I had this problem while working with an Ubuntu VM running on a VMware ESXi host and using a vGPU partition of an Nvidia V100 GPU.

I got the same error. I had already tried changing CUDA versions and installing (via pip) packages built for those specific CUDA versions, but that did NOT solve the issue. The error:

tensorflow.python.framework.errors_impl.InternalError: CUDA runtime implicit initialization on GPU:0 failed. Status: all CUDA-capable devices are busy or unavailable

In my case I had forgotten to set the license server in /etc/nvidia/grid.conf and got exactly the same error, so it was a GRID licensing issue. Fixing the grid config file and rebooting solved the issue.
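
A quick way to check whether the guest actually holds a GRID/vGPU license is to look at the driver's own status report. A hedged sketch (the exact field names printed by nvidia-smi -q vary between driver releases, so the filter below is deliberately loose):

import subprocess

# Dump the full driver status report and keep only licensing-related lines.
out = subprocess.run(["nvidia-smi", "-q"], capture_output=True, text=True).stdout
license_lines = [line.strip() for line in out.splitlines() if "license" in line.lower()]
print("\n".join(license_lines) or "No licensing fields reported (driver may not be a vGPU/GRID build)")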



You can try rebooting the system. Your GPU may still be occupied by a previous run and has not been freed since then.
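
Before rebooting, it can be worth confirming whether anything is actually holding the GPU. A small sketch (assumes nvidia-smi is on the PATH):

import subprocess

# List processes that currently hold a CUDA context on the GPU. An empty list
# combined with the error above points away from "busy" and toward a driver,
# vGPU-licensing, or install problem rather than a leftover process.
out = subprocess.run(
    ["nvidia-smi", "--query-compute-apps=pid,process_name,used_memory",
     "--format=csv,noheader"],
    capture_output=True, text=True,
).stdout.strip()
print(out or "No compute processes are currently using the GPU.")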

