Problem: when I run the following command:
python -c "import tensorflow as tf; tf.test.is_gpu_available(); print('version :' + tf.__version__)"
Error:
RuntimeError: CUDA runtime implicit initialization on GPU:0 failed. Status: all CUDA-capable devices are busy or unavailable
Details:
WARNING:tensorflow:From <string>:1: is_gpu_available (from tensorflow.python.framework.test_util) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.config.list_physical_devices('GPU') instead.
2021-04-18 21:02:51.839069: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2021-04-18 21:02:51.846775: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2500000000 Hz
2021-04-18 21:02:51.847076: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x7fc3bc000b20 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2021-04-18 21:02:51.847104: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2021-04-18 21:02:51.849876: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2021-04-18 21:02:51.911161: W tensorflow/compiler/xla/service/platform_util.cc:210] unable to create StreamExecutor for CUDA:0: failed initializing StreamExecutor for CUDA device ordinal 0: Internal: failed call to cuDevicePrimaryCtxRetain: CUDA_ERROR_UNKNOWN: unknown error
2021-04-18 21:02:51.911285: I tensorflow/compiler/jit/xla_gpu_device.cc:161] Ignoring visible XLA_GPU_JIT device. Device number is 0, reason: Internal: no supported devices found for platform CUDA
2021-04-18 21:02:51.911546: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-18 21:02:51.912210: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1561] Found device 0 with properties:
pciBusID: 0000:00:07.0 name: GRID T4-4Q computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 3.97GiB deviceMemoryBandwidth: 298.08GiB/s
2021-04-18 21:02:51.912446: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
2021-04-18 21:02:51.914362: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10
2021-04-18 21:02:51.916358: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10
2021-04-18 21:02:51.916679: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10
2021-04-18 21:02:51.918787: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10
2021-04-18 21:02:51.919993: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10
2021-04-18 21:02:51.924652: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2021-04-18 21:02:51.924792: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-18 21:02:51.925488: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:981] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-04-18 21:02:51.926100: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
2021-04-18 21:02:51.926146: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.1
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/home/miniconda3/envs/py37/lib/python3.7/site-packages/tensorflow/python/util/deprecation.py", line 324, in new_func
return func(*args, **kwargs)
File "/home/miniconda3/envs/py37/lib/python3.7/site-packages/tensorflow/python/framework/test_util.py", line 1496, in is_gpu_available
for local_device in device_lib.list_local_devices():
File "/home/miniconda3/envs/py37/lib/python3.7/site-packages/tensorflow/python/client/device_lib.py", line 43, in list_local_devices
_convert(s) for s in _pywrap_device_lib.list_devices(serialized_config)
RuntimeError: CUDA runtime implicit initialization on GPU:0 failed. Status: all CUDA-capable devices are busy or unavailable
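As the deprecation warning in the log says, the modern check is tf.config.list_physical_devices('GPU'). For reference, a minimal sketch of the non-deprecated equivalent of the command above (wrapped so it degrades gracefully on a machine without TensorFlow; the helper name describe_gpus is mine, not an API):

```python
def describe_gpus():
    """Return (tf_version, gpu_device_names) using the non-deprecated
    tf.config.list_physical_devices API, or (None, []) if TensorFlow
    is not importable in this environment."""
    try:
        import tensorflow as tf
    except ImportError:
        return None, []
    # Each entry is a PhysicalDevice; .name looks like
    # '/physical_device:GPU:0' when a GPU is visible.
    return tf.__version__, [d.name for d in tf.config.list_physical_devices('GPU')]

version, gpus = describe_gpus()
print("TensorFlow version:", version)
print("GPUs:", gpus)
```

Note that on the machine described here this call would most likely raise the same RuntimeError, since the failure is in CUDA initialization, not in the deprecated wrapper.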
System information:
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 18.04
Mobile device (e.g. iPhone 8, Pixel 2, Samsung Galaxy) if the issue happens on mobile device: No (cloud server)
TensorFlow installed from (source or binary): source
TensorFlow version: 2.2.0
Python version: 3.7.7
Installed using virtualenv? pip? conda?: pip & conda
Bazel version (if compiling from source): 2.0.0
GCC/Compiler version (if compiling from source): 7.5
CUDA/cuDNN version: CUDA 10.1 & cuDNN 7.6.5
GPU model and memory:
00:07.0 VGA compatible controller: NVIDIA Corporation Device 1eb8 (rev a1) (prog-if 00 [VGA controller])
Subsystem: NVIDIA Corporation Device 130e
Physical Slot: 7
Flags: bus master, fast devsel, latency 0, IRQ 37
Memory at fc000000 (32-bit, non-prefetchable) [size=16M]
Memory at e0000000 (64-bit, prefetchable) [size=256M]
Memory at fa000000 (64-bit, non-prefetchable) [size=32M]
I/O ports at c500 [size=128]
Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia
I looked for solutions to this problem and tried the following, but none of them solved it:
https://github.com/tensorflow/tensorflow/issues/41990
https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#recommended-post
Follow-up (from the comments):

Suggestion: Run a CUDA sample such as vectorAdd that actually uses the GPU. What result do you get?

My reply: I got "module: command not found" and "aprun: command not found". I also tried "sudo apt-get install environment-modules", but it did not solve the problem. I then ran ./vecAdd.out and the output was "final result: 0.000000".

Suggestion: Run vectorAdd (just like you ran the CUDA sample code called deviceQuery). In any event, the tutorial indicates that the final result should be 1.000, so that code is not working. Basically, you cannot run CUDA codes properly on that machine. It may be a broken CUDA install, a GRID licensing issue, or perhaps something else.

My reply: I went to /usr/local/cuda-10.1/samples/0_Simple/vectorAdd and ran ./vectorAdd, and I got "[Vector addition of 50000 elements] Failed to allocate device vector A (error code all CUDA-capable devices are busy or unavailable)!" How can I tell whether this is due to a broken CUDA install, a GRID licensing issue, or something else?
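One way to narrow this down (a sketch of my own, not from the thread) is to call the CUDA driver API's cuInit directly via ctypes, bypassing both TensorFlow and the CUDA runtime. The log above shows libcuda.so.1 loading successfully, so if cuInit itself returns a non-zero status, the failure is at the driver / virtual-GPU layer (e.g. GRID licensing) rather than in the CUDA 10.1 toolkit install; cuInit is a real driver entry point, but the helper name cuda_init_status is mine:

```python
import ctypes

def cuda_init_status():
    """Try to initialize the CUDA driver directly.

    Returns the raw cuInit(0) status code (0 == CUDA_SUCCESS), or
    None if libcuda.so.1 cannot be loaded at all (no NVIDIA driver).
    """
    try:
        libcuda = ctypes.CDLL("libcuda.so.1")
    except OSError:
        return None
    # cuInit takes a flags argument that must currently be 0.
    return libcuda.cuInit(0)

print("cuInit(0) status:", cuda_init_status())
```

A non-zero status here with a working libcuda would point below TensorFlow entirely, which is consistent with the failing vectorAdd sample; the meaning of specific non-zero codes is documented in the CUDA Driver API reference.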