I have created a .NET app that uses Microsoft.ML.OnnxRuntime.Gpu for inference. Now I'm trying to deploy it to Azure Kubernetes Service (AKS).
We set up the cluster with a Tesla T4 GPU and confirmed that it is visible:

So we know that the T4 is visible under ID = 0.
Here is the code, which works locally on my Windows machine:
MLContext _mlContext = new();
var estimator = _mlContext.Transforms.ApplyOnnxModel(
    modelFile: _modelFile,
    inputColumnNames: _inputColumnNames,
    outputColumnNames: _outputColumnNames,
    gpuDeviceId: gpuId
);
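For diagnosing this, it may help to log which execution providers the ONNX Runtime native library inside the container actually exposes. A minimal sketch, assuming the Microsoft.ML.OnnxRuntime.Gpu package is referenced (the class name is a placeholder):

```csharp
using System;
using Microsoft.ML.OnnxRuntime;

class ProviderCheck
{
    static void Main()
    {
        // Lists the execution providers compiled into the loaded
        // onnxruntime native library. "CUDAExecutionProvider" must
        // appear here before gpuDeviceId can resolve any device;
        // if only "CPUExecutionProvider" shows up, the GPU build
        // or its CUDA/cuDNN dependencies are not being loaded.
        foreach (var provider in OrtEnv.Instance().GetAvailableProviders())
        {
            Console.WriteLine(provider);
        }
    }
}
```

Running this inside the pod should distinguish "wrong native library / missing CUDA libraries" from "library fine, device not exposed to the container".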
But when we deploy the image (pushed to ACR, running on K8s), we get an exception:
System.InvalidOperationException: GPU with ID 0 is not found.
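On AKS, a container only sees the GPU when the pod explicitly requests the `nvidia.com/gpu` resource (served by the NVIDIA device plugin on the GPU node pool). A sketch of the resource request we would expect in the pod spec, with placeholder names and image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: onnx-inference                                # placeholder name
spec:
  containers:
    - name: app                                       # placeholder name
      image: myregistry.azurecr.io/onnx-app:latest    # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1   # without this, no GPU device is mounted into the container
```

Without this limit, nvidia-smi on the node can show the T4 while the container itself has no device, which would match the "GPU with ID 0 is not found" error.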
Some additional information that may help:
- My local machine's nvidia-smi output:

Libraries used:

<PackageVersion Include="Microsoft.ML" Version="3.0.1" />
<PackageVersion Include="Microsoft.ML.ImageAnalytics" Version="3.0.1" />
<PackageVersion Include="Microsoft.ML.OnnxRuntime" Version="1.20.1" />
<PackageVersion Include="Microsoft.ML.OnnxRuntime.Gpu" Version="1.20.1" />
<PackageVersion Include="Microsoft.ML.OnnxTransformer" Version="3.0.1" />

Base image used in our Dockerfile:
nvidia/cuda:12.3.2-runtime-ubuntu22.04

My local nvcc --version output:

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Wed_Oct_30_01:18:48_Pacific_Daylight_Time_2024
Cuda compilation tools, release 12.6, V12.6.85
Build cuda_12.6.r12.6/compiler.35059454_0
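One thing I am now checking: the CUDA 12 builds of Microsoft.ML.OnnxRuntime.Gpu 1.20.1 expect cuDNN 9 at runtime, and the plain `runtime` flavor of the nvidia/cuda image does not ship cuDNN (note also that my local toolkit is 12.6 while the image is 12.3.2). A hedged Dockerfile sketch of switching to a cuDNN-bundled base image (the exact tag is an assumption to verify on Docker Hub):

```dockerfile
# Assumed variant of the base image that bundles cuDNN 9 for CUDA 12;
# verify this exact tag exists before relying on it.
FROM nvidia/cuda:12.3.2-cudnn9-runtime-ubuntu22.04

# ... rest of the existing Dockerfile unchanged (copy app, ENTRYPOINT) ...
```

If the CUDA execution provider fails to load its cuDNN dependency, ONNX Runtime silently falls back to a state where no GPU device can be found, which could also explain the exception.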