
I have created a .NET app that uses Microsoft.ML.OnnxRuntime.Gpu for inference. Now I'm trying to integrate it with Azure Kubernetes Service (AKS).

We have made the setup with Tesla T4 GPU and we confirmed it's visible:

(screenshot of the GPU listing on the node omitted)

So we know that T4 is visible under ID = 0.
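One thing worth checking first: seeing the T4 on the node is not enough by itself. The pod also has to request the GPU resource, otherwise the container is scheduled without any `/dev/nvidia*` devices and ONNX Runtime cannot find GPU 0. A minimal sketch (names and image are placeholders; this assumes the NVIDIA device plugin is running on the GPU node pool):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: onnx-gpu-app                                 # hypothetical name
spec:
  containers:
    - name: app
      image: myregistry.azurecr.io/onnx-app:latest   # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1   # without this limit the container gets no GPU device
```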

This is basically my code that works locally on my Windows machine:

MLContext _mlContext = new();

var estimator = _mlContext.Transforms.ApplyOnnxModel(
    modelFile: _modelFile,
    inputColumnNames: _inputColumnNames,
    outputColumnNames: _outputColumnNames,   
    gpuDeviceId: gpuId
);
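As a side note, the same `ApplyOnnxModel` overload accepts a `fallbackToCpu` flag, which lets the service start on CPU instead of throwing while the GPU problem is being investigated. A sketch reusing the fields from the snippet above:

```csharp
// Sketch: same pipeline as above, but fall back to CPU if the requested
// GPU cannot be initialized, so the service stays up while the GPU
// visibility issue is debugged.
var estimator = _mlContext.Transforms.ApplyOnnxModel(
    modelFile: _modelFile,
    inputColumnNames: _inputColumnNames,
    outputColumnNames: _outputColumnNames,
    gpuDeviceId: gpuId,
    fallbackToCpu: true   // don't throw if GPU `gpuId` is unavailable
);
```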

but when we deploy it (image pushed to ACR, running on AKS), we get an exception:

System.InvalidOperationException: GPU with ID 0 is not found.
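Before digging into ML.NET itself, it can help to check from a shell inside the running pod whether the GPU device and the libraries that onnxruntime-gpu loads are visible at all. A diagnostic sketch (messages are illustrative):

```shell
#!/bin/sh
# Diagnostic sketch: run inside the deployed container.

# 1. Is an NVIDIA device node exposed to the container at all?
if ls /dev/nvidia0 >/dev/null 2>&1; then
    echo "GPU device node present"
else
    echo "no /dev/nvidia0: pod may be missing an nvidia.com/gpu resource request"
fi

# 2. Can the cuDNN library that onnxruntime-gpu needs be resolved?
if ldconfig -p 2>/dev/null | grep -q libcudnn; then
    echo "libcudnn found"
else
    echo "libcudnn missing: the base image may need a cuDNN variant"
fi
```

If the first check fails, the problem is at the Kubernetes/device-plugin level; if only the second fails, it is the container image.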

Here is some additional information that may help:

  • My local machine nvidia-smi log:

    (screenshot of local nvidia-smi output omitted)

  • libraries used:

        <PackageVersion Include="Microsoft.ML" Version="3.0.1" />
        <PackageVersion Include="Microsoft.ML.ImageAnalytics" Version="3.0.1" />
        <PackageVersion Include="Microsoft.ML.OnnxRuntime" Version="1.20.1" />
        <PackageVersion Include="Microsoft.ML.OnnxRuntime.Gpu" Version="1.20.1" />
        <PackageVersion Include="Microsoft.ML.OnnxTransformer" Version="3.0.1" />
    
  • Image used in our Dockerfile:

    nvidia/cuda:12.3.2-runtime-ubuntu22.04
    
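Note that the plain `-runtime` CUDA images do not ship cuDNN, which the CUDA 12 builds of Microsoft.ML.OnnxRuntime.Gpu 1.20.x require (cuDNN 9, per the ONNX Runtime CUDA requirements), so a cuDNN variant of the base image may be needed. A sketch (the exact tag should be verified on Docker Hub, and the .NET runtime installation is abbreviated):

```dockerfile
# Sketch: CUDA base image with cuDNN included; verify the tag on Docker Hub.
FROM nvidia/cuda:12.3.2-cudnn9-runtime-ubuntu22.04

# ... install the .NET runtime and copy the published app here ...

ENTRYPOINT ["dotnet", "MyApp.dll"]   # placeholder assembly name
```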
  • My local `nvcc --version` output:

        nvcc: NVIDIA (R) Cuda compiler driver
        Copyright (c) 2005-2024 NVIDIA Corporation
        Built on Wed_Oct_30_01:18:48_Pacific_Daylight_Time_2024
        Cuda compilation tools, release 12.6, V12.6.85
        Build cuda_12.6.r12.6/compiler.35059454_0

  • Comment (Nov 27, 18:52): Wonder if Docker has been fully configured; check the toolkit install guide.
