
I am following this example from AWS, https://github.com/aws-samples/sagemaker-multi-model-endpoint-tensorflow-computer-vision/blob/main/multi-model-endpoint-tensorflow-cv.ipynb, to apply the same workflow with two pre-trained models (trained outside of SageMaker).

But when I do the following, the logs say that the models can't be found:

import time
from datetime import datetime

import boto3
import sagemaker
from sagemaker import get_execution_role
from sagemaker.tensorflow.serving import TensorFlowModel
from sagemaker.multidatamodel import MultiDataModel

model_data_prefix = f's3://{BUCKET}/{PREFIX}/mme/'
output = f's3://{BUCKET}/{PREFIX}/mme/test.tar.gz'

modele = TensorFlowModel(model_data=output,
                         role=role,
                         image_uri=IMAGE_URI)

mme = MultiDataModel(name=f'mme-tensorflow-{current_time}',
                     model_data_prefix=model_data_prefix,
                     model=modele,
                     sagemaker_session=sagemaker_session)

predictor = mme.deploy(initial_instance_count=1,
                       instance_type='ml.m5.2xlarge',
                       endpoint_name=f'mme-tensorflow-{current_time}')

When I give an image as input to predict, I get this message:

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message "<html>
  <head>
    <title>Internal Server Error</title>
  </head>
  <body>
    <h1><p>Internal Server Error</p></h1>
    
  </body>
</html>
".

Logs give:

Could not find base path /opt/ml/models/.../model for servable ...

What did I miss?

2 Answers


In the sample notebook, the model is trained within SageMaker, so it is created with certain environment variables, like "SAGEMAKER_PROGRAM" (I think; need to check the documentation), whose value is set to the entry point script.

But when you create the model from models trained outside SageMaker, you need to add those environment variables yourself.

Without an entry point script, SageMaker has no way of knowing what to do with the request.
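As a minimal sketch of the extra pieces to supply (the script name `inference.py` and the submit directory below are illustrative placeholders, not taken from the question):

```python
# Sketch only: 'inference.py' is a hypothetical entry point script name.
entry_point = 'inference.py'

# Environment variables that SageMaker sets automatically for models trained
# inside SageMaker, and which you must supply yourself for external models:
env = {
    'SAGEMAKER_PROGRAM': entry_point,  # script that handles inference requests
    'SAGEMAKER_SUBMIT_DIRECTORY': '/opt/ml/model/code',  # where the code lives in the container
}

# These would then be added to the model definition from the question, e.g.:
# modele = TensorFlowModel(model_data=output, role=role, image_uri=IMAGE_URI,
#                          entry_point=entry_point, env=env)
```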



SageMaker supports deploying multiple deep learning models on GPUs using the NVIDIA Triton Inference Server. You can bring models trained outside SageMaker and deploy them with a SageMaker MME using a Triton model configuration and model repository. Refer to the documentation, examples, and blog posts to get started.
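As a sketch of what such a Triton model repository might look like for a TensorFlow SavedModel (the model name, tensor names, and shapes below are placeholders, not values from the question):

```
model_repository/
└── my_model/                 # placeholder model name
    ├── config.pbtxt
    └── 1/                    # numeric version directory
        └── model.savedmodel/
            └── saved_model.pb

# config.pbtxt
name: "my_model"
platform: "tensorflow_savedmodel"
max_batch_size: 8
input [
  { name: "input_1", data_type: TYPE_FP32, dims: [ 224, 224, 3 ] }
]
output [
  { name: "predictions", data_type: TYPE_FP32, dims: [ 1000 ] }
]
```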

