ValueError while deploying tensorflow model to Amazon SageMaker

Question

I want to deploy my trained TensorFlow model to Amazon SageMaker. I am following the official guide at https://aws.amazon.com/blogs/machine-learning/deploy-trained-keras-or-tensorflow-models-using-amazon-sagemaker/ and deploying from a Jupyter notebook.

But when I run this code:

predictor = sagemaker_model.deploy(initial_instance_count=1, instance_type='ml.t2.medium')

It gives me the following error message:

ValueError: Error hosting endpoint sagemaker-tensorflow-2019-08-07-22-57-59-547: Failed Reason: The image '520713654638.dkr.ecr.us-west-1.amazonaws.com/sagemaker-tensorflow:1.12-cpu-py3 ' does not exist.

The tutorial does not mention creating an image, so I do not know what to do. This is my full code:

import boto3, re
from sagemaker import get_execution_role

role = get_execution_role()

# make a tar ball of the model data files
import tarfile
with tarfile.open('model.tar.gz', mode='w:gz') as archive:
    archive.add('export', recursive=True)

# upload the tarball to the session's default S3 bucket (created if it does not exist yet)
import sagemaker

sagemaker_session = sagemaker.Session()
inputs = sagemaker_session.upload_data(path='model.tar.gz', key_prefix='model')

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '1.12',
                                  entry_point = 'train.py',
                                  py_version='py3')

%%time
# deployment fails here with the error message shown above
predictor = sagemaker_model.deploy(initial_instance_count=1,
                                   instance_type='ml.m4.xlarge')
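
Before deploying, it can help to confirm what the tarball built above actually contains. The following is a minimal sketch using only the standard library; the helper name `list_archive_members` is hypothetical and not part of the SageMaker SDK.

```python
import os
import tarfile


def list_archive_members(path):
    """Return the member names stored in a gzip-compressed tarball."""
    with tarfile.open(path, mode="r:gz") as archive:
        return archive.getnames()


# Print the contents of model.tar.gz if it exists in the working directory.
if os.path.exists("model.tar.gz"):
    for name in list_archive_members("model.tar.gz"):
        print(name)
```

Every path printed should start with `export`, since that is the directory added to the archive in the question.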

For future readers: in my case I had to set py_version='py2' to get it to work.
– Pramesh Bajracharya, Commented Dec 12, 2019 at 6:22

Ujjwal Bhardwaj · Accepted Answer · 2019-08-09 11:18:47Z

https://github.com/aws/sagemaker-python-sdk/issues/912#issuecomment-510226311

As mentioned in that issue comment:

Python 3 isn't supported using the TensorFlowModel object, as the container uses the TensorFlow serving api library in conjunction with the GRPC client to handle making inferences, however the TensorFlow serving api isn't supported in Python 3 officially, so there are only Python 2 versions of the containers when using the TensorFlowModel object.
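
The image named in the question's error message reflects this. As a rough illustration, the sketch below splits that URI into its parts. The tag appears to be assembled from `framework_version`, the device type, and `py_version`, which is an assumption based on the tag's shape rather than something stated in the issue. The helper `parse_ecr_image` is hypothetical.

```python
import re

# ECR image URIs have the form
# <account>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>
ECR_IMAGE_RE = re.compile(
    r"^(?P<account>\d+)\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:]+):(?P<tag>\S+)$"
)


def parse_ecr_image(uri):
    """Split an ECR image URI into account, region, repository, and tag."""
    match = ECR_IMAGE_RE.match(uri.strip())
    if match is None:
        raise ValueError(f"not an ECR image URI: {uri!r}")
    return match.groupdict()


# The image URI copied from the error message (including its trailing space).
image = parse_ecr_image(
    "520713654638.dkr.ecr.us-west-1.amazonaws.com/sagemaker-tensorflow:1.12-cpu-py3 "
)
print(image["repository"], image["tag"])  # sagemaker-tensorflow 1.12-cpu-py3
```

The `py3` suffix of the tag is the part that has no matching container, which is consistent with the explanation above and with the comment that `py_version='py2'` worked.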

If you need Python 3 then you will need to use the Model object defined in #2 above. The inference script format will change if you need to handle pre and post processing; see https://github.com/aws/sagemaker-tensorflow-serving-container#prepost-processing
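
A minimal sketch of that alternative follows. It assumes a 1.x release of the SageMaker Python SDK, where the TensorFlow Serving model class is importable as `sagemaker.tensorflow.serving.Model` (in 2.x releases I believe the same class is exposed as `sagemaker.tensorflow.TensorFlowModel`). It also assumes that this is the object "#2" in the quoted issue refers to. The helper names and the S3 key are illustrative, taken from the question's code rather than from the issue.

```python
def serving_model_kwargs(bucket, role, framework_version="1.12"):
    """Build constructor arguments for the TensorFlow Serving Model.

    No py_version is passed: the serving container is not selected by
    Python version the way the TensorFlowModel container is.
    """
    return {
        "model_data": f"s3://{bucket}/model/model.tar.gz",
        "role": role,
        "framework_version": framework_version,
    }


def deploy_with_serving_model(bucket, role, instance_type="ml.m4.xlarge"):
    """Deploy the uploaded tarball using the TensorFlow Serving container."""
    # Imported lazily; this import path is an assumption tied to SDK 1.x.
    from sagemaker.tensorflow.serving import Model

    model = Model(**serving_model_kwargs(bucket, role))
    return model.deploy(initial_instance_count=1, instance_type=instance_type)
```

In the question's notebook this would be called as `deploy_with_serving_model(sagemaker_session.default_bucket(), role)`.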

2 Comments

Bryan Yuan Over a year ago

Thank you very much! After I removed py_version='py3', the deployment starts, but then I get this error:

UnexpectedStatusException: Error hosting endpoint sagemaker-tensorflow-2019-08-09-16-29-57-306: Failed. Reason:  The primary container for production variant AllTraffic did not pass the ping health check. Please check CloudWatch logs for this endpoint.

sanjams Over a year ago

That's a step in the right direction, but as the error says, I would check the endpoint's logs in CloudWatch. If you cannot work out what is going on, paste the logs here and maybe we can help. Or try reaching out to support.
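
For reference, the endpoint's logs can also be read from a notebook. This is a minimal sketch, assuming the endpoint writes to the usual `/aws/sagemaker/Endpoints/<endpoint-name>` log group and that the notebook's role is allowed to read CloudWatch Logs. The helper names are hypothetical.

```python
def endpoint_log_group(endpoint_name):
    """Return the CloudWatch log group name for a SageMaker endpoint."""
    return "/aws/sagemaker/Endpoints/" + endpoint_name


def tail_endpoint_logs(endpoint_name, limit=50):
    """Return the latest messages from the endpoint's newest log stream."""
    import boto3  # imported lazily so endpoint_log_group works without boto3

    logs = boto3.client("logs")
    group = endpoint_log_group(endpoint_name)

    # Find the log stream that received events most recently.
    streams = logs.describe_log_streams(
        logGroupName=group,
        orderBy="LastEventTime",
        descending=True,
        limit=1,
    )["logStreams"]
    if not streams:
        return []

    # Read the newest events from that stream.
    events = logs.get_log_events(
        logGroupName=group,
        logStreamName=streams[0]["logStreamName"],
        limit=limit,
        startFromHead=False,
    )["events"]
    return [event["message"] for event in events]
```

Calling `tail_endpoint_logs('sagemaker-tensorflow-2019-08-09-16-29-57-306')` with the endpoint name from the error message would return the most recent container output, which usually shows why the ping health check failed.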
