
This seems to be a tricky thing to do, as I haven't found much documentation for it. I'm trying to deploy a Hugging Face pre-trained model for NLU to a SageMaker endpoint. Naturally, I don't want to do this manually; I'd like to automate it through CloudFormation. I found a somewhat useful article on how to deploy, but the name of the training model is confusing and I don't know where I would find the right name for the model I want to deploy, or where I would put that name (I want to deploy an all-MiniLM-L6-v2 model).

Is this possible to do? Do I need to deploy a container? If so, how do I set up the container to process requests and return the text embeddings from the model? I've looked into doing this with just a Lambda function (which would satisfy the automated deployment process), but the packages I need greatly exceed the 250 MB limit for a Lambda function plus its layers.

How do I deploy an endpoint from CloudFormation? Does anyone have experience doing this? If so, please share your wisdom.

2 Answers


To anyone curious, this is how I ended up solving this issue:

I ran a Jupyter notebook locally to create the model artifact. Once complete, I packaged the model artifact into a tar.gz file.

from transformers import AutoModel, AutoTokenizer
from os import makedirs

saved_model_dir = 'saved_model_dir'
makedirs(saved_model_dir, exist_ok=True)

# models were obtained from https://huggingface.co/models
tokenizer = AutoTokenizer.from_pretrained('sentence-transformers/multi-qa-MiniLM-L6-cos-v1')
model = AutoModel.from_pretrained('sentence-transformers/multi-qa-MiniLM-L6-cos-v1')

tokenizer.save_pretrained(saved_model_dir)
model.save_pretrained(saved_model_dir)
Then, from a shell, package the saved files into the artifact:

cd saved_model_dir && tar czvf ../model.tar.gz *
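The same packaging step can also be scripted with Python's standard tarfile module instead of shelling out (a sketch; it assumes saved_model_dir holds the files written by save_pretrained above):

```python
import tarfile
from os import makedirs
from pathlib import Path

saved_model_dir = 'saved_model_dir'
makedirs(saved_model_dir, exist_ok=True)  # already populated by save_pretrained above

# SageMaker expects the model files at the root of the archive, so add
# each file by its bare name instead of its full path.
with tarfile.open('model.tar.gz', 'w:gz') as tar:
    for path in Path(saved_model_dir).iterdir():
        tar.add(path, arcname=path.name)
```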

I included a script in my pipeline to then upload that artifact to S3.

aws s3 cp path/to/model.tar.gz s3://bucket-name/prefix

I also created a CloudFormation template that would stand up my SageMaker resources. The tricky part of this was finding a container image to use, and a colleague was able to point me to this repo that contained a massive list of AWS-maintained container images for deep learning and inference. From there, I just needed to select the one that fit my needs.

Resources:
  SageMakerModel:
    Type: AWS::SageMaker::Model
    Properties:
      PrimaryContainer:
        Image: 763104351884.dkr.ecr.us-east-1.amazonaws.com/pytorch-inference:1.12.0-cpu-py38-ubuntu20.04-sagemaker # image resource found at https://github.com/aws/deep-learning-containers/blob/master/available_images.md
        Mode: SingleModel
        ModelDataUrl: s3://path/to/model.tar.gz
      ExecutionRoleArn: 
      ModelName: inference-model

  SageMakerEndpointConfig:
    Type: AWS::SageMaker::EndpointConfig
    Properties:
      EndpointConfigName: endpoint-config-name
      ProductionVariants:
        - ModelName: !GetAtt SageMakerModel.ModelName
          InitialInstanceCount: 1
          InstanceType: ml.t2.medium
          VariantName: dev
  
  SageMakerEndpoint:
    Type: AWS::SageMaker::Endpoint
    Properties:
      EndpointName: endpoint-name
      EndpointConfigName: !GetAtt SageMakerEndpointConfig.EndpointConfigName
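The hard-coded names above could be lifted into template Parameters so the same stack can serve multiple environments (a sketch; the parameter names here are my own, not from the original template):

```yaml
Parameters:
  ModelDataUrl:
    Type: String
    Description: S3 URI of the model.tar.gz artifact
  VariantName:
    Type: String
    Default: dev

# then, in the resources above, reference them with:
#   ModelDataUrl: !Ref ModelDataUrl
#   VariantName: !Ref VariantName
```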

Once the PyTorch model is created locally, this solution automates provisioning and deploying a SageMaker endpoint for inference. If I need to switch the model, I just re-run my notebook code, which overwrites the existing model artifact. On redeploy, my pipeline re-uploads the artifact to S3, the existing SageMaker resources are updated, and the endpoint begins serving the new model.
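One detail worth noting for anyone returning embeddings this way: AutoModel outputs token-level hidden states, not sentence embeddings, so the inference handler typically mean-pools the token vectors using the attention mask (for sentence-transformers models like MiniLM, mean pooling is the intended readout). The pooling itself is just a masked average; here is a minimal sketch in plain Python (the handler wiring itself depends on the container's serving stack and is not shown):

```python
def mean_pool(token_embeddings, attention_mask):
    """Masked mean over the token axis.

    token_embeddings: list of token vectors, shape [seq_len][dim]
    attention_mask:   list of 0/1 ints, length seq_len (1 = real token)
    """
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, mask in zip(token_embeddings, attention_mask):
        if mask:
            count += 1
            for i, value in enumerate(vec):
                summed[i] += value
    return [s / count for s in summed]

# Padded positions (mask 0) are excluded from the average.
embedding = mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0])
# embedding == [2.0, 3.0]
```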

This may not be the most elegant solution out there, so any suggestions or pointers would be much appreciated!



It's a long shot, but here are the steps you can follow.

  1. Create a model.tar.gz following the instructions below; you should download the PyTorch models from the Hugging Face Hub and use those. You can also specify dependencies in a requirements.txt file, which is especially useful for sentence-transformers. https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/using_pytorch.html#bring-your-own-model

  2. Once you have the model, you can create the CloudFormation resources as mentioned in the blog post you shared.
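For step 1, a minimal requirements.txt bundled alongside the model might look like this (the exact packages and pins are assumptions; match them to your container's framework version):

```
sentence-transformers
transformers
```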

1 Comment

Thanks for the comment! I didn't follow it entirely, but it helped me find a solution that works for my deployment process.
