Transforming Legacy Models: Registering Pre-existing Models in SageMaker Model Registry
How to efficiently register your pre-trained models using the SageMaker SDK to prepare for deployment on SageMaker Endpoints
Introduction
As our document categorization service has evolved over the past six years, we've used AWS SageMaker to train our three core models: an image model, a text model, and a sequence model. Initially, we used SageMaker for training and AWS Lambda for inference, storing our model artifacts on S3. However, we've been curious about using SageMaker endpoints for potentially better performance and cost efficiency.
This blog post walks you through creating a new model version in the SageMaker Model Registry from existing S3 model artifacts, even if you haven't used the end-to-end SageMaker (Pipelines) workflow for this before.
Understanding SageMaker Model Registry Requirements
A model version in a model group within SageMaker's Model Registry consists of two main components:
A Docker Image for Inference: This contains the runtime environment.
Model Artifacts: These are stored in an S3 tarball, along with any custom code required for model loading and inference.
Let's see how we bring these together with model artifacts that are lying around on S3.
Step-by-Step Guide
1. Retrieve the Docker Image URI
The first step is to determine the appropriate Docker image for your model. The SageMaker SDK provides the convenient helper image_uris.retrieve(...) to do just that. You just need to know which framework and version your model was trained with.
from sagemaker import image_uris
image_uris.retrieve(
    framework='sklearn',
    region='eu-central-1',
    version='1.0-1',
    image_scope='inference'
)
For our text model, the correct image URI turns out to be 492215442770.dkr.ecr.eu-central-1.amazonaws.com/sagemaker-scikit-learn:1.0-1-cpu-py3.
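The same helper works for our other models; only the framework arguments change. As a sketch, assuming the image model had been trained with PyTorch 1.13 (framework, version, and instance type here are assumptions for illustration, not something implied by the example above), the lookup could look like this:

from sagemaker import image_uris

image_uris.retrieve(
    framework='pytorch',
    region='eu-central-1',
    version='1.13',
    py_version='py39',
    image_scope='inference',
    instance_type='ml.m5.large'  # assumed instance type; lets the helper pick the CPU (rather than GPU) image variant
)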
2. Prepare the Model Artifacts
Next, you'll need to bundle your model artifacts into a tarball. Download your model from S3, compress it into a .tar.gz file, and upload it back to S3:
# the name of the *.pkl file will be important when you load the model (see step 4)
tar -czvf model.tar.gz text_model.pkl
aws s3 cp model.tar.gz s3://my-models/text/model.tar.gz
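If you would rather stay in Python, for example inside a notebook, the same bundling and upload can be done with the standard library and boto3. This is just a sketch mirroring the shell commands above, reusing the bucket and file names from this example:

import tarfile
import boto3

# bundle the pickled model into a tarball (the file name matters for model_fn in step 4)
with tarfile.open('model.tar.gz', 'w:gz') as tar:
    tar.add('text_model.pkl')

# upload the tarball to the same S3 location referenced in step 3
boto3.client('s3').upload_file('model.tar.gz', 'my-models', 'text/model.tar.gz')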
3. Define the SageMaker Model
Using the SageMaker SDK, define your model by specifying the S3 URI of the tarball, the image URI, and an execution role:
from sagemaker.sklearn import SKLearnModel
model = SKLearnModel(
    model_data="s3://my-models/text/model.tar.gz",
    entry_point="svm.py",
    role="arn:aws:iam::0123456789:role/service-role/AmazonSageMaker-ExecutionRole",  # use the role that gets created when creating a SageMaker Domain
    image_uri="492215442770.dkr.ecr.eu-central-1.amazonaws.com/sagemaker-scikit-learn:1.0-1-cpu-py3"
)
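If you run this inside SageMaker Studio or a SageMaker notebook instance, you don't have to hard-code the role ARN; the SDK can resolve it for you (outside of SageMaker this lookup does not work, and you have to pass the ARN explicitly):

from sagemaker import get_execution_role

# resolves to the execution role attached to the current Studio / notebook environment
role = get_execution_role()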
4. Create the Entry Point Script
The SageMaker Scikit-learn model server loads the model at startup and breaks request handling into three steps:
input processing,
prediction, and
output processing.
Therefore, your entry point script, or model script (ours is called svm.py, because our text model is an SVM), can define the following functions to customize each step:
model_fn: Loads the model.
input_fn: Takes request data and deserializes the data into an object for prediction.
predict_fn: Takes the deserialized request object and performs inference against the loaded model.
output_fn: Takes the result of prediction and serializes this according to the response content type.
Here's an example structure of our svm.py entry point script:
import os
import joblib

def model_fn(model_dir):
    # use the name of the file you provided in the model.tar.gz (step 2)
    model_file = 'text_model.pkl'
    with open(os.path.join(model_dir, model_file), 'rb') as f:
        model = joblib.load(f)
    return model

def predict_fn(input_data, model):
    return model.predict_proba(input_data)
Since we are not transforming the input or the output, we do not need to define input_fn and output_fn; the container's default (de)serialization is used instead.
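For completeness, here is a rough sketch of what custom input_fn and output_fn could look like if we did need them, for example to accept and return JSON; this is illustrative and not part of our actual svm.py:

import json

def input_fn(request_body, request_content_type):
    # deserialize the JSON request body into something predict_fn can work with
    if request_content_type == 'application/json':
        return json.loads(request_body)
    raise ValueError(f'Unsupported content type: {request_content_type}')

def output_fn(prediction, response_content_type):
    # serialize the prediction (here: a numpy array of class probabilities) back to JSON
    return json.dumps(prediction.tolist())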
5. Register the Model Version
Finally, register your model version in the SageMaker Model Registry:
model.register(
    model_package_group_name="svm-model",
    content_types=["application/json"],
    response_types=["application/json"],
    inference_instances=["ml.t2.medium"],  # defines what instance types are valid for later endpoint deployment
    approval_status="PendingManualApproval"
)
Hint: If you do not have a model package group yet, you can create one using the AWS CLI:
aws sagemaker create-model-package-group --model-package-group-name svm-model
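To double-check that the registration worked, you can list the versions in the model package group with boto3 (the group name is the one used above; everything else is generic):

import boto3

sm = boto3.client('sagemaker')

# list the model versions registered in the "svm-model" group
response = sm.list_model_packages(ModelPackageGroupName='svm-model')
for package in response['ModelPackageSummaryList']:
    print(package['ModelPackageArn'], package['ModelApprovalStatus'])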
Conclusion
Transitioning from AWS Lambda to SageMaker endpoints for inference starts with registering your existing model artifacts in SageMaker's Model Registry. By following the steps outlined above, you can efficiently create new model versions and set yourself up to leverage SageMaker endpoints for potentially improved performance and cost benefits.
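As an outlook: once a model version has been approved (in the SageMaker console or via the API), it can be deployed to an endpoint straight from the registry. Here is a minimal sketch, assuming the model package ARN below is a placeholder for the version registered in step 5 and the instance type is one of the inference_instances declared there:

from sagemaker import ModelPackage

# reference the registered (and approved) model version by its ARN
mp = ModelPackage(
    role='arn:aws:iam::0123456789:role/service-role/AmazonSageMaker-ExecutionRole',
    model_package_arn='arn:aws:sagemaker:eu-central-1:0123456789:model-package/svm-model/1'
)
mp.deploy(initial_instance_count=1, instance_type='ml.t2.medium')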
With this approach, your existing models become first-class citizens in SageMaker's model management and deployment workflow, without having to retrain them through SageMaker Pipelines.