Transforming Legacy Models: Registering Pre-existing Models in SageMaker Model Registry
How to efficiently register your pre-trained models using the SageMaker SDK to prepare for deployment on SageMaker Endpoints
Introduction
As our document categorization service has evolved over the past six years, we've used AWS SageMaker to train our three core models: an image model, a text model, and a sequence model. Initially, we used SageMaker for training and AWS Lambda for inference, storing our model artifacts on S3. However, we've been curious about using SageMaker endpoints for potentially better performance and cost efficiency.
This blog post walks you through creating a new model version in the SageMaker Model Registry from existing S3 model artifacts, even if you haven't used the end-to-end SageMaker (Pipelines) workflow for this before.
Understanding SageMaker Model Registry Requirements
A model version in a model group within SageMaker's Model Registry consists of two main components:
A Docker Image for Inference: This contains the runtime environment.
Model Artifacts: These are stored in an S3 tarball, along with any custom code required for model loading and inference.
Let's see how we bring these together with model artifacts that are lying around on S3.
Step-by-Step Guide
1. Retrieve the Docker Image URI
The first step is to determine the appropriate Docker image for your model. The SageMaker SDK provides the convenient helper image_uris.retrieve(...) to do just that. You just need to know which framework and version your model was trained with.
from sagemaker import image_uris
image_uris.retrieve(
    framework='sklearn',
    region='eu-central-1',
    version='1.0-1',
    image_scope='inference'
)
For our text model, the correct image URI turns out to be 492215442770.dkr.ecr.eu-central-1.amazonaws.com/sagemaker-scikit-learn:1.0-1-cpu-py3.
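The same helper works for our other models; only the framework arguments change. As a sketch, assuming the image model had been trained with PyTorch 1.13 (framework, version, and instance type here are assumptions for illustration, not something implied by the example above), the lookup could look like this:

from sagemaker import image_uris

image_uris.retrieve(
    framework='pytorch',
    region='eu-central-1',
    version='1.13',
    py_version='py39',
    image_scope='inference',
    instance_type='ml.m5.large'  # assumed instance type; lets the helper pick the CPU (rather than GPU) image variant
)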
2. Prepare the Model Artifacts
Next, you'll need to bundle your model artifacts into a tarball. Download your model from S3, compress it into a .tar.gz file, and upload it back to S3:
# the name of the *.pkl file will be important when you load the model (see step 4)
tar -czvf model.tar.gz text_model.pkl
aws s3 cp model.tar.gz s3://my-models/text/model.tar.gz
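If you would rather stay in Python, for example inside a notebook, the same bundling and upload can be done with the standard library and boto3. This is just a sketch mirroring the shell commands above, reusing the bucket and file names from this example:

import tarfile
import boto3

# bundle the pickled model into a tarball (the file name matters for model_fn in step 4)
with tarfile.open('model.tar.gz', 'w:gz') as tar:
    tar.add('text_model.pkl')

# upload the tarball to the same S3 location referenced in step 3
boto3.client('s3').upload_file('model.tar.gz', 'my-models', 'text/model.tar.gz')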
3. Define the SageMaker Model
Using the SageMaker SDK, define your model by specifying the S3 URI of the tarball, the image URI, and an execution role:
from sagemaker.sklearn import SKLearnModel
model = SKLearnModel(
    model_data="s3://my-models/text/model.tar.gz",
    entry_point="svm.py",
    role="arn:aws:iam::0123456789:role/service-role/AmazonSageMaker-ExecutionRole",  # use the role that gets created when creating a SageMaker Domain
    image_uri="492215442770.dkr.ecr.eu-central-1.amazonaws.com/sagemaker-scikit-learn:1.0-1-cpu-py3"
)
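If you run this inside SageMaker Studio or a SageMaker notebook instance, you don't have to hard-code the role ARN; the SDK can resolve it for you (outside of SageMaker this lookup does not work, and you have to pass the ARN explicitly):

from sagemaker import get_execution_role

# resolves to the execution role attached to the current Studio / notebook environment
role = get_execution_role()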
4. Create the Entry Point Script
The SageMaker Scikit-learn model server loads the model at startup and breaks request handling into three steps:
input processing,
prediction, and
output processing.
Therefore, your entry point script, or model script (ours is called svm.py, because our text model is an SVM), can define the following functions to customize each step:
model_fn: Loads the model.
input_fn: Takes request data and deserializes the data into an object for prediction.
predict_fn: Takes the deserialized request object and performs inference against the loaded model.
output_fn: Takes the result of prediction and serializes this according to the response content type.
Here's an example structure of our svm.py entry point script:
import os
import joblib

def model_fn(model_dir):
    # use the name of the file you provided in the model.tar.gz (step 2)
    model_file = 'text_model.pkl'
    with open(os.path.join(model_dir, model_file), 'rb') as f:
        model = joblib.load(f)
    return model

def predict_fn(input_data, model):
    return model.predict_proba(input_data)
Since we are not transforming the input or the output, we do not need to define input_fn and output_fn; the container's default (de)serialization is used instead.
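For completeness, here is a rough sketch of what custom input_fn and output_fn could look like if we did need them, for example to accept and return JSON; this is illustrative and not part of our actual svm.py:

import json

def input_fn(request_body, request_content_type):
    # deserialize the JSON request body into something predict_fn can work with
    if request_content_type == 'application/json':
        return json.loads(request_body)
    raise ValueError(f'Unsupported content type: {request_content_type}')

def output_fn(prediction, response_content_type):
    # serialize the prediction (here: a numpy array of class probabilities) back to JSON
    return json.dumps(prediction.tolist())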
5. Register the Model Version
Finally, register your model version in the SageMaker Model Registry:
model.register(
    model_package_group_name="svm-model",
    content_types=["application/json"],
    response_types=["application/json"],
    inference_instances=["ml.t2.medium"],  # defines what instance types are valid for later endpoint deployment
    approval_status="PendingManualApproval"
)
Hint: If you do not have a model package group yet, you can create one using the AWS CLI:
aws sagemaker create-model-package-group --model-package-group-name svm-model
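To double-check that the registration worked, you can list the versions in the model package group with boto3 (the group name is the one used above; everything else is generic):

import boto3

sm = boto3.client('sagemaker')

# list the model versions registered in the "svm-model" group
response = sm.list_model_packages(ModelPackageGroupName='svm-model')
for package in response['ModelPackageSummaryList']:
    print(package['ModelPackageArn'], package['ModelApprovalStatus'])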
Conclusion
Transitioning from AWS Lambda to SageMaker endpoints for inference starts with registering your existing model artifacts in SageMaker's Model Registry. By following the steps outlined above, you can efficiently create new model versions and set yourself up to leverage SageMaker endpoints for potentially improved performance and cost benefits.
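As an outlook: once a model version has been approved (in the SageMaker console or via the API), it can be deployed to an endpoint straight from the registry. Here is a minimal sketch, assuming the model package ARN below is a placeholder for the version registered in step 5 and the instance type is one of the inference_instances declared there:

from sagemaker import ModelPackage

# reference the registered (and approved) model version by its ARN
mp = ModelPackage(
    role='arn:aws:iam::0123456789:role/service-role/AmazonSageMaker-ExecutionRole',
    model_package_arn='arn:aws:sagemaker:eu-central-1:0123456789:model-package/svm-model/1'
)
mp.deploy(initial_instance_count=1, instance_type='ml.t2.medium')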
With this approach, your existing models become first-class citizens in SageMaker's model management and deployment workflow, without having to retrain them through SageMaker Pipelines.