SageMaker Python SDK is an open source library for training and deploying machine learning models on Amazon SageMaker.
With the SDK, you can train and deploy models using popular deep learning frameworks Apache MXNet and TensorFlow. You can also train and deploy models with Amazon algorithms, which are scalable implementations of core machine learning algorithms that are optimized for SageMaker and GPU training. If you have your own algorithms built into SageMaker compatible Docker containers, you can train and host models using these as well.
For detailed API reference please go to: Read the Docs
- Installing SageMaker Python SDK
- SageMaker Python SDK Overview
- MXNet SageMaker Estimators
- TensorFlow SageMaker Estimators
- Chainer SageMaker Estimators
- PyTorch SageMaker Estimators
- AWS SageMaker Estimators
- BYO Docker Containers with SageMaker Estimators
- SageMaker Automatic Model Tuning
- SageMaker Batch Transform
- BYO Model
The SageMaker Python SDK is published to PyPI and can be installed with pip as follows:
pip install sagemaker
You can install from source by cloning this repository and running a pip install command in the root directory of the repository:
git clone https://github.com/aws/sagemaker-python-sdk.git
cd sagemaker-python-sdk
pip install .
SageMaker Python SDK supports Unix/Linux and Mac.
SageMaker Python SDK is tested on:
- Python 2.7
- Python 3.5
SageMaker Python SDK is licensed under the Apache 2.0 License. It is copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. The license is available at: http://aws.amazon.com/apache2.0/
SageMaker Python SDK has unit tests and integration tests.
Unit tests
tox is a prerequisite for running unit tests so you need to make sure you have it installed. To run the unit tests:
tox tests/unit
Integration tests
To run the integration tests, the following prerequisites must be met:
- Access to an AWS account to run the tests on
- AWS account credentials available to boto3 clients used in the tests
- The AWS account has an IAM role named SageMakerRole
- The libraries listed in the extras_require object in setup.py for test are installed. You can do this by running the following command: pip install --upgrade .[test]
You can run the integration tests by issuing the following command:
pytest tests/integ
You can also filter by individual test function names (usable with any of the previous commands):
pytest -k 'test_i_care_about'
To build the docs, cd into the doc directory and run:
make html
You can edit the templates for any of the pages in the docs by editing the .rst files in the "doc" directory and then running "make html" again.
SageMaker Python SDK provides several high-level abstractions for working with Amazon SageMaker. These are:
- Estimators: Encapsulate training on SageMaker.
- Models: Encapsulate built ML models.
- Predictors: Provide real-time inference and transformation using Python data-types against a SageMaker endpoint.
- Session: Provides a collection of methods for working with SageMaker resources.
Estimator and Model implementations for MXNet, TensorFlow, Chainer, PyTorch, and Amazon ML algorithms are included.
There's also an Estimator that runs SageMaker compatible custom Docker containers, enabling you to run your own ML algorithms by using the SageMaker Python SDK.
The following sections of this document explain how to use the different estimators and models:
- MXNet SageMaker Estimators and Models
- TensorFlow SageMaker Estimators and Models
- Chainer SageMaker Estimators and Models
- PyTorch SageMaker Estimators
- AWS SageMaker Estimators and Models
- Custom SageMaker Estimators and Models
Here is an end-to-end example of how to use a SageMaker Estimator:
from sagemaker.mxnet import MXNet
# Configure an MXNet Estimator (no training happens yet)
mxnet_estimator = MXNet('train.py',
                        role='SageMakerRole',
                        train_instance_type='ml.p2.xlarge',
                        train_instance_count=1)
# Starts a SageMaker training job and waits until completion.
mxnet_estimator.fit('s3://my_bucket/my_training_data/')
# Deploys the model that was generated by fit() to a SageMaker endpoint
mxnet_predictor = mxnet_estimator.deploy(initial_instance_count=1, instance_type='ml.p2.xlarge')
# Serializes data and makes a prediction request to the SageMaker endpoint
response = mxnet_predictor.predict(data)
# Tears down the SageMaker endpoint
mxnet_estimator.delete_endpoint()
The SageMaker Python SDK supports local mode, which allows you to create estimators and deploy them to your local environment. This is a great way to test your deep learning scripts before running them in SageMaker's managed training or hosting environments.
We can take the example in Using Estimators, and use either local or local_gpu as the instance type.
from sagemaker.mxnet import MXNet
# Configure an MXNet Estimator (no training happens yet)
mxnet_estimator = MXNet('train.py',
                        role='SageMakerRole',
                        train_instance_type='local',
                        train_instance_count=1)
# In Local Mode, fit will pull the MXNet container Docker image and run it locally
mxnet_estimator.fit('s3://my_bucket/my_training_data/')
# Alternatively, you can train using data in your local file system. This is only supported in Local mode.
mxnet_estimator.fit('file:///tmp/my_training_data')
# Deploys the model that was generated by fit() to local endpoint in a container
mxnet_predictor = mxnet_estimator.deploy(initial_instance_count=1, instance_type='local')
# Serializes data and makes a prediction request to the local endpoint
response = mxnet_predictor.predict(data)
# Tears down the endpoint container
mxnet_estimator.delete_endpoint()
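The example above passes two kinds of input locations to fit(): an s3:// URI and a file:// URI, the latter being supported only in local mode. The following is a minimal sketch of that rule in plain Python 3. The helper names input_location_kind and check_input are hypothetical and are not part of the SDK; the SDK performs its own validation.

```python
from urllib.parse import urlparse


def input_location_kind(uri):
    """Classify a training input URI as 's3' or 'file' (hypothetical helper)."""
    scheme = urlparse(uri).scheme
    if scheme not in ("s3", "file"):
        raise ValueError("expected an s3:// or file:// URI, got: " + uri)
    return scheme


def check_input(uri, instance_type):
    """Reject file:// inputs unless a local instance type is used."""
    kind = input_location_kind(uri)
    is_local = instance_type in ("local", "local_gpu")
    if kind == "file" and not is_local:
        raise ValueError("file:// inputs are only supported in local mode")
    return kind


print(check_input("s3://my_bucket/my_training_data/", "ml.p2.xlarge"))
print(check_input("file:///tmp/my_training_data", "local"))
```

A check like this can be run before calling fit() to catch a local path that was accidentally combined with a managed instance type.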
If you have an existing model and want to deploy it locally, don't specify a sagemaker_session argument to the MXNetModel constructor.
The correct session is generated when you call model.deploy().
Here is an end-to-end example:
import numpy
from sagemaker.mxnet import MXNetModel
model_location = 's3://mybucket/my_model.tar.gz'
code_location = 's3://mybucket/sourcedir.tar.gz'
s3_model = MXNetModel(model_data=model_location, role='SageMakerRole',
                      entry_point='mnist.py', source_dir=code_location)
predictor = s3_model.deploy(initial_instance_count=1, instance_type='local')
data = numpy.zeros(shape=(1, 1, 28, 28))
predictor.predict(data)
# Tear down the endpoint container
predictor.delete_endpoint()
For detailed examples of running Docker in local mode, see:
A few important notes:
- Only one local mode endpoint can be running at a time.
- If you are using S3 data as input, it is pulled from S3 to your local environment. Ensure you have sufficient space to store the data locally.
- If you run into problems, it is often due to different Docker containers conflicting. Killing these containers and re-running often solves your problems.
- Local Mode requires Docker Compose and nvidia-docker2 for local_gpu.
- Distributed training is not yet supported for local_gpu.
By using MXNet SageMaker Estimators, you can train and host MXNet models on Amazon SageMaker.
Supported versions of MXNet: 1.2.1, 1.1.0, 1.0.0, 0.12.1.
We recommend that you use the latest supported version, because that's where we focus most of our development efforts.
For more information, see MXNet SageMaker Estimators and Models.
By using TensorFlow SageMaker Estimators, you can train and host TensorFlow models on Amazon SageMaker.
Supported versions of TensorFlow: 1.4.1, 1.5.0, 1.6.0, 1.7.0, 1.8.0, 1.9.0, 1.10.0.
We recommend that you use the latest supported version, because that's where we focus most of our development efforts.
For more information, see TensorFlow SageMaker Estimators and Models.
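When picking the latest supported version programmatically, compare version numbers numerically rather than as strings, because 1.10.0 sorts before 1.9.0 in a plain string comparison. A minimal sketch using the TensorFlow versions listed above; the version_key helper is hypothetical and is not part of the SDK.

```python
# Supported TensorFlow versions as listed in this document.
supported_tensorflow = ["1.4.1", "1.5.0", "1.6.0", "1.7.0", "1.8.0", "1.9.0", "1.10.0"]


def version_key(version):
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


latest = max(supported_tensorflow, key=version_key)
naive = max(supported_tensorflow)  # string comparison, shown for contrast

print(latest)  # 1.10.0
print(naive)   # 1.9.0
```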
By using Chainer SageMaker Estimators, you can train and host Chainer models on Amazon SageMaker.
Supported versions of Chainer: 4.0.0, 4.1.0.
We recommend that you use the latest supported version, because that's where we focus most of our development efforts.
For more information about Chainer, see https://github.com/chainer/chainer.
For more information about Chainer SageMaker Estimators, see Chainer SageMaker Estimators and Models.
With PyTorch SageMaker Estimators, you can train and host PyTorch models on Amazon SageMaker.
Supported versions of PyTorch: 0.4.0.
We recommend that you use the latest supported version, because that's where we focus most of our development efforts.
For more information about PyTorch, see https://github.com/pytorch/pytorch.
For more information about PyTorch SageMaker Estimators, see PyTorch SageMaker Estimators and Models.
Amazon SageMaker provides several built-in machine learning algorithms that you can use to solve a variety of problems.
The full list of algorithms is available at: https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html
The SageMaker Python SDK includes estimator wrappers for the AWS K-means, Principal Components Analysis (PCA), Linear Learner, Factorization Machines, Latent Dirichlet Allocation (LDA), Neural Topic Model (NTM), Random Cut Forest, and k-nearest neighbors (k-NN) algorithms.
For more information, see AWS SageMaker Estimators and Models.
To train with a Docker image that you created by using the SageMaker SDK, the easiest way is to use the dedicated Estimator class.
You can create an instance of the Estimator class with the desired Docker image and use it as described in previous sections.
Please refer to the full example in the examples repo:
git clone https://github.com/awslabs/amazon-sagemaker-examples.git
The example notebook is located here:
advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb
All of the estimators can be used with SageMaker Automatic Model Tuning, which performs hyperparameter tuning jobs. A hyperparameter tuning job finds the best version of a model by running many training jobs on your dataset using the algorithm with different values of hyperparameters within ranges that you specify. It then chooses the hyperparameter values that result in a model that performs the best, as measured by a metric that you choose. If you're not using an Amazon SageMaker built-in algorithm, then the metric is defined by a regular expression (regex) you provide. The hyperparameter tuning job parses the training job's logs to find metrics that match the regex you defined. For more information about SageMaker Automatic Model Tuning, see AWS documentation.
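To illustrate how a metric regex picks values out of training logs, here is a minimal sketch using only Python's re module. The log lines are hypothetical; real log output depends on your training script, and the tuning service does its own parsing. The pattern has one capture group that holds the numeric value.

```python
import re

# Hypothetical training-log lines; real logs depend on your training script.
log_lines = [
    "Epoch[0] train-accuracy=0.812345",
    "Epoch[0] validation-accuracy=0.790100",
    "Epoch[1] validation-accuracy=0.845600",
]

# One capture group holds the numeric metric value.
metric_regex = re.compile(r"validation-accuracy=(\d\.\d+)")

values = []
for line in log_lines:
    match = metric_regex.search(line)
    if match:
        values.append(float(match.group(1)))

print(values)
```

Testing a regex locally against a few sample log lines like this is a cheap way to confirm it captures the metric before starting a tuning job.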
The SageMaker Python SDK contains a HyperparameterTuner class for creating and interacting with hyperparameter tuning jobs.
Here is a basic example of how to use it:
from sagemaker.tuner import HyperparameterTuner, ContinuousParameter
# Configure HyperparameterTuner
my_tuner = HyperparameterTuner(estimator=my_estimator,  # previously-configured Estimator object
                               objective_metric_name='validation-accuracy',
                               hyperparameter_ranges={'learning-rate': ContinuousParameter(0.05, 0.06)},
                               metric_definitions=[{'Name': 'validation-accuracy', 'Regex': r'validation-accuracy=(\d\.\d+)'}],
                               max_jobs=100,
                               max_parallel_jobs=10)
# Start hyperparameter tuning job
my_tuner.fit({'train': 's3://my_bucket/my_training_data', 'test': 's3://my_bucket/my_testing_data'})
# Deploy best model
my_predictor = my_tuner.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')
# Make a prediction against the SageMaker endpoint
response = my_predictor.predict(my_prediction_data)
# Tear down the SageMaker endpoint
my_tuner.delete_endpoint()
This example shows a hyperparameter tuning job that creates up to 100 training jobs, running up to 10 training jobs at a time. Each training job's learning rate is a value between 0.05 and 0.06, but this value will differ between training jobs. You can read more about how these values are chosen in the AWS documentation.
A hyperparameter range can be one of three types: continuous, integer, or categorical. The SageMaker Python SDK provides corresponding classes for defining these different types. You can define up to 20 hyperparameters to search over, but each value of a categorical hyperparameter range counts against that limit.
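The counting rule above can be sketched in plain Python. The search_space structure and the slots_used helper are hypothetical and independent of the SDK's own range classes; they only illustrate that a continuous or integer range counts once, while a categorical range counts once per value.

```python
# Hypothetical, SDK-independent description of a search space:
# continuous/integer ranges are (min, max) tuples, categorical ranges are lists.
search_space = {
    "learning-rate": ("continuous", (0.05, 0.06)),
    "num-layers": ("integer", (2, 8)),
    "optimizer": ("categorical", ["sgd", "adam", "rmsprop"]),
}

MAX_SEARCHABLE = 20  # limit stated in the text above


def slots_used(space):
    """Count how much of the limit a search space consumes."""
    total = 0
    for kind, spec in space.values():
        if kind == "categorical":
            total += len(spec)  # each categorical value counts separately
        else:
            total += 1  # a continuous or integer range counts once
    return total


used = slots_used(search_space)
print(used, used <= MAX_SEARCHABLE)
```

Here the three-value categorical range consumes three of the 20 available slots, so the whole space uses five.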
If you are using an Amazon SageMaker built-in algorithm, you don't need to pass in anything for metric_definitions.
In addition, the fit() call uses a list of RecordSet objects instead of a dictionary:
# Create RecordSet object for each data channel
train_records = RecordSet(...)
test_records = RecordSet(...)
# Start hyperparameter tuning job
my_tuner.fit([train_records, test_records])
To help attach a previously-started hyperparameter tuning job to a HyperparameterTuner instance, fit() adds the module path of the class used to create the tuner to the list of static hyperparameters by default.
If the algorithm you are using cannot handle unknown hyperparameters (for example, an Amazon SageMaker built-in algorithm that does not have a custom estimator in the Python SDK), set include_cls_metadata to False when you call fit, so that it does not add the module path as a static hyperparameter:
my_tuner.fit({'train': 's3://my_bucket/my_training_data', 'test': 's3://my_bucket/my_testing_data'},
             include_cls_metadata=False)
There is also an analytics object associated with each HyperparameterTuner instance that contains useful information about the hyperparameter tuning job.
For example, the dataframe method gets a pandas dataframe summarizing the associated training jobs:
# Retrieve analytics object
my_tuner_analytics = my_tuner.analytics()
# Look at summary of associated training jobs
my_dataframe = my_tuner_analytics.dataframe()
For more detailed examples of running hyperparameter tuning jobs, see:
- Using the TensorFlow estimator with hyperparameter tuning
- Bringing your own estimator for hyperparameter tuning
- Analyzing results
For more detailed explanations of the classes that this library provides for automatic model tuning, see:
After you train a model, you can use Amazon SageMaker Batch Transform to perform inferences with the model. Batch Transform manages all necessary compute resources, including launching instances to deploy endpoints and deleting them afterward. You can read more about SageMaker Batch Transform in the AWS documentation.
If you trained the model using a SageMaker Python SDK estimator, you can invoke the estimator's transformer() method to create a transform job for a model based on the training job:
transformer = estimator.transformer(instance_count=1, instance_type='ml.m4.xlarge')
Alternatively, if you already have a SageMaker model, you can create an instance of the Transformer class by calling its constructor:
from sagemaker.transformer import Transformer
transformer = Transformer(model_name='my-previously-trained-model',
                          instance_count=1,
                          instance_type='ml.m4.xlarge')
For a full list of the possible options to configure by using either of these methods, see the API docs for Estimator or Transformer.
After you create a Transformer object, you can invoke transform() to start a batch transform job with the S3 location of your data.
You can also specify other attributes of your data, such as the content type.
transformer.transform('s3://my-bucket/batch-transform-input')
For more details about what can be specified here, see API docs.
Upload the data to S3 before training. You can use the AWS Command Line Interface (AWS CLI) to achieve this.
If you don't have the AWS CLI, you can install it using pip:
pip install awscli --upgrade --user
If you don't have pip or want to learn more about installing the AWS CLI, see the official Amazon AWS CLI installation guide.
After you install the AWS CLI, you can upload a directory of files to S3 with the following command:
aws s3 cp /tmp/foo/ s3://bucket/path --recursive
For more information about using the AWS CLI for manipulating S3 resources, see the AWS CLI command reference.
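Uploading a directory copies every file under it and keeps each file's relative path as the suffix of its S3 key. The following sketch previews that mapping locally without contacting AWS. The planned_uploads helper is hypothetical, and the demo directory and file names are placeholders created only for illustration.

```python
import os
import posixpath
import tempfile


def planned_uploads(local_dir, s3_prefix):
    """Map every file under local_dir to the S3 URI a directory upload would target."""
    plan = {}
    for root, _dirs, files in os.walk(local_dir):
        for name in files:
            local_path = os.path.join(root, name)
            relative = os.path.relpath(local_path, local_dir)
            key_suffix = relative.replace(os.sep, "/")  # S3 keys use forward slashes
            plan[local_path] = posixpath.join(s3_prefix.rstrip("/"), key_suffix)
    return plan


# Build a small placeholder directory tree to demonstrate the mapping.
demo_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(demo_dir, "sub"))
for rel in ("a.csv", os.path.join("sub", "b.csv")):
    with open(os.path.join(demo_dir, rel), "w") as f:
        f.write("x")

targets = sorted(planned_uploads(demo_dir, "s3://bucket/path").values())
print(targets)
```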
Create a Predictor object and provide it with your endpoint name, then call its predict() method with your input.
You can use either the generic RealTimePredictor class, which by default does not perform any serialization/deserialization transformations on your input, but can be configured to do so through constructor arguments:
http://sagemaker.readthedocs.io/en/latest/predictors.html
Or you can use the TensorFlow / MXNet specific predictor classes, which have default serialization/deserialization logic:
http://sagemaker.readthedocs.io/en/latest/sagemaker.tensorflow.html#tensorflow-predictor
http://sagemaker.readthedocs.io/en/latest/sagemaker.mxnet.html#mxnet-predictor
Example code using the TensorFlow predictor:
from sagemaker.tensorflow import TensorFlowPredictor
predictor = TensorFlowPredictor('myexistingendpoint')
result = predictor.predict(['my request body'])
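For the generic RealTimePredictor, the serialization logic is something you supply yourself. Below is a minimal sketch of a JSON serializer and deserializer pair written with the standard library only. The function names are hypothetical, and the assumption that the deserializer receives a file-like response body plus a content type, as well as the exact constructor arguments that accept these callables, should be checked against the predictors API docs linked above.

```python
import io
import json


def json_serializer(data):
    """Turn a Python object into a JSON request body."""
    return json.dumps(data)


def json_deserializer(stream, content_type):
    """Parse a JSON response body read from a file-like stream.

    content_type is accepted but unused here; it is assumed to be passed
    alongside the response body.
    """
    try:
        return json.loads(stream.read().decode("utf-8"))
    finally:
        stream.close()


# Local round trip with an in-memory stream standing in for an endpoint response.
request_body = json_serializer({"instances": [[1.0, 2.0]]})
response = json_deserializer(io.BytesIO(request_body.encode("utf-8")), "application/json")
print(response)
```

Running the round trip locally, as done here, verifies the pair before it is attached to a live endpoint.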
You can also create an endpoint from an existing model rather than training one. That is, you can bring your own model:
First, package the files for the trained model into a .tar.gz file, and upload the archive to S3.
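The packaging step can be sketched with Python's tarfile module. The file names below are placeholders standing in for real model artifacts, and placing the files at the root of the archive is an assumption; check what layout your framework's container expects.

```python
import os
import tarfile
import tempfile

# Hypothetical model directory with placeholder files standing in for real artifacts.
model_dir = tempfile.mkdtemp()
for name in ("model-symbol.json", "model-0000.params"):
    with open(os.path.join(model_dir, name), "w") as f:
        f.write("placeholder")

archive_path = os.path.join(tempfile.mkdtemp(), "model.tar.gz")

# Add each file at the archive root (arcname drops the local directory prefix).
with tarfile.open(archive_path, "w:gz") as tar:
    for name in sorted(os.listdir(model_dir)):
        tar.add(os.path.join(model_dir, name), arcname=name)

# Re-open the archive to confirm its contents.
with tarfile.open(archive_path, "r:gz") as tar:
    members = tar.getnames()
print(members)
```

The resulting model.tar.gz can then be uploaded to S3, for example with the aws s3 cp command described earlier.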
Next, create a Model object that corresponds to the framework that you are using: MXNetModel or TensorFlowModel.
Example code using MXNetModel:
from sagemaker.mxnet.model import MXNetModel
sagemaker_model = MXNetModel(model_data='s3://path/to/model.tar.gz',
                             role='arn:aws:iam::accid:sagemaker-role',
                             entry_point='entry_point.py')
After that, invoke the deploy() method on the Model:
predictor = sagemaker_model.deploy(initial_instance_count=1,
                                   instance_type='ml.m4.xlarge')
This returns a predictor the same way an Estimator does when deploy() is called. You can now get inferences just like with any other model deployed on Amazon SageMaker.
A full example is available in the Amazon SageMaker examples repository.