Add a module for Data Quality Monitoring Job for a SageMaker Endpoint.
Patrick Cloke committed Jun 11, 2024
1 parent 37ec764 commit 2bd0ae7
Showing 18 changed files with 887 additions and 0 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -11,6 +11,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### **Added**

- Added a `sagemaker-model-monitoring-module` module with an example of data quality monitoring of a SageMaker Endpoint.
- Added an option to enable data capture in the `sagemaker-endpoint-module`.

### **Changed**
1 change: 1 addition & 0 deletions README.md
@@ -32,6 +32,7 @@ See deployment steps in the [Deployment Guide](DEPLOYMENT.md).
| [SageMaker Custom Kernel Module](modules/sagemaker/sagemaker-custom-kernel/README.md) | Builds custom kernel for SageMaker Studio from a Dockerfile |
| [SageMaker Model Package Group Module](modules/sagemaker/sagemaker-model-package-group/README.md) | Creates a SageMaker Model Package Group to register and version SageMaker Machine Learning (ML) models and sets up an Amazon EventBridge Rule to send model package group state change events to an Amazon EventBridge Bus |
| [SageMaker Model Package Promote Pipeline Module](modules/sagemaker/sagemaker-model-package-promote-pipeline/README.md) | Deploy a Pipeline to promote SageMaker Model Packages in a multi-account setup. The pipeline can be triggered through an EventBridge rule in reaction to a SageMaker Model Package Group state event change (Approved/Rejected). Once the pipeline is triggered, it will promote the latest approved model package, if one is found. |
| [SageMaker Model Monitoring Module](modules/sagemaker/sagemaker-model-monitoring-module/README.md) | Deploy a data quality monitoring job which runs against a SageMaker Endpoint. |

### Mlflow Modules

95 changes: 95 additions & 0 deletions modules/sagemaker/sagemaker-model-monitoring/README.md
@@ -0,0 +1,95 @@
# SageMaker Model Monitoring

## Description

This module creates a SageMaker Model Monitoring job for data quality.
It requires a deployed model endpoint and the output of the matching
check step for each monitoring job:

* Data Quality: [QualityCheck step](https://docs.aws.amazon.com/sagemaker/latest/dg/build-and-manage-steps.html#step-type-quality-check)

### Architecture

![SageMaker Model Monitoring Module Architecture](docs/_static/sagemaker-model-monitoring-module-architecture.png "SageMaker Model Monitoring Module Architecture")

## Inputs/Outputs

### Input Parameters

#### Required

- `endpoint-name`: The name of the endpoint used to run the monitoring job.
- `security-group-id`: The VPC security group ID for the monitoring job. It should provide access to the subnets given in `subnet-ids`.
- `subnet-ids`: The IDs of the subnets in the VPC to which the monitoring job connects.
- `model-package-arn`: Model package ARN
- `model-bucket-arn`: S3 bucket ARN for model artifacts
- `kms-key-id`: The KMS key used to encrypt storage and output.
- `data-quality-checkstep-output-prefix`: The S3 prefix in `model-bucket-arn` which contains the output from the corresponding [Check step in the SageMaker Pipeline](https://docs.aws.amazon.com/sagemaker/latest/dg/build-and-manage-steps.html#build-and-manage-steps-types).
- `data-quality-output-prefix`: The S3 prefix in `model-bucket-arn` which will contain the output of the monitoring job.
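
The two prefix parameters are relative to the bucket identified by the bucket ARN. As a rough sketch of how such values could be combined into S3 URIs: the helpers `bucket_name_from_arn` and `s3_uri` are hypothetical and not part of this module, and the file name `statistics.json` is assumed here because it is the usual name of a SageMaker baseline statistics file.

```python
def bucket_name_from_arn(bucket_arn: str) -> str:
    """Extract the bucket name from an S3 bucket ARN such as arn:aws:s3:::my-bucket."""
    parts = bucket_arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "s3" or not parts[5]:
        raise ValueError(f"Not an S3 bucket ARN: {bucket_arn!r}")
    return parts[5]


def s3_uri(bucket_arn: str, key_prefix: str, file_name: str = "") -> str:
    """Join a bucket ARN, a key prefix, and an optional file name into an s3:// URI."""
    parts = [bucket_name_from_arn(bucket_arn), key_prefix.strip("/")]
    if file_name:
        parts.append(file_name)
    return "s3://" + "/".join(parts)


# Hypothetical values, mirroring the shape of the sample manifest.
baseline_statistics = s3_uri(
    "arn:aws:s3:::my-bucket",
    "model-training-run-1234/dataqualitycheckstep",
    "statistics.json",  # assumed file name, see the note above
)
print(baseline_statistics)
```

Validating the ARN up front gives a clearer error than a failed S3 lookup at deployment time.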


#### Optional

- `sagemaker-project-id`: SageMaker project id
- `sagemaker-project-name`: SageMaker project name
- `data-quality-instance-count`: The number of ML compute instances to use in the model monitoring job.
- `data-quality-instance-type`: The ML compute instance type for the processing job.
- `data-quality-instance-volume-size-in-gb`: The size of the ML storage volume, in gigabytes, that you want to provision.
- `data-quality-max-runtime-in-seconds`: The maximum length of time, in seconds, the monitoring job can run before it is stopped.
- `data-quality-schedule-expression`: A cron expression that describes details about the monitoring schedule.
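
A value for `data-quality-schedule-expression` could be built as in the sketch below. The helper functions are hypothetical and not part of this module; the expression forms follow those described in the SageMaker Model Monitor documentation (hourly, daily at a given UTC hour, or every N hours from a start hour) and should be checked against the current documentation before use.

```python
def hourly() -> str:
    """Run at the top of every hour."""
    return "cron(0 * ? * * *)"


def daily(hour_utc: int) -> str:
    """Run once a day at the given UTC hour (0-23)."""
    if not 0 <= hour_utc <= 23:
        raise ValueError("hour_utc must be between 0 and 23")
    return f"cron(0 {hour_utc} ? * * *)"


def every_n_hours(start_hour_utc: int, interval_hours: int) -> str:
    """Run every interval_hours hours, starting at start_hour_utc (UTC)."""
    if not 0 <= start_hour_utc <= 23:
        raise ValueError("start_hour_utc must be between 0 and 23")
    if not 1 <= interval_hours <= 24:
        raise ValueError("interval_hours must be between 1 and 24")
    return f"cron(0 {start_hour_utc}/{interval_hours} ? * * *)"


print(hourly())
print(daily(12))
print(every_n_hours(17, 6))
```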

### Sample manifest declaration

```yaml
name: monitoring
path: modules/sagemaker/sagemaker-model-monitoring
parameters:
  - name: sagemaker_project_id
    value: dummy123
  - name: sagemaker_project_name
    value: dummy123
  - name: model_package_arn
    value: arn:aws:sagemaker:<region>:<account>:model-package/<package_name>/1
  - name: model_bucket_arn
    value: arn:aws:s3:::<bucket name>
  - name: data-quality-checkstep-output-prefix
    value: model-training-run-1234/dataqualitycheckstep
  - name: data-quality-output-prefix
    value: model-training-run-1234/monitor/dataqualityoutput
  - name: endpoint_name
    valueFrom:
      moduleMetadata:
        group: endpoints
        name: endpoint
        key: EndpointName
  - name: security_group_id
    valueFrom:
      moduleMetadata:
        group: endpoints
        name: endpoint
        key: SecurityGroupId
  - name: kms_key_id
    valueFrom:
      moduleMetadata:
        group: endpoints
        name: endpoint
        key: KmsKeyId
  - name: subnet_ids
    valueFrom:
      moduleMetadata:
        group: networking
        name: networking
        key: PrivateSubnetIds
```
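
With generic environment variables enabled, Seed-Farmer is understood to expose each manifest parameter to the deployment as an environment variable prefixed with `SEEDFARMER_PARAMETER_`, with the parameter name converted to upper snake case. The sketch below illustrates that mapping. The helper `parameter_env_var` is hypothetical and the exact conversion rules are an assumption, so the Seed-Farmer documentation remains the authoritative source.

```python
import re


def parameter_env_var(name: str, prefix: str = "SEEDFARMER_PARAMETER_") -> str:
    """Approximate the environment variable name for a manifest parameter (assumed rules)."""
    # Insert an underscore at camelCase boundaries, e.g. kmsKeyId -> kms_Key_Id.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name)
    # Collapse hyphens and other separators into underscores, then upper-case.
    return prefix + re.sub(r"[^A-Za-z0-9]+", "_", spaced).upper()


for parameter in ("endpoint_name", "data-quality-output-prefix", "kmsKeyId"):
    print(parameter, "->", parameter_env_var(parameter))
```

Under this assumption, hyphenated and underscored spellings of the same parameter resolve to the same variable name.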

### Module Metadata Outputs

- `ModelExecutionRoleArn`: SageMaker Model Execution IAM role ARN
- `ModelName`: SageMaker Model name
- `ModelPackageArn`: SageMaker Model package ARN
- `EndpointName`: SageMaker Endpoint name
- `EndpointUrl`: SageMaker Endpoint URL

#### Output Example

N/A
30 changes: 30 additions & 0 deletions modules/sagemaker/sagemaker-model-monitoring/app.py
@@ -0,0 +1,30 @@
# Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
# SPDX-License-Identifier: Apache-2.0

import os

import aws_cdk
import cdk_nag

from sagemaker_model_monitoring.settings import ApplicationSettings
from sagemaker_model_monitoring.stack import SageMakerModelMonitoringStack

# Load application settings from env vars.
app_settings = ApplicationSettings()

environment = aws_cdk.Environment(
    account=os.environ["CDK_DEFAULT_ACCOUNT"],
    region=os.environ["CDK_DEFAULT_REGION"],
)

app = aws_cdk.App()
stack = SageMakerModelMonitoringStack(
    scope=app,
    id=app_settings.settings.app_prefix,
    env=environment,
    **app_settings.parameters.model_dump(),
)

aws_cdk.Aspects.of(app).add(cdk_nag.AwsSolutionsChecks(log_ignores=False))

app.synth()
25 changes: 25 additions & 0 deletions modules/sagemaker/sagemaker-model-monitoring/deployspec.yaml
@@ -0,0 +1,25 @@
publishGenericEnvVariables: true
deploy:
  phases:
    install:
      commands:
        - env
        # Install whatever additional build libraries
        - npm install -g [email protected]
        - pip install -r requirements.txt
    build:
      commands:
        - cdk deploy --require-approval never --progress events --app "python app.py" --outputs-file ./cdk-exports.json
        # Export metadata
        - seedfarmer metadata convert -f cdk-exports.json || true
destroy:
  phases:
    install:
      commands:
        # Install whatever additional build libraries
        - npm install -g [email protected]
        - pip install -r requirements.txt
    build:
      commands:
        # execute the CDK
        - cdk destroy --force --app "python app.py"
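
The `--outputs-file` option makes `cdk deploy` write the stack outputs as JSON keyed by stack name, which is the file the metadata export step reads. A minimal sketch for inspecting such a file locally is shown below; the function `flatten_cdk_outputs` and the sample stack and output names are hypothetical, and this is not how `seedfarmer metadata convert` itself is implemented.

```python
import json
from typing import Dict


def flatten_cdk_outputs(raw: str) -> Dict[str, str]:
    """Flatten {"<stack>": {"<output>": "<value>"}} into {"<stack>.<output>": "<value>"}."""
    exports = json.loads(raw)
    return {
        f"{stack}.{key}": value
        for stack, outputs in exports.items()
        for key, value in outputs.items()
    }


# Hypothetical file content; real stack and output names will differ.
sample = '{"example-stack": {"metadata": "{\\"Key\\": \\"value\\"}"}}'
for name, value in flatten_cdk_outputs(sample).items():
    print(name, "=", value)
```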
@@ -0,0 +1 @@
<mxfile modified="2024-06-10T19:00:42.984Z" host="design-inspector.a2z.com" agent="Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:126.0) Gecko/20100101 Firefox/126.0" etag="ymXan4Qe_-5LTGC5bwpJ" version="10.1.8" type="device"><diagram id="Slb-7FiMrRCHa78ZGk73X" name="Page-1"></diagram></mxfile>
41 changes: 41 additions & 0 deletions modules/sagemaker/sagemaker-model-monitoring/pyproject.toml
@@ -0,0 +1,41 @@
[tool.ruff]
exclude = [
    ".eggs",
    ".git",
    ".hg",
    ".mypy_cache",
    ".ruff_cache",
    ".tox",
    ".venv",
    "_build",
    "buck-out",
    "build",
    "dist",
    "codeseeder",
]
line-length = 120
target-version = "py38"

[tool.ruff.lint]
select = ["F", "I", "E", "W"]
fixable = ["ALL"]

[tool.mypy]
python_version = "3.8"
strict = true
ignore_missing_imports = true
disallow_untyped_decorators = false
exclude = "codeseeder.out/|example/|tests/"
warn_unused_ignores = false

[tool.pytest.ini_options]
addopts = "-v --cov=. --cov-report term"
pythonpath = [
    "."
]

[tool.coverage.run]
omit = ["tests/*"]

[tool.coverage.report]
fail_under = 80
12 changes: 12 additions & 0 deletions modules/sagemaker/sagemaker-model-monitoring/requirements-dev.in
@@ -0,0 +1,12 @@
awscli
cdk-nag
cfn-lint
check-manifest
mypy
pip-tools
pydot
pyroma
pytest
ruff
types-PyYAML
types-setuptools