Add SageMaker Studio support (#32)
* Init studio support

Fix build

Fix assert syntax error

Update README

Init IAM prune

Reformat

Tighten permissions and bind with unique prefix

Update build

Update README

Add pyproject.toml

Cleanup

Pin black version in build and cleanup

Improve onboarding and various cleanups

Update README for SageMaker Studio

Update project prefix description

Find the custom resource stack deletion

Remove unnecessary return None

Fix data capture URI and remove canary

Remove canary.js

Update canary deployment descriptions

* Address comments

Update docs

Remove synthetics window from dashboard

Tag pipeline with sagemaker project id and format

Fix retrain rule

Include necessary permissions to run workflow.ipynb

Update README

Setup pre-commit to lint and add default kernel

Fix trailing newline

Update README with one-click button

Cleanup

Update project tree structure in README

Remove dev artifacts from build

Update README
ehsanmok authored May 27, 2021
1 parent 7b8654d commit 61fc76f
Showing 34 changed files with 665 additions and 3,517 deletions.
4 changes: 4 additions & 0 deletions .gitignore
@@ -0,0 +1,4 @@
.vscode
build
__pycache__
*.ipynb_checkpoints
16 changes: 16 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,16 @@
default_language_version:
python: python3.7

repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v2.3.0
hooks:
- id: trailing-whitespace
- repo: local
hooks:
- id: lint
name: lint
always_run: true
entry: scripts/lint.sh
language: system
types: [python]
1 change: 0 additions & 1 deletion LICENSE
@@ -12,4 +12,3 @@ FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER
IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

94 changes: 48 additions & 46 deletions README.md
@@ -1,5 +1,4 @@
# Amazon SageMaker Safe Deployment Pipeline

## Introduction

This is a sample solution to build a safe deployment pipeline for Amazon SageMaker. This example could be useful for any organization looking to operationalize machine learning with native AWS development tools such as AWS CodePipeline, AWS CodeBuild and AWS CodeDeploy.
@@ -32,64 +31,64 @@ In the following diagram, you can view the continuous delivery stages of AWS CodePipeline

The following is the list of steps required to get up and running with this sample.

### Prepare an AWS Account
### Requirements

* Create your AWS account at [http://aws.amazon.com](http://aws.amazon.com) by following the instructions on the site.
* A SageMaker Studio user profile; see [Onboard to Amazon SageMaker Studio](https://docs.aws.amazon.com/sagemaker/latest/dg/gs-studio-onboard.html).
### Enable Amazon SageMaker Studio Project

Create your AWS account at [http://aws.amazon.com](http://aws.amazon.com) by following the instructions on the site.
1. From the AWS console, navigate to Amazon SageMaker Studio, click on your Studio user name (do **not** open Studio yet), and copy the name of the execution role as shown below (similar to `AmazonSageMaker-ExecutionRole-20210112T085906`)

### *Optionally* fork this GitHub Repository and create an Access Token

1. [Fork](https://github.com/aws-samples/sagemaker-safe-deployment-pipeline/fork) a copy of this repository into your own GitHub account by clicking the **Fork** button in the upper right-hand corner.
2. Follow the steps in the [GitHub documentation](https://help.github.com/en/github/authenticating-to-github/creating-a-personal-access-token-for-the-command-line) to create a new (OAuth 2) token with the following scopes (permissions): `admin:repo_hook` and `repo`. If you already have a token with these permissions, you can use that. You can find a list of all your personal access tokens at [https://github.com/settings/tokens](https://github.com/settings/tokens).
3. Copy the access token to your clipboard. For security reasons, after you navigate off the page, you will not be able to see the token again. If you have lost your token, you can [regenerate](https://docs.aws.amazon.com/codepipeline/latest/userguide/GitHub-authentication.html#GitHub-rotate-personal-token-CLI) your token.
<p align="center">
<img src="docs/studio-execution-role.png" alt="role" width="800" height="400"/>
</p>

### Launch the AWS CloudFormation Stack
2. Click on the launch button below to set up the stack

Click on the **Launch Stack** button below to launch the CloudFormation Stack to set up the SageMaker safe deployment pipeline.
<p align="center">
<a href="https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/quickcreate?templateUrl=https%3A%2F%2Famazon-sagemaker-safe-deployment-pipeline.s3.amazonaws.com%2Fstudio.yml&stackName=mlops-studio&param_PipelineBucket=amazon-sagemaker-safe-deployment-pipeline"><img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png" width="250" height="50"></a>
</p>

[![Launch CFN stack](https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png)](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/quickcreate?templateUrl=https%3A%2F%2Famazon-sagemaker-safe-deployment-pipeline.s3.amazonaws.com%2Fsfn%2Fpipeline.yml&stackName=nyctaxi&param_GitHubBranch=master&param_GitHubRepo=amazon-sagemaker-safe-deployment-pipeline&param_GitHubUser=aws-samples&param_ModelName=nyctaxi&param_NotebookInstanceType=ml.t3.medium)
Then paste the role name copied in step 1 as the value of the `SageMakerStudioRoleName` parameter, as shown below, and click **Create Stack**

Provide a stack name, e.g. **sagemaker-safe-deployment-pipeline**, and specify the parameters.
<p align="center">
<img src="docs/studio-cft.png" alt="role" width="400" height="600"/>
</p>

*Alternatively*, you can use the provided `scripts/build.sh` (which requires the [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html) to be installed with appropriate IAM permissions) as follows:
```
# bash scripts/build.sh S3_BUCKET_NAME STACK_NAME REGION STUDIO_ROLE_NAME
# REGION should match your default AWS CLI region
# STUDIO_ROLE_NAME is copied from step 1. Example:
bash scripts/build.sh example-studio example-pipeline us-east-1 AmazonSageMaker-ExecutionRole-20210112T085906
```

Parameters | Description
----------- | -----------
Model Name | A unique name for this model (must be less than 15 characters long).
S3 Bucket for Dataset | The bucket containing the dataset (defaults to [nyc-tlc](https://registry.opendata.aws/nyc-tlc-trip-records-pds/))
Notebook Instance Type | The [Amazon SageMaker instance type](https://aws.amazon.com/sagemaker/pricing/instance-types/). Default is ml.t3.medium.
GitHub Repository | The name (not URL) of the GitHub repository to pull from.
GitHub Branch | The name (not URL) of the GitHub repository’s branch to use.
GitHub Username | GitHub Username for this repository. Update this if you have forked the repository.
GitHub Access Token | The optional Secret OAuthToken with access to your GitHub repository.
Email Address | The optional Email address to notify on successful or failed deployments.
3. From the AWS console, navigate to CloudFormation and wait until the stack `STACK_NAME` is ready
4. Go to SageMaker Studio and **Open Studio** (refresh your browser if you are already in Studio) and, from the left-hand side panel, click on the inverted triangle. As shown in the screenshot below, under `Projects -> Create project -> Organization templates`, you should see the newly added **SageMaker Safe Deployment Pipeline** template. Click on the template name and then **Select project template**

![code-pipeline](docs/stack-parameters.png)
<p align="center">
<img src="docs/studio-sagemaker-project-template.png" alt="role" width="800" height="400"/>
</p>

You can launch the same stack using the AWS CLI. Here's an example:
5. Choose a name for the project, leave the rest of the fields at their default values (you can use your own email for SNS notifications), and click **Create project**
6. Once the project is created, you are given the option to clone it locally from AWS CodeCommit with a single click. Click **Clone** to go directly to the project
7. Navigate to the code base and go to `notebook/mlops.ipynb`
8. Choose a kernel from the prompt such as `Python 3 (Data Science)`
9. Assign your project name to the placeholder `PROJECT_NAME` in the first code cell of the `mlops.ipynb` notebook (see the sketch after this list)
10. Now you are ready to go through the rest of the cells in `notebook/mlops.ipynb`
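
A minimal sketch of what that first cell might look like (the variable names and the `describe_project` lookup are illustrative assumptions, not necessarily the notebook's exact contents):

```python
import boto3

# Hypothetical first-cell setup: replace the placeholder with the project
# name you chose in step 5.
PROJECT_NAME = "my-safe-deployment-project"

# Optionally look up the project id that SageMaker Projects assigned at
# creation time; the pipeline resources are prefixed/tagged with it.
sm_client = boto3.client("sagemaker")
project = sm_client.describe_project(ProjectName=PROJECT_NAME)
PROJECT_ID = project["ProjectId"]
print(f"Project {PROJECT_NAME} has id {PROJECT_ID}")
```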

```
aws cloudformation create-stack --stack-name sagemaker-safe-deployment \
  --template-body file://pipeline.yml \
  --capabilities CAPABILITY_IAM \
  --parameters \
    ParameterKey=ModelName,ParameterValue=mymodelname \
    ParameterKey=GitHubUser,ParameterValue=[email protected] \
    ParameterKey=GitHubToken,ParameterValue=YOURGITHUBTOKEN12345ab1234234
```

### Start, Test and Approve the Deployment

Once the deployment is complete, there will be a new AWS CodePipeline created, with a Source stage that is linked to your source code repository. Initially it will be in a *Failed* state, as it is waiting on an S3 data source.

![code-pipeline](docs/data-source-before.png)

Launch the newly created SageMaker Notebook in your [AWS console](https://aws.amazon.com/getting-started/hands-on/build-train-deploy-machine-learning-model-sagemaker/), navigate to the `notebook` directory and open the notebook by clicking the `mlops.ipynb` link.

![code-pipeline](docs/sagemaker-notebook.png)

Once the notebook is running, you will be guided through a series of steps, starting with downloading the [New York City Taxi](https://registry.opendata.aws/nyc-tlc-trip-records-pds/) dataset and uploading it to an Amazon SageMaker S3 bucket along with the data source metadata to trigger a new build in AWS CodePipeline (sketched below).

![code-pipeline](docs/datasource-after.png)
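
The notebook automates this trigger, but conceptually it amounts to copying the dataset and a small data-source manifest into the pipeline's S3 location, whose object-level S3/CloudTrail event is what the source stage is waiting for. A rough sketch with illustrative bucket, prefix and key names (the notebook derives the real ones from the stack outputs):

```python
import json
import boto3

s3 = boto3.client("s3")

# Illustrative names only -- the notebook derives the real bucket and prefix
# from the CloudFormation stack outputs and the model name.
bucket = "sagemaker-us-east-1-123456789012"
prefix = "nyctaxi"

# Upload the prepared training data.
s3.upload_file("data/train.csv", bucket, f"{prefix}/data/train.csv")

# Upload a small manifest describing the data source; the S3 event on this
# object is what kicks off the pipeline's source stage.
manifest = {"data": f"s3://{bucket}/{prefix}/data/train.csv"}
s3.put_object(
    Bucket=bucket,
    Key=f"{prefix}/data-source.json",
    Body=json.dumps(manifest),
)
```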

Once your pipeline is kicked off it will run model training and deploy a development SageMaker Endpoint.

There is a manual approval step, which you can act on directly within the SageMaker Notebook, to promote this to production, send some traffic to the live endpoint and create a REST API.
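
If the approval step is the usual CodePipeline manual approval action, it can also be approved programmatically. A hedged sketch with placeholder pipeline, stage and action names (look up the real ones from the pipeline the stack created before approving):

```python
import boto3

codepipeline = boto3.client("codepipeline")

# Placeholder names -- substitute the names from the pipeline that the
# stack created for your model.
pipeline_name = "nyctaxi-pipeline"
stage_name = "DeployPrd"
action_name = "ApproveDeploy"

# The approval token is only valid while the action is awaiting approval.
state = codepipeline.get_pipeline_state(name=pipeline_name)
stage = next(s for s in state["stageStates"] if s["stageName"] == stage_name)
action = next(a for a in stage["actionStates"] if a["actionName"] == action_name)
token = action["latestExecution"]["token"]

codepipeline.put_approval_result(
    pipelineName=pipeline_name,
    stageName=stage_name,
    actionName=action_name,
    result={"summary": "Promote to production", "status": "Approved"},
    token=token,
)
```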

Expand Down Expand Up @@ -138,15 +137,20 @@ This project is written in Python, and design to be customized for your own mode
│   ├── buildspec.yml
│   ├── dashboard.json
│   ├── requirements.txt
│   └── run.py
│   └── run_pipeline.py
├── notebook
│   ├── canary.js
│   ├── dashboard.json
│   ├── workflow.ipynb
│   └── mlops.ipynb
└── pipeline.yml
├── scripts
│   ├── build.sh
│   ├── lint.sh
│   └── set_kernelspec.py
├── pipeline.yml
└── studio.yml
```

Edit the `get_training_params` method in the `model/run.py` script that is run as part of the AWS CodeBuild step to add your own estimator or model definition.
Edit the `get_training_params` method in the `model/run_pipeline.py` script that is run as part of the AWS CodeBuild step to add your own estimator or model definition.
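
As an illustration only (the actual function signature and return shape in `model/run_pipeline.py` may differ), a customized `get_training_params` might assemble SageMaker `CreateTrainingJob` parameters like this:

```python
# Hypothetical sketch -- adapt names and arguments to the real function in
# model/run_pipeline.py; this only shows where your estimator settings go.
def get_training_params(model_name, job_id, role_arn, image_uri, data_uri, output_uri):
    return {
        "TrainingJobName": f"{model_name}-{job_id}",
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,  # your own container or a built-in algorithm image
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "HyperParameters": {  # replace with your estimator's hyperparameters
            "max_depth": "9",
            "eta": "0.2",
            "objective": "reg:squarederror",
        },
        "InputDataConfig": [
            {
                "ChannelName": "train",
                "DataSource": {
                    "S3DataSource": {
                        "S3DataType": "S3Prefix",
                        "S3Uri": data_uri,
                        "S3DataDistributionType": "FullyReplicated",
                    }
                },
            }
        ],
        "OutputDataConfig": {"S3OutputPath": output_uri},
        "ResourceConfig": {
            "InstanceType": "ml.m4.xlarge",
            "InstanceCount": 1,
            "VolumeSizeInGB": 30,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }
```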

Extend the AWS Lambda hooks in `api/pre_traffic_hook.py` and `api/post_traffic_hook.py` to add your own validation or inference against the deployed Amazon SageMaker endpoints. You can also edit the `api/app.py` lambda to add any enrichment or transformation to the request/response payload.
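
As a minimal sketch of the hook pattern (not the repository's exact code -- the environment variable and payload details are assumptions), a traffic-hook Lambda typically smoke-tests the endpoint and reports the outcome back to CodeDeploy:

```python
import json
import os

import boto3

sm_runtime = boto3.client("sagemaker-runtime")
codedeploy = boto3.client("codedeploy")

# Assumed to be injected by the deployment template; the repository's hooks
# may use different variable names.
ENDPOINT_NAME = os.environ.get("ENDPOINT_NAME", "")


def lambda_handler(event, context):
    status = "Succeeded"
    try:
        # Smoke-test the freshly deployed endpoint with a tiny CSV payload.
        response = sm_runtime.invoke_endpoint(
            EndpointName=ENDPOINT_NAME,
            ContentType="text/csv",
            Body="1.0,2.0,3.0",
        )
        print("Validation prediction:", response["Body"].read())
    except Exception as err:
        # Any validation failure marks the lifecycle event as failed,
        # which makes CodeDeploy roll back the traffic shift.
        print("Validation failed:", err)
        status = "Failed"

    # Report the result back to CodeDeploy so the deployment can proceed
    # or roll back.
    codedeploy.put_lifecycle_event_hook_execution_status(
        deploymentId=event["DeploymentId"],
        lifecycleEventHookExecutionId=event["LifecycleEventHookExecutionId"],
        status=status,
    )
    return {"statusCode": 200, "body": json.dumps({"status": status})}
```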

@@ -158,8 +162,7 @@ This section outlines cost considerations for running the SageMaker Safe Deployment Pipeline
- **CodeCommit** – $1/month if you didn't opt to use your own GitHub repository.
- **CodeDeploy** – No cost with AWS Lambda.
- **CodePipeline** – CodePipeline costs $1 per active pipeline* per month. Pipelines are free for the first 30 days after creation. More can be found at [AWS CodePipeline Pricing](https://aws.amazon.com/codepipeline/pricing/).
- **CloudWatch** - This template includes a [Canary](https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Synthetics_Canaries.html), 1 dashboard and 4 alarms (2 for deployment, 1 for model drift and 1 for canary) which costs less than $10 per month.
- Canaries cost $0.0012 per run, or $5/month if they run every 10 minutes.
- **CloudWatch** - This template includes 1 dashboard and 3 alarms (2 for deployment and 1 for model drift) which costs less than $10 per month.
- Dashboards cost $3/month.
- Alarm metrics cost $0.10 per alarm.
- **CloudTrail** - Low cost, $0.10 per 100,000 data events to enable [S3 CloudWatch Event](https://docs.aws.amazon.com/codepipeline/latest/userguide/create-cloudtrail-S3-source-console.html). For more information, see [AWS CloudTrail Pricing](https://aws.amazon.com/cloudtrail/pricing/)
@@ -170,7 +173,7 @@ This section outlines cost considerations for running the SageMaker Safe Deployment Pipeline
- The `ml.t3.medium` instance *notebook* costs $0.0582 an hour.
- The `ml.m4.xlarge` instance for the *training* job costs $0.28 an hour.
- The `ml.m5.xlarge` instance for the *monitoring* baseline costs $0.269 an hour.
- The `ml.t2.medium` instance for the dev *hosting* endpoint costs $0.065 an hour.
- The two `ml.m5.large` instances for production *hosting* endpoint costs 2 x $0.134 per hour.
- The `ml.m5.xlarge` instance for the hourly scheduled *monitoring* job costs $0.269 an hour.
- **S3** – Prices will vary depending on the size of the model/artifacts stored. The first 50 TB each month will cost only $0.023 per GB stored. For more information, see [Amazon S3 Pricing](https://aws.amazon.com/s3/pricing/).
@@ -193,4 +196,3 @@ See [CONTRIBUTING](CONTRIBUTING.md#security-issue-notifications) for more information.
## License

This library is licensed under the MIT-0 License. See the LICENSE file.

4 changes: 1 addition & 3 deletions api/app.py
@@ -50,8 +50,6 @@ def lambda_handler(event, context):
"body": predictions,
}
except ClientError as e:
logger.error(
"Unexpected sagemaker error: {}".format(e.response["Error"]["Message"])
)
logger.error("Unexpected sagemaker error: {}".format(e.response["Error"]["Message"]))
logger.error(e)
return {"statusCode": 500, "message": "Unexpected sagemaker error"}
13 changes: 3 additions & 10 deletions api/pre_traffic_hook.py
@@ -29,16 +29,9 @@ def lambda_handler(event, context):
else:
# Validate that endpoint config has data capture enabled
endpoint_config_name = response["EndpointConfigName"]
response = sm.describe_endpoint_config(
EndpointConfigName=endpoint_config_name
)
if (
"DataCaptureConfig" in response
and response["DataCaptureConfig"]["EnableCapture"]
):
logger.info(
"data capture enabled for endpoint config %s", endpoint_config_name
)
response = sm.describe_endpoint_config(EndpointConfigName=endpoint_config_name)
if "DataCaptureConfig" in response and response["DataCaptureConfig"]["EnableCapture"]:
logger.info("data capture enabled for endpoint config %s", endpoint_config_name)
else:
error_message = "SageMaker data capture not enabled for endpoint config"
# TODO: Invoke endpoint if we don't have canary / live traffic
8 changes: 4 additions & 4 deletions assets/deploy-model-dev.yml
@@ -23,10 +23,10 @@ Resources:
Model:
Type: "AWS::SageMaker::Model"
Properties:
ModelName: !Sub mlops-${ModelName}-dev-${TrainJobId}
ModelName: !Sub ${ModelName}-dev-${TrainJobId}
PrimaryContainer:
Image: !Ref ImageRepoUri
ModelDataUrl: !Sub s3://sagemaker-${AWS::Region}-${AWS::AccountId}/${ModelName}/mlops-${ModelName}-${TrainJobId}/output/model.tar.gz
ModelDataUrl: !Sub s3://sagemaker-${AWS::Region}-${AWS::AccountId}/${ModelName}/${ModelName}-${TrainJobId}/output/model.tar.gz
ExecutionRoleArn: !Ref DeployRoleArn

EndpointConfig:
@@ -38,11 +38,11 @@ Resources:
InstanceType: ml.t2.medium
ModelName: !GetAtt Model.ModelName
VariantName: !Sub ${ModelVariant}-${ModelName}
EndpointConfigName: !Sub mlops-${ModelName}-dec-${TrainJobId}
EndpointConfigName: !Sub ${ModelName}-dec-${TrainJobId}
KmsKeyId: !Ref KmsKeyId

Endpoint:
Type: "AWS::SageMaker::Endpoint"
Properties:
EndpointName: !Sub mlops-${ModelName}-dev-${TrainJobId}
EndpointName: !Sub ${ModelName}-dev-${TrainJobId}
EndpointConfigName: !GetAtt EndpointConfig.EndpointConfigName