Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(sagemaker): Initial version of the L2 construct for SageMaker #2888

Closed
wants to merge 11 commits into from

Conversation

mattmcclean
Copy link
Contributor

I have written up the initial L2 Construct for SageMaker. It is quite basic and there can be further enhancements made over time such as adding support for Autoscaling of the endpoint.

It creates 3 L2 resources including:

  • NotebookInstance - creates the SageMaker NotebookInstance and associated LifecycleConfig CFN resources
  • Model - creates the SageMaker Model CFN resource
  • Endpoint - creates the Endpoint + EndpointConfig CFN resources

Fixes #2809

Pull Request Checklist

  • Testing
    • Unit test added (prefer not to modify an existing test, otherwise, it's probably a breaking change)
    • CLI change?: coordinate update of integration tests with team
    • cdk-init template change?: coordinated update of integration tests with team
  • Docs
    • jsdocs: All public APIs documented
    • README: README and/or documentation topic updated
    • Design: For significant features, design document added to design folder
  • Title and Description
    • Change type: title prefixed with fix, feat and module name in parens, which will appear in changelog
    • Title: use lower-case and doesn't end with a period
    • Breaking?: last paragraph: "BREAKING CHANGE: <describe what changed + link for details>"
    • Issues: Indicate issues fixed via: "Fixes #xxx" or "Closes #xxx"
  • Sensitive Modules (requires 2 PR approvers)
    • IAM Policy Document (in @aws-cdk/aws-iam)
    • EC2 Security Groups and ACLs (in @aws-cdk/aws-ec2)
    • Grant APIs (only if not based on official documentation with a reference)

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license.

@mattmcclean mattmcclean requested a review from a team as a code owner June 17, 2019 13:55
@skinny85
Copy link
Contributor

skinny85 commented Aug 5, 2019

Sorry for leaving this PR out in the cold @mattmcclean . Unfortunately, you submitted it right before our mad dash to release 1.0.0.

Now that that's over, I'll be taking over the PR. Expect a review soon.

Thanks, and apologies again.

Copy link
Contributor

@skinny85 skinny85 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an extremely high-quality contribution @mattmcclean , thank you so much!

I have some comments, but most of them are from conventions in the Construct Library that we follow, but maybe haven't documented (and are not checking with our linter) well enough.

One big change though: we use 2 spaces for indentation in our code. I understand 4 is the default in most tools, but could you please change it to 2, and re-format the code?

Thanks again for the awesome contribution!

@@ -15,6 +15,78 @@
---
<!--END STABILITY BANNER-->

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually like to start the ReadMe with a short blurb about the service itself (usually copied from AWS documentation), and then installation and import instructions. See CodeBuild for an example.

So, this should be something like:


Amazon SageMaker provides every developer and data scientist with the ability to build, train, and deploy machine learning models quickly. Amazon SageMaker is a fully-managed service that covers the entire machine learning workflow to label and prepare your data, choose an algorithm, train the model, tune and optimize it for deployment, make predictions, and take action. Your models get to production faster with much less effort and lower cost.

Installation

Install the module:

$ npm i @aws-cdk/aws-sagemaker

Import it into your code:

import sagemaker = require('@aws-cdk/aws-sagemaker');

new NotebookInstance(this, 'MyNotebook', {
vpc,
kmsKeyId: key,
instanceType = new ec2.InstanceType('p3.2xlarge'),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use the InstanceType.of method here instead? It's a little "nicer" I think :)

new NotebookInstance(this, 'MyNotebook');
```

Add a KMS encryption key, launch in own VPC, set the instance type to 'ml.p3.xlarge', disable direct internet access and set the EBS volume size to 100 GB.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sentence should probably should end with a colon instead of a period.

});
```

Add custom scripts when notebook instance is created and started.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here (period -> colon).


### Models

Define a model.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Colon :)

const result = super.validate();
// check we have at least one production variant
if (this.productionVariants.length === 0) {
result.push("Must have at least one Production Variant");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So here's an interesting pattern that you can use here.

Since at least one ProductionVariant is required, have in props:

productionVariant: ProductionVariant; // required
extraProductionVariants?: ProductionVariant[]; // optional

This way, you communicate to the clients of this class, at compile time, that at least one variant has to be provided.

* Validates the provided instance type.
* @param instanceType the instance type of the SageMaker endpoint production variant.
*/
private validateInstanceType(instanceType: string, result: string[]) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just have this return string[], and merge the errors in the client of this method, instead of mutating a parameter. I think keeping methods pure is simply easier.

*/
private validateInstanceType(instanceType: string, result: string[]) {
// check if a valid sagemaker instance type
if (!['c4', 'c5', 'm4', 'm5', 'p2', 'p3', 't2'].some(instanceClass => instanceType.indexOf(instanceClass) >= 0)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly the same as in Notebook - extract to a module-private function (and include the allowed types in the error message).

});

// THEN
expect(stack).to(haveResource("AWS::SageMaker::EndpointConfig", {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

General comment for all tests: we have also a haveResourceLike matcher that allows you to check only the parts of the resource that you're interested in in a given test. Might come in handy for these.


test.done();
},
"it throws error when incorrect variant weight given"(test: Test) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please always have an empty line between tests on the same level of indentation.

test.equal(errors.length, 1);
const error = errors[0];
test.same(error.source, endpoint);
test.equal(error.message, "Must have at least one Production Variant");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can simplify this with something like:

test.deepEqual(errors.map(e => e.message), ["Must have at least one Production Variant"]);

Same comment for all tests dealing with error checking :)

@mergify
Copy link
Contributor

mergify bot commented Nov 1, 2019

Thanks so much for taking the time to contribute to the AWS CDK ❤️

We will shortly assign someone to review this pull request and help get it
merged. In the meantime, please take a minute to make sure you follow this
checklist
:

  • PR title type(scope): text
    • type: fix, feat, refactor go into CHANGELOG, chore is hidden
    • scope: name of module without aws- or cdk- prefix or postfix (e.g. s3 instead of aws-s3-deployment)
    • text: use all lower-case, do not end with a period, do not include issue refs
  • PR Description
    • Rationale: describe rationale of change and approach taken
    • Issues: indicate issues fixed via: fixes #xxx or closes #xxx
    • Breaking?: last paragraph: BREAKING CHANGE: <describe what changed + link for details>
  • Testing
    • Unit test added. Prefer to add a new test rather than modify existing tests
    • CLI or init templates change? Re-run/add CLI integration tests
  • Documentation
    • README: update module README to describe new features
    • API docs: public APIs must be documented. Copy from official AWS docs when possible
    • Design: for significant features, follow design process

6 similar comments
@mergify
Copy link
Contributor

mergify bot commented Dec 16, 2019

Thanks so much for taking the time to contribute to the AWS CDK ❤️

We will shortly assign someone to review this pull request and help get it
merged. In the meantime, please take a minute to make sure you follow this
checklist
:

  • PR title type(scope): text
    • type: fix, feat, refactor go into CHANGELOG, chore is hidden
    • scope: name of module without aws- or cdk- prefix or postfix (e.g. s3 instead of aws-s3-deployment)
    • text: use all lower-case, do not end with a period, do not include issue refs
  • PR Description
    • Rationale: describe rationale of change and approach taken
    • Issues: indicate issues fixed via: fixes #xxx or closes #xxx
    • Breaking?: last paragraph: BREAKING CHANGE: <describe what changed + link for details>
  • Testing
    • Unit test added. Prefer to add a new test rather than modify existing tests
    • CLI or init templates change? Re-run/add CLI integration tests
  • Documentation
    • README: update module README to describe new features
    • API docs: public APIs must be documented. Copy from official AWS docs when possible
    • Design: for significant features, follow design process

@mergify
Copy link
Contributor

mergify bot commented Dec 16, 2019

Thanks so much for taking the time to contribute to the AWS CDK ❤️

We will shortly assign someone to review this pull request and help get it
merged. In the meantime, please take a minute to make sure you follow this
checklist
:

  • PR title type(scope): text
    • type: fix, feat, refactor go into CHANGELOG, chore is hidden
    • scope: name of module without aws- or cdk- prefix or postfix (e.g. s3 instead of aws-s3-deployment)
    • text: use all lower-case, do not end with a period, do not include issue refs
  • PR Description
    • Rationale: describe rationale of change and approach taken
    • Issues: indicate issues fixed via: fixes #xxx or closes #xxx
    • Breaking?: last paragraph: BREAKING CHANGE: <describe what changed + link for details>
  • Testing
    • Unit test added. Prefer to add a new test rather than modify existing tests
    • CLI or init templates change? Re-run/add CLI integration tests
  • Documentation
    • README: update module README to describe new features
    • API docs: public APIs must be documented. Copy from official AWS docs when possible
    • Design: for significant features, follow design process

@mergify
Copy link
Contributor

mergify bot commented Dec 16, 2019

Thanks so much for taking the time to contribute to the AWS CDK ❤️

We will shortly assign someone to review this pull request and help get it
merged. In the meantime, please take a minute to make sure you follow this
checklist
:

  • PR title type(scope): text
    • type: fix, feat, refactor go into CHANGELOG, chore is hidden
    • scope: name of module without aws- or cdk- prefix or postfix (e.g. s3 instead of aws-s3-deployment)
    • text: use all lower-case, do not end with a period, do not include issue refs
  • PR Description
    • Rationale: describe rationale of change and approach taken
    • Issues: indicate issues fixed via: fixes #xxx or closes #xxx
    • Breaking?: last paragraph: BREAKING CHANGE: <describe what changed + link for details>
  • Testing
    • Unit test added. Prefer to add a new test rather than modify existing tests
    • CLI or init templates change? Re-run/add CLI integration tests
  • Documentation
    • README: update module README to describe new features
    • API docs: public APIs must be documented. Copy from official AWS docs when possible
    • Design: for significant features, follow design process

@mergify
Copy link
Contributor

mergify bot commented Dec 16, 2019

Thanks so much for taking the time to contribute to the AWS CDK ❤️

We will shortly assign someone to review this pull request and help get it
merged. In the meantime, please take a minute to make sure you follow this
checklist
:

  • PR title type(scope): text
    • type: fix, feat, refactor go into CHANGELOG, chore is hidden
    • scope: name of module without aws- or cdk- prefix or postfix (e.g. s3 instead of aws-s3-deployment)
    • text: use all lower-case, do not end with a period, do not include issue refs
  • PR Description
    • Rationale: describe rationale of change and approach taken
    • Issues: indicate issues fixed via: fixes #xxx or closes #xxx
    • Breaking?: last paragraph: BREAKING CHANGE: <describe what changed + link for details>
  • Testing
    • Unit test added. Prefer to add a new test rather than modify existing tests
    • CLI or init templates change? Re-run/add CLI integration tests
  • Documentation
    • README: update module README to describe new features
    • API docs: public APIs must be documented. Copy from official AWS docs when possible
    • Design: for significant features, follow design process

@mergify
Copy link
Contributor

mergify bot commented Dec 16, 2019

Thanks so much for taking the time to contribute to the AWS CDK ❤️

We will shortly assign someone to review this pull request and help get it
merged. In the meantime, please take a minute to make sure you follow this
checklist
:

  • PR title type(scope): text
    • type: fix, feat, refactor go into CHANGELOG, chore is hidden
    • scope: name of module without aws- or cdk- prefix or postfix (e.g. s3 instead of aws-s3-deployment)
    • text: use all lower-case, do not end with a period, do not include issue refs
  • PR Description
    • Rationale: describe rationale of change and approach taken
    • Issues: indicate issues fixed via: fixes #xxx or closes #xxx
    • Breaking?: last paragraph: BREAKING CHANGE: <describe what changed + link for details>
  • Testing
    • Unit test added. Prefer to add a new test rather than modify existing tests
    • CLI or init templates change? Re-run/add CLI integration tests
  • Documentation
    • README: update module README to describe new features
    • API docs: public APIs must be documented. Copy from official AWS docs when possible
    • Design: for significant features, follow design process

@mergify
Copy link
Contributor

mergify bot commented Dec 16, 2019

Thanks so much for taking the time to contribute to the AWS CDK ❤️

We will shortly assign someone to review this pull request and help get it
merged. In the meantime, please take a minute to make sure you follow this
checklist
:

  • PR title type(scope): text
    • type: fix, feat, refactor go into CHANGELOG, chore is hidden
    • scope: name of module without aws- or cdk- prefix or postfix (e.g. s3 instead of aws-s3-deployment)
    • text: use all lower-case, do not end with a period, do not include issue refs
  • PR Description
    • Rationale: describe rationale of change and approach taken
    • Issues: indicate issues fixed via: fixes #xxx or closes #xxx
    • Breaking?: last paragraph: BREAKING CHANGE: <describe what changed + link for details>
  • Testing
    • Unit test added. Prefer to add a new test rather than modify existing tests
    • CLI or init templates change? Re-run/add CLI integration tests
  • Documentation
    • README: update module README to describe new features
    • API docs: public APIs must be documented. Copy from official AWS docs when possible
    • Design: for significant features, follow design process

petermeansrock added a commit to petermeansrock/aws-cdk that referenced this pull request Feb 4, 2020
Based closely on PR aws#2888, this commit introduces the Endpoint L2
construct and the L2 constructs on which it depends, including Model and
EndpointConfig. Departures from PR aws#2888 include:
- EndpointConfig definition moved into its own L2 construct as one
  EndpointConfig resource may be shared by multiple Endpoints.
- An Endpoint-specific IEndpointProductionVariant interface was added to
  support "metric*" and "autoScale*" APIs per endpoint-variant
  combination.
- The Notebook construct was excluded from this new commit to limit
  changes to model hosting use-cases only.
- Feedback on the earlier PR has been incorporated into this new commit.

fixes aws#2809

Co-authored-by: Matt McClean <[email protected]>
Co-authored-by: Yao <[email protected]>
Co-authored-by: Drew Jetter <[email protected]>
Co-authored-by: Murali Ganesh <[email protected]>
Co-authored-by: Abilash Rangoju <[email protected]>
@skinny85
Copy link
Contributor

Closing in favor of #6107.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sagemaker: Create L2 construct for Amazon SageMaker
3 participants