-
Notifications
You must be signed in to change notification settings - Fork 4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(sagemaker): add Model L2 construct #22549
Conversation
This is the first of three PRs to complete the implementation of RFC 431: aws/aws-cdk-rfcs#431 fixes aws#2809 Co-authored-by: Matt McClean <[email protected]> Co-authored-by: Long Yao <[email protected]> Co-authored-by: Drew Jetter <[email protected]> Co-authored-by: Murali Ganesh <[email protected]> Co-authored-by: Abilash Rangoju <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @petermeansrock! Sorry for the delay in getting this reviewed. In general, this is very high quality, thanks :). I have a few comments here and there, but no blockers. After they're addressed, i'll take a more scrutinizing look on the tests, but in general they look good. and then we'll merge!
}); | ||
|
||
test('by supplying a file with an unsupported extension, an exception is thrown', () => { | ||
// GIVEN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super nit: if there's nothing given, we might as well omit this and save 2 lines :)
* For example, 012345678910.dkr.ecr.<region-name>.amazonaws.com/<repository-name>:latest or | ||
* 012345678910.dkr.ecr.<region-name>.amazonaws.com/<repository-name>@sha256:94afd1f2e64d908bc90dbca0035a5b567EXAMPLE. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* For example, 012345678910.dkr.ecr.<region-name>.amazonaws.com/<repository-name>:latest or | |
* 012345678910.dkr.ecr.<region-name>.amazonaws.com/<repository-name>@sha256:94afd1f2e64d908bc90dbca0035a5b567EXAMPLE. | |
* For example, `012345678910.dkr.ecr.<region-name>.amazonaws.com/<repository-name>:latest` or | |
* `012345678910.dkr.ecr.<region-name>.amazonaws.com/<repository-name>@sha256:94afd1f2e64d908bc90dbca0035a5b567EXAMPLE`. |
// The only supported extension for local asset model data | ||
const ARTIFACT_EXTENSION = '.tar.gz'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you think this might be extended by sagemaker anytime soon? if there's a chance that they do, it feels more maintainable to have this be a list:
const ARTIFACT_EXTENSIONS = ['.tar.gz'];
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This is slightly orthogonal to your question, but I first want to call it out as your question itself is forward looking.
Technically, SageMaker has already tweaked this behavior, but for multi-model support (described in the RFC as a future addition that can be added, so not covered by this PR).
With multi-model endpoints, SageMaker expects the ModelDataUrl
to contain an S3 prefix. At runtime, when SageMaker receives an InvokeEndpoint
request for multi-model endpoints, a customer passes a TargetModel
parameter specifying the S3 suffix (relative to the ModelDataUrl
pointing to the model artifacts).
So assume in S3 I have the following object keys:
s3://bucket/some/path/foo.tar.gz
s3://bucket/some/path/bar.tar.gz
With this setup, at configuration time, I would create my model with a ModelDataUrl
of s3://bucket/some/path/
.
At runtime, when invoking the endpoint, I'd either pass foo.tar.gz
or bar.tar.gz
as my TargetModel
.
Unlike with single models (including inference pipelines, both covered in this PR), with multi-models, SageMaker doesn't know at the point of instance provisioning which model artifacts to load. It instead lazily loads the artifacts at request-time.
To close on this comment:
- I think the
.tar.gz
check makes sense for now as it still protects customers in the single model scenario. - I'm not sure what the best strategy would be for protecting against the multi-model scenario, but I assume figuring that out is out-of-scope for this PR.
- Given that
.tar.gz
has been the only supported artifact type for 3+ years (regardless of single vs multi-model), I don't expect that to change any time soon. In other words, it seems sufficient to keep the existing single artifact suffix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree that the scenario you describe is out-of-scope for this PR. I think for the multi-model scenario the best we can do is just skip this synth-time check and leave it to deploy-time unfortunately. But not necessary at all for this first PR.
// apply a name tag to the model resource | ||
cdk.Tags.of(this).add(NAME_TAG, this.node.path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe this is just my unfamiliarity with the resource, but is there a reason why we always add a tag to the resource?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually thought this was a more prevalent pattern in the CDK itself, but upon performing a more recent search, it looks like only a handful of modules do this. I'm not personally attached to this tag, so if you'd like, I can just rip this out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I've never seen this before, which is the cause for my confusion. In these scenarios I prefer to leave them out. We can always add them in later, but we can't remove it later if it's already there.
// post-construction validation | ||
this.node.addValidation({ validate: () => this.validateContainers() }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this not called inside of this.renderContainers()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it just seemed intuitive to me to perform synthesis-time validation using the validation hook. I can certainly move the check though if you'd like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can still have this.validateContainers()
, but that should be called istelf inside this.renderContainers()
. I call this out because it seems different from the designs of other L2s, and for maintainability purposes, it's probably good to keep things relatively in line with other L2s.
/** | ||
* The name of this model. | ||
*/ | ||
readonly modelName: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why modelName
here? The "real" required property is modelArn
, which you are then splicing together in the Import
construct. However, this then requires the model to exist in the same environment that the stack is in. We're not serving customers who have models in different environments that they want to reference here.
I think this is what we want:
- change
modelName
tomodelArn
here as a "required" attribute. - the implementation of
Model.fromModelName()
will need to do the arn generation before passing the arn intofromModelAttributes
. - optionally include a
fromModelArn()
method.
In the actual implementation of the Import
construct, I'll write a comment on how we can get the environment from the arn to support this use case i'm describing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "real" required property is
modelArn
, which you are then splicing together in the Import construct.
What do you mean by "'real' required property"? This modelName
field represents the physical name of the Model
resource. It corresponds to the AWS::SageMaker::Model
resource's ModelName
attribute. Here are the snippets relevant to the physical name and its use on the L1 CfnModel
construct:
super(scope, id, {
physicalName: props.modelName,
});
...
const model = new CfnModel(this, 'Model', {
executionRoleArn: this.role.roleArn,
modelName: this.physicalName,
primaryContainer: cdk.Lazy.any({ produce: () => this.renderPrimaryContainer() }),
vpcConfig: cdk.Lazy.any({ produce: () => this.renderVpcConfig() }),
containers: cdk.Lazy.any({ produce: () => this.renderContainers() }),
});
However, this then requires the model to exist in the same environment that the stack is in. We're not serving customers who have models in different environments that they want to reference here.
I think I might be missing something. Model
resources cannot be shared cross-account nor cross-region. A Model
is always referenced (via CreateEndpoint
, CreateTransformJob
) by its name and not ARN. Doesn't that mean that this PR is working as intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By 'real', I mean that in the fromModelAttributes
method further below, we do the following:
this.modelArn = cdk.Stack.of(this).formatArn({
service: 'sagemaker',
resource: 'model',
resourceName: this.modelName,
});
i.e. we need the modelArn anyway.
While Model
resources cannot be shared cross-account or cross-region today, if that were ever possible we would immediately have a bug in how we import resources. See https://github.com/aws/aws-cdk/pull/22549/files#r1008209404 for what that bug is.
I don't have a problem with having a fromModelName
method the way you have it currently. I think we should also have a fromModelArn
method. It would look something like this:
public static fromModelArn(scope: Construct, id: string, arn: string): IModel {
const modelName = // splice arn to get name
return Model.fromModelAttributes(scope, id, { modelName, modelArn: arn });
}
public static fromModelName(scope: Construct, id: string, name: string): IModel {
const modelArn = cdk.Stack.of(scope).formatArn({
service: 'sagemaker',
resource: 'model',
resourceName: name,
});
return Model.fromModelAttributes(scope, id, { modelName: name, modelArn: modelArn });
}
public static fromModelAttributes(scope: Construct, id: string, attrs: ModelAttributes): IModel {
const modelName = attrs.modelName;
const role = attrs.role;
const modelArn = attrs.modelArn;
class Import extends ModelBase {
public readonly modelName = modelName;
public readonly role = role;
public readonly grantPrincipal: iam.IPrincipal;
public readonly modelArn = modelArn;
constructor(s: Construct, i: string) {
super(s, i, {
environmentFromArn: modelArn,
});
...
}
}
The payoff is the environmentFromArn
. Yes, it may be useless today because Models
cannot be shared cross env. However, we are trying to standardize how we do imports for L2 to guard against this cross env sharing if/when it gets introduced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean I'd end up using a technique like this to prevent customers from attempting to use IModel
across accounts-or-regions downstream within the EndpointConfig
implementation (and from IEndpointConfig
within the Endpoint
implementation)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can just do that in fromModelAttributes
for now -- if the arn region / account is not the same as the scope region / account, then throw an error.
This is still better than before because we now explicitly throw an error in these scenarios, and can easily remove the synth-time check in the future if sagemaker decides to support this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that I think about it, there are two additional cross-account-and-region scenarios I need to guard against as well (which I've not yet done in downstream changes, including EndpointConfig
for example):
- Reusing a
Model
instance from an origin stack in a different account/region:const app = new cdk.App(); const originStack = new cdk.Stack(app, 'OriginStack', { env: { region: 'us-west-2', account: '123456789012', }, }); const originStackModel = new sagemaker.Model(originStack, 'MyModel', { modelName: 'explicitly-named-model', containers: [{ image: ContainerImage.fromAsset(path.join(__dirname, 'test-image')), }], }); const destinationStack = new cdk.Stack(app, 'DestinationStack', { env: { region: 'us-west-2', account: '123456789013', }, }); // FIXME: This should fail at synthesis but doesn't currently new sagemaker.EndpointConfig(destinationStack, 'MyEndpointConfig', { instanceProductionVariants: [{ variantName: 'my-variant', model: originStackModel, }], }); app.synth();
- Using an unowned
IModel
instance (generated viaModel.from*
) from a different account region:const app = new cdk.App(); const originStack = new cdk.Stack(app, 'OriginStack', { env: { region: 'us-west-2', account: '123456789012', }, }); const originStackModel = sagemaker.Model.fromModelName(originStack, 'MyModel', 'my-name'); const destinationStack = new cdk.Stack(app, 'DestinationStack', { env: { region: 'us-west-2', account: '123456789013', }, }); // FIXME: This should fail at synthesis but doesn't currently new sagemaker.EndpointConfig(destinationStack, 'MyEndpointConfig', { instanceProductionVariants: [{ variantName: 'my-variant', model: originStackModel, }], }); app.synth();
Rather than implement one region/account check in fromModelAttributes
, wouldn't it be better for me to handle both Model
and unowned IModel
instances universally within EndpointConfig
? If so, since I've already implemented Model.fromModelArn
without any region/account checks infromModelAttributes
, this PR is ready for review. I'll go ahead & mark it as so just in case, but if you think that an additional check here is warranted, I'll incorporate that change as well.
public readonly modelArn: string; | ||
|
||
constructor(s: Construct, i: string) { | ||
super(s, i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super(s, i); | |
super(s, i, { | |
environmentFromArn: modelArn, | |
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this gets the environment from the arn, so that we serve imports from different envs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback, @kaizencc! Here are a set of follow-up questions that I had for a subset of your comments.
// The only supported extension for local asset model data | ||
const ARTIFACT_EXTENSION = '.tar.gz'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: This is slightly orthogonal to your question, but I first want to call it out as your question itself is forward looking.
Technically, SageMaker has already tweaked this behavior, but for multi-model support (described in the RFC as a future addition that can be added, so not covered by this PR).
With multi-model endpoints, SageMaker expects the ModelDataUrl
to contain an S3 prefix. At runtime, when SageMaker receives an InvokeEndpoint
request for multi-model endpoints, a customer passes a TargetModel
parameter specifying the S3 suffix (relative to the ModelDataUrl
pointing to the model artifacts).
So assume in S3 I have the following object keys:
s3://bucket/some/path/foo.tar.gz
s3://bucket/some/path/bar.tar.gz
With this setup, at configuration time, I would create my model with a ModelDataUrl
of s3://bucket/some/path/
.
At runtime, when invoking the endpoint, I'd either pass foo.tar.gz
or bar.tar.gz
as my TargetModel
.
Unlike with single models (including inference pipelines, both covered in this PR), with multi-models, SageMaker doesn't know at the point of instance provisioning which model artifacts to load. It instead lazily loads the artifacts at request-time.
To close on this comment:
- I think the
.tar.gz
check makes sense for now as it still protects customers in the single model scenario. - I'm not sure what the best strategy would be for protecting against the multi-model scenario, but I assume figuring that out is out-of-scope for this PR.
- Given that
.tar.gz
has been the only supported artifact type for 3+ years (regardless of single vs multi-model), I don't expect that to change any time soon. In other words, it seems sufficient to keep the existing single artifact suffix.
/** | ||
* The name of this model. | ||
*/ | ||
readonly modelName: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "real" required property is
modelArn
, which you are then splicing together in the Import construct.
What do you mean by "'real' required property"? This modelName
field represents the physical name of the Model
resource. It corresponds to the AWS::SageMaker::Model
resource's ModelName
attribute. Here are the snippets relevant to the physical name and its use on the L1 CfnModel
construct:
super(scope, id, {
physicalName: props.modelName,
});
...
const model = new CfnModel(this, 'Model', {
executionRoleArn: this.role.roleArn,
modelName: this.physicalName,
primaryContainer: cdk.Lazy.any({ produce: () => this.renderPrimaryContainer() }),
vpcConfig: cdk.Lazy.any({ produce: () => this.renderVpcConfig() }),
containers: cdk.Lazy.any({ produce: () => this.renderContainers() }),
});
However, this then requires the model to exist in the same environment that the stack is in. We're not serving customers who have models in different environments that they want to reference here.
I think I might be missing something. Model
resources cannot be shared cross-account nor cross-region. A Model
is always referenced (via CreateEndpoint
, CreateTransformJob
) by its name and not ARN. Doesn't that mean that this PR is working as intended?
/** | ||
* Hostname of the container. | ||
* | ||
* @default - none |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the docs (which is a lot more complicated than I remember):
This parameter is ignored for models that contain only a PrimaryContainer. When a ContainerDefinition is part of an inference pipeline, the value of the parameter uniquely identifies the container for the purposes of logging and metrics. For information, see Use Logs and Metrics to Monitor an Inference Pipeline. If you don't specify a value for this parameter for a ContainerDefinition that is part of an inference pipeline, a unique name is automatically assigned based on the position of the ContainerDefinition in the pipeline. If you specify a value for the ContainerHostName for any ContainerDefinition that is part of an inference pipeline, you must specify a value for the ContainerHostName parameter of every ContainerDefinition in that pipeline.
So, in summary:
- For a single container model, the value is always ignored.
- For inference pipeline models:
- If one hostname is specified for a container, hostnames must be specified for all of them.
- Otherwise, a unique name is automatically "assigned" by SageMaker.
Here are some action items that could come out of this:
- (Debatable) - Throw an exception on synthesis of single container model's with a hostname specified.
- Throw an exception on synthesis of an inference pipeline model with partially specified hostnames.
- Expand
ContainerDefinition
documentation forcontainerHostname
to include these details (including real@default
behavior).
Please let me know (1) if those sound good and (2) if you can think of others.
// apply a name tag to the model resource | ||
cdk.Tags.of(this).add(NAME_TAG, this.node.path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually thought this was a more prevalent pattern in the CDK itself, but upon performing a more recent search, it looks like only a handful of modules do this. I'm not personally attached to this tag, so if you'd like, I can just rip this out.
// post-construction validation | ||
this.node.addValidation({ validate: () => this.validateContainers() }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it just seemed intuitive to me to perform synthesis-time validation using the validation hook. I can certainly move the check though if you'd like.
Co-authored-by: Kaizen Conroy <[email protected]>
Pull request has been modified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great! Any sharp edges that come out of this can be worked on in separate PRs -- it's probably more important to get this out there and usable as an experimental
construct first. Great work!
Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork). |
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
@Mergifyio requeue |
❌ Command disallowed due to command restrictions in the Mergify configuration.
|
Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork). |
This is the first of three PRs to complete the implementation of RFC 431:
aws/aws-cdk-rfcs#431
fixes #2809
Co-authored-by: Matt McClean [email protected]
Co-authored-by: Long Yao [email protected]
Co-authored-by: Drew Jetter [email protected]
Co-authored-by: Murali Ganesh [email protected]
Co-authored-by: Abilash Rangoju [email protected]
All Submissions:
Adding new Unconventional Dependencies:
New Features
yarn integ
to deploy the infrastructure and generate the snapshot (i.e.yarn integ
without--dry-run
)?By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license