feat(sagemaker): add support uncompressed model #30949

Closed
Changes from 5 commits
80 changes: 44 additions & 36 deletions packages/@aws-cdk/aws-sagemaker-alpha/README.md
@@ -35,13 +35,12 @@ In the event that a single container is sufficient for your inference use-case,
single-container model:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker-alpha';
import * as path from 'path';

const image = sagemaker.ContainerImage.fromAsset(path.join('path', 'to', 'Dockerfile', 'directory'));
const modelData = sagemaker.ModelData.fromAsset(path.join('path', 'to', 'artifact', 'file.tar.gz'));
const image = ContainerImage.fromAsset(path.join('path', 'to', 'Dockerfile', 'directory'));
const modelData = ModelData.fromAsset(path.join('path', 'to', 'artifact', 'file.tar.gz'));

const model = new sagemaker.Model(this, 'PrimaryContainerModel', {
const model = new Model(this, 'PrimaryContainerModel', {
containers: [
{
image: image,
@@ -60,16 +59,15 @@ more about SageMaker inference pipelines. To define an inference pipeline, you c
additional containers for your model:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker-alpha';

declare const image1: sagemaker.ContainerImage;
declare const modelData1: sagemaker.ModelData;
declare const image2: sagemaker.ContainerImage;
declare const modelData2: sagemaker.ModelData;
declare const image3: sagemaker.ContainerImage;
declare const modelData3: sagemaker.ModelData;
declare const image1: ContainerImage;
declare const modelData1: ModelData;
declare const image2: ContainerImage;
declare const modelData2: ModelData;
declare const image3: ContainerImage;
declare const modelData3: ModelData;

const model = new sagemaker.Model(this, 'InferencePipelineModel', {
const model = new Model(this, 'InferencePipelineModel', {
containers: [
{ image: image1, modelData: modelData1 },
{ image: image2, modelData: modelData2 },
@@ -89,10 +87,9 @@ abstract base class.
Reference a local directory containing a Dockerfile:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker-alpha';
import * as path from 'path';

const image = sagemaker.ContainerImage.fromAsset(path.join('path', 'to', 'Dockerfile', 'directory'));
const image = ContainerImage.fromAsset(path.join('path', 'to', 'Dockerfile', 'directory'));
```

#### ECR Image
@@ -101,23 +98,21 @@ Reference an image available within ECR:

```typescript
import * as ecr from 'aws-cdk-lib/aws-ecr';
import * as sagemaker from '@aws-cdk/aws-sagemaker-alpha';

const repository = ecr.Repository.fromRepositoryName(this, 'Repository', 'repo');
const image = sagemaker.ContainerImage.fromEcrRepository(repository, 'tag');
const image = ContainerImage.fromEcrRepository(repository, 'tag');
```

#### DLC Image

Reference a deep learning container image:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker-alpha';

const repositoryName = 'huggingface-pytorch-training';
const tag = '1.13.1-transformers4.26.0-gpu-py39-cu117-ubuntu20.04';

const image = sagemaker.ContainerImage.fromDlc(repositoryName, tag);
const image = ContainerImage.fromDlc(repositoryName, tag);
```

### Model Artifacts
@@ -132,10 +127,9 @@ base class. The default is to have no model artifacts associated with a model.
Reference local model data:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker-alpha';
import * as path from 'path';

const modelData = sagemaker.ModelData.fromAsset(path.join('path', 'to', 'artifact', 'file.tar.gz'));
const modelData = ModelData.fromAsset(path.join('path', 'to', 'artifact', 'file.tar.gz'));
```

#### S3 Model Data
@@ -144,10 +138,28 @@ Reference an S3 bucket and object key as the artifacts for a model:

```typescript
import * as s3 from 'aws-cdk-lib/aws-s3';
import * as sagemaker from '@aws-cdk/aws-sagemaker-alpha';

const bucket = new s3.Bucket(this, 'MyBucket');
const modelData = sagemaker.ModelData.fromBucket(bucket, 'path/to/artifact/file.tar.gz');
const modelData = ModelData.fromBucket(bucket, 'path/to/artifact/file.tar.gz');
```

When deploying ML models, one option is to archive and compress the model artifacts into a tar.gz file.
This works well for small models, but compressing a large model artifact with hundreds of billions of parameters and
then decompressing it on an endpoint can take a significant amount of time.
For large model inference, we recommend that you deploy an uncompressed ML model.

To use an uncompressed ML model, pass options to `ModelData.fromBucket`,
as in the following code:

```typescript
import * as s3 from 'aws-cdk-lib/aws-s3';

const bucket = new s3.Bucket(this, 'MyBucket');
const modelData = ModelData.fromBucket(bucket, 'path/to/artifact/', {
compressionType: CompressionType.NONE,
s3DataType: S3DataType.S3_PREFIX,
});
```

## Model Hosting
@@ -168,12 +180,11 @@ for model B. Amazon SageMaker distributes two-thirds of the traffic to Model A,
model B:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker-alpha';

declare const modelA: sagemaker.Model;
declare const modelB: sagemaker.Model;
declare const modelA: Model;
declare const modelB: Model;

const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', {
const endpointConfig = new EndpointConfig(this, 'EndpointConfig', {
instanceProductionVariants: [
{
model: modelA,
@@ -199,24 +210,22 @@ more information about the API, see the
API. Defining an endpoint requires at minimum the associated endpoint configuration:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker-alpha';

declare const endpointConfig: sagemaker.EndpointConfig;
declare const endpointConfig: EndpointConfig;

const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
const endpoint = new Endpoint(this, 'Endpoint', { endpointConfig });
```

### AutoScaling

To enable autoscaling on the production variant, use the `autoScaleInstanceCount` method:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker-alpha';

declare const model: sagemaker.Model;
declare const model: Model;

const variantName = 'my-variant';
const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', {
const endpointConfig = new EndpointConfig(this, 'EndpointConfig', {
instanceProductionVariants: [
{
model: model,
@@ -225,7 +234,7 @@ const endpointConfig = new sagemaker.EndpointConfig(this, 'EndpointConfig', {
]
});

const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
const endpoint = new Endpoint(this, 'Endpoint', { endpointConfig });
const productionVariant = endpoint.findInstanceProductionVariant(variantName);
const instanceCount = productionVariant.autoScaleInstanceCount({
maxCapacity: 3
@@ -244,11 +253,10 @@ To monitor CloudWatch metrics for a production variant, use one or more of the m
methods:

```typescript
import * as sagemaker from '@aws-cdk/aws-sagemaker-alpha';

declare const endpointConfig: sagemaker.EndpointConfig;
declare const endpointConfig: EndpointConfig;

const endpoint = new sagemaker.Endpoint(this, 'Endpoint', { endpointConfig });
const endpoint = new Endpoint(this, 'Endpoint', { endpointConfig });
const productionVariant = endpoint.findInstanceProductionVariant('my-variant');
productionVariant.metricModelLatency().createAlarm(this, 'ModelLatencyAlarm', {
threshold: 100000,
108 changes: 100 additions & 8 deletions packages/@aws-cdk/aws-sagemaker-alpha/lib/model-data.ts
@@ -6,7 +6,58 @@ import { hashcode } from './private/util';

// The only supported extension for local asset model data
// https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-properties-sagemaker-model-containerdefinition.html#cfn-sagemaker-model-containerdefinition-modeldataurl
const ARTIFACT_EXTENSION = '.tar.gz';
const COMPRESSED_ARTIFACT_EXTENSION = '.tar.gz';

/**
* Specifies how the ML model data is prepared.
*/
export enum CompressionType {
/**
* If you choose `CompressionType.GZIP` and choose `S3DataType.S3_OBJECT` as the value of `s3DataType`,
* S3 URI identifies an object that is a gzip-compressed TAR archive.
* SageMaker will attempt to decompress and untar the object during model deployment.
*/
GZIP = 'Gzip',
/**
* If you choose `CompressionType.NONE` and choose `S3DataType.S3_PREFIX` as the value of `s3DataType`,
* S3 URI identifies a key name prefix, under which all objects represent the uncompressed ML model to deploy.
*
* If you choose `CompressionType.NONE`, then SageMaker will follow rules below when creating model data files
* under `/opt/ml/model` directory for use by your inference code:
Comment on lines +22 to +26
Member

The CompressionType and S3DataType enums seem to be used a lot in conjunction with each other, and when their behaviour changes this drastically between combinations, it makes me wonder if there is any way we can combine the two? Right now, for instance, there doesn't seem to be anything documented about what will happen if I combine CompressionType.GZIP and S3DataType.S3_PREFIX, if that's a valid combination.

* - If you choose `S3DataType.S3_OBJECT` as the value of `s3DataType`, then SageMaker will split the key of the S3 object referenced by S3 URI by slash (/),
* and use the last part as the filename of the file holding the content of the S3 object.
* - If you choose `S3DataType.S3_PREFIX` as the value of `s3DataType`, then for each S3 object under the key name prefix referenced by S3 URI,
* SageMaker will trim its key by the prefix, and use the remainder as the path (relative to `/opt/ml/model`) of the file holding the content of the S3 object.
* SageMaker will split the remainder by slash (/), using intermediate parts as directory names and the last part as filename of the file holding the content of the S3 object.
* - Do not use any of the following as file names or directory names:
* - An empty or blank string
* - A string which contains null bytes
* - A string longer than 255 bytes
* - A single dot (.)
* - A double dot (..)
* - Ambiguous file names will result in model deployment failure.
* For example, if your uncompressed ML model consists of two S3 objects `s3://mybucket/model/weights` and `s3://mybucket/model/weights/part1`
* and you specify `s3://mybucket/model/` as the value of S3 URI and `S3DataType.S3_PREFIX` as the value of `s3DataType`,
* then it will result in name clash between `/opt/ml/model/weights` (a regular file) and `/opt/ml/model/weights/` (a directory).
Comment on lines +32 to +41
Member

This docstring feels unnecessarily lengthy, and the list of disallowed file/directory names doesn't seem relevant, at least for the enum. Is there any way we can add a @see tag and link to the documentation instead for all this extra information?

*/
NONE = 'None',
}
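
The path-mapping rules in the `CompressionType.NONE` docstring can be pictured with a small standalone sketch. Everything in it is hypothetical and only illustrates the documented rules: `fileNameForObject`, `isInvalidName`, and `mapKeysToModelPaths` are names made up for this example, and it works on bare object keys rather than S3 URIs. It is not part of the CDK and says nothing about how SageMaker implements these rules internally.

```typescript
// Hypothetical helpers illustrating the docstring rules; not part of any AWS API.

// S3_OBJECT case: the last slash-separated part of the key becomes the file name.
function fileNameForObject(key: string): string {
  const parts = key.split('/');
  return parts[parts.length - 1];
}

// Returns true if a path component violates the naming rules in the docstring.
function isInvalidName(name: string): boolean {
  return name.trim() === ''                              // empty or blank
    || name === '.'                                      // single dot
    || name === '..'                                     // double dot
    || name.includes('\0')                               // null byte
    || new TextEncoder().encode(name).length > 255;      // longer than 255 bytes
}

// S3_PREFIX case: trim the prefix from each key and use the remainder as the
// path relative to /opt/ml/model. Throws on invalid names and on
// file-versus-directory clashes.
function mapKeysToModelPaths(prefix: string, keys: string[]): Map<string, string> {
  const files = new Map<string, string>(); // relative path -> original key
  const directories = new Set<string>();

  for (const key of keys) {
    if (!key.startsWith(prefix)) {
      throw new Error(`Key '${key}' is not under prefix '${prefix}'`);
    }
    const remainder = key.slice(prefix.length);
    const parts = remainder.split('/');
    if (parts.some(isInvalidName)) {
      throw new Error(`Key '${key}' contains an invalid file or directory name`);
    }
    // Every part except the last one is a directory name.
    for (let i = 1; i < parts.length; i++) {
      directories.add(parts.slice(0, i).join('/'));
    }
    files.set(remainder, key);
  }

  for (const path of files.keys()) {
    if (directories.has(path)) {
      throw new Error(`Name clash: '${path}' is both a file and a directory`);
    }
  }
  return files;
}
```

With the docstring's own example, `mapKeysToModelPaths('model/', ['model/weights', 'model/weights/part1'])` throws, because `weights` would have to be both a regular file and a directory.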

/**
* Specifies the type of ML model data to deploy.
*/
export enum S3DataType {
/**
* If you choose `S3DataType.S3_OBJECT`, S3 URI identifies an object that is the ML model data to deploy.
*/
S3_OBJECT = 'S3Object',
/**
* If you choose `S3DataType.S3_PREFIX`, S3 URI identifies a key name prefix.
* SageMaker uses all objects that match the specified key name prefix as part of the ML model data to deploy.
* A valid key name prefix identified by S3 URI always ends with a forward slash (/).
*/
S3_PREFIX = 'S3Prefix',
}

/**
* The configuration needed to reference model artifacts.
@@ -17,6 +68,16 @@ export interface ModelDataConfig {
* must point to a single gzip compressed tar archive (.tar.gz suffix).
*/
readonly uri: string;
/**
* Specifies how the ML model data is prepared.
* @default CompressionType.GZIP
*/
readonly compressionType?: CompressionType;
/**
* Specifies the type of ML model data to deploy.
* @default S3DataType.S3_OBJECT
*/
readonly s3DataType?: S3DataType;
}

/**
@@ -28,9 +89,10 @@ export abstract class ModelData {
* Constructs model data which is already available within S3.
* @param bucket The S3 bucket within which the model artifacts are stored
* @param objectKey The S3 object key at which the model artifacts are stored
* @param options The options for identifying model artifacts
*/
public static fromBucket(bucket: s3.IBucket, objectKey: string): ModelData {
return new S3ModelData(bucket, objectKey);
public static fromBucket(bucket: s3.IBucket, objectKey: string, options?: S3ModelDataOptions): ModelData {
return new S3ModelData(bucket, objectKey, options);
}

/**
@@ -51,8 +113,33 @@ export abstract class ModelData {
public abstract bind(scope: Construct, model: IModel): ModelDataConfig;
}

/**
* The options for identifying model artifacts.
* When you choose `CompressionType.GZIP` and `S3DataType.S3_OBJECT` then use `ModelDataUrl` property.
* Otherwise, use `ModelDataSource` property.
Comment on lines +118 to +119
Member

Reading this, I interpret it as telling the user to use the ModelDataUrl or ModelDataSource properties, but looking through the rest of the changes, it seems like this is something handled internally. Is there a reason the user needs to know about what's going on under the hood?

*
* Currently you cannot use ModelDataSource in conjunction with:
* - SageMaker batch transform
* - SageMaker serverless endpoints
* - SageMaker multi-model endpoints
* - SageMaker Marketplace
*/
export interface S3ModelDataOptions {
/**
* Specifies how the ML model data is prepared.
* @default CompressionType.GZIP
*/
readonly compressionType: CompressionType;
/**
* Specifies the type of ML model data to deploy.
* @default S3DataType.S3_OBJECT
*/
readonly s3DataType: S3DataType;
}

class S3ModelData extends ModelData {
constructor(private readonly bucket: s3.IBucket, private readonly objectKey: string) {
constructor(private readonly bucket: s3.IBucket,
private readonly objectKey: string, private readonly options?: S3ModelDataOptions) {
super();
}

@@ -61,6 +148,8 @@ class S3ModelData extends ModelData {

return {
uri: this.bucket.urlForObject(this.objectKey),
compressionType: this.options?.compressionType,
s3DataType: this.options?.s3DataType,
};
}
}
@@ -70,9 +159,6 @@ class AssetModelData extends ModelData {

constructor(private readonly path: string, private readonly options: assets.AssetOptions) {
super();
if (!path.toLowerCase().endsWith(ARTIFACT_EXTENSION)) {
throw new Error(`Asset must be a gzipped tar file with extension ${ARTIFACT_EXTENSION} (${this.path})`);
}
Comment on lines -73 to -75
Contributor Author

The asset validation was moved so that it runs after the path is bound as an asset.

}

public bind(scope: Construct, model: IModel): ModelDataConfig {
@@ -83,11 +169,17 @@ class AssetModelData extends ModelData {
...this.options,
});
}

if (!this.asset.isFile) {
throw new Error(`Asset must be a file. To use a directory, call 'ModelData.fromBucket()' with the 's3DataType' option set to 'S3DataType.S3_PREFIX' and the 'compressionType' option set to 'CompressionType.NONE' (${this.path})`);
}
Comment on lines +172 to +174
Contributor Author

The bundled assets are now allowed if they are not directories.

this.asset.grantRead(model);

return {
uri: this.asset.httpUrl,
compressionType: this.asset.assetPath.toLowerCase().endsWith(COMPRESSED_ARTIFACT_EXTENSION)
? CompressionType.GZIP
: CompressionType.NONE,
s3DataType: S3DataType.S3_OBJECT,
Comment on lines +179 to +182
Contributor Author

Uncompressed single files are also supported

};
}
}
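
The decision made in `AssetModelData.bind()` above can be isolated in a few lines. This is a minimal sketch, assuming local copies of the two enums and a hypothetical free function `inferAssetModelData` that takes the asset path and an `isFile` flag directly instead of reading them from a CDK `Asset`. The shortened error text is also an assumption.

```typescript
// Local copies of the enums from the diff, so the sketch is self-contained.
enum CompressionType {
  GZIP = 'Gzip',
  NONE = 'None',
}

enum S3DataType {
  S3_OBJECT = 'S3Object',
  S3_PREFIX = 'S3Prefix',
}

const COMPRESSED_ARTIFACT_EXTENSION = '.tar.gz';

// Hypothetical standalone version of the logic in AssetModelData.bind():
// directories are rejected, and the compression type is inferred from the
// file extension. A local asset is always treated as a single S3 object.
function inferAssetModelData(
  assetPath: string,
  isFile: boolean,
): { compressionType: CompressionType; s3DataType: S3DataType } {
  if (!isFile) {
    throw new Error(`Asset must be a file (${assetPath})`);
  }
  return {
    compressionType: assetPath.toLowerCase().endsWith(COMPRESSED_ARTIFACT_EXTENSION)
      ? CompressionType.GZIP
      : CompressionType.NONE,
    s3DataType: S3DataType.S3_OBJECT,
  };
}
```

Because the extension check is case-insensitive, `model.TAR.GZ` is treated as gzip-compressed, while any other single file is passed through as uncompressed.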
19 changes: 15 additions & 4 deletions packages/@aws-cdk/aws-sagemaker-alpha/lib/model.ts
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
import * as ec2 from 'aws-cdk-lib/aws-ec2';
import * as iam from 'aws-cdk-lib/aws-iam';
import { CfnModel } from 'aws-cdk-lib/aws-sagemaker';
import * as cdk from 'aws-cdk-lib/core';
import { Construct } from 'constructs';
import { ContainerImage } from './container-image';
import { ModelData } from './model-data';
import { CfnModel } from 'aws-cdk-lib/aws-sagemaker';
import { CompressionType, ModelData, S3DataType } from './model-data';

/**
* Interface that defines a Model resource.
@@ -357,11 +357,22 @@ export class Model extends ModelBase {
}

private renderContainer(container: ContainerDefinition): CfnModel.ContainerDefinitionProperty {
const image = container.image.bind(this, this);
const modelDataConfig = container.modelData?.bind(this, this);
const useModelDataSource = modelDataConfig?.compressionType === CompressionType.NONE
|| modelDataConfig?.s3DataType === S3DataType.S3_PREFIX;
return {
image: container.image.bind(this, this).imageName,
image: image.imageName,
containerHostname: container.containerHostname,
environment: container.environment,
modelDataUrl: container.modelData ? container.modelData.bind(this, this).uri : undefined,
modelDataSource: useModelDataSource ? {
s3DataSource: {
s3Uri: modelDataConfig.uri,
s3DataType: modelDataConfig.s3DataType!,
compressionType: modelDataConfig.compressionType!,
},
} : undefined,
modelDataUrl: !useModelDataSource ? modelDataConfig?.uri : undefined,
Comment on lines +380 to +387
Member

Would it be possible for both of these values to be set at the same time? And if so, what sort of behaviour would we expect?

};
}
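
To make the routing in `renderContainer` easier to reason about, here is a standalone sketch of the same branch. The enums and the `ModelDataConfig` shape are copied locally; `renderModelData` and `RenderedModelData` are hypothetical names. One deliberate difference is labeled in the code: where the diff uses non-null assertions, this sketch falls back to the documented defaults, which is an assumption and not what the PR does.

```typescript
// Local copies of the types from the diff, so the sketch is self-contained.
enum CompressionType {
  GZIP = 'Gzip',
  NONE = 'None',
}

enum S3DataType {
  S3_OBJECT = 'S3Object',
  S3_PREFIX = 'S3Prefix',
}

interface ModelDataConfig {
  uri: string;
  compressionType?: CompressionType;
  s3DataType?: S3DataType;
}

// Hypothetical result shape mirroring the two CfnModel container properties.
interface RenderedModelData {
  modelDataUrl?: string;
  modelDataSource?: {
    s3DataSource: { s3Uri: string; s3DataType: S3DataType; compressionType: CompressionType };
  };
}

// Mirrors the branch in renderContainer: only the Gzip + S3Object combination
// (or unset options) uses modelDataUrl; every other combination uses modelDataSource.
function renderModelData(config?: ModelDataConfig): RenderedModelData {
  if (!config) {
    return {};
  }
  const useModelDataSource = config.compressionType === CompressionType.NONE
    || config.s3DataType === S3DataType.S3_PREFIX;
  if (!useModelDataSource) {
    return { modelDataUrl: config.uri };
  }
  return {
    modelDataSource: {
      s3DataSource: {
        s3Uri: config.uri,
        // Assumption: fall back to the documented defaults instead of the
        // non-null assertions used in the diff.
        s3DataType: config.s3DataType ?? S3DataType.S3_OBJECT,
        compressionType: config.compressionType ?? CompressionType.GZIP,
      },
    },
  };
}
```

In this sketch the two properties are mutually exclusive by construction, since both are derived from the same boolean. Note that `CompressionType.GZIP` with `S3DataType.S3_PREFIX` is routed to `modelDataSource`; whether SageMaker accepts that combination is not stated in the diff.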

@@ -1,6 +1,7 @@
// Fixture with packages imported, but nothing else
import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';
import { CompressionType, ContainerImage, Endpoint, EndpointConfig, Model, ModelData, S3DataType } from '@aws-cdk/aws-sagemaker-alpha';

class Fixture extends cdk.Stack {
constructor(scope: Construct, id: string) {