Add SageMaker Notebook Instance module #12

Merged
merged 18 commits into from
Feb 29, 2024
1 change: 1 addition & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -14,6 +14,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- added `mlflow-image` and `mlflow-fargate` modules
- added `sagemaker-studio` module
- added `sagemaker-endpoint` module
- added `sagemaker-notebook` module

### **Changed**

36 changes: 36 additions & 0 deletions examples/manifests/sagemaker-notebook-modules.yaml
@@ -0,0 +1,36 @@

name: notebook
path: modules/sagemaker/sagemaker-notebook
parameters:
  - name: notebook_name
    value: dummy
  - name: instance_type
    value: ml.t2.xlarge
  - name: direct_internet_access
    value: Enabled
  - name: root_access
    value: Disabled
  - name: volume_size_in_gb
    value: 20
  - name: imds_version
    value: 2
  - name: subnet_ids
    valueFrom:
      moduleMetadata:
        group: networking
        name: networking
        key: PrivateSubnetIds
  - name: vpc_id
    valueFrom:
      moduleMetadata:
        group: networking
        name: networking
        key: VpcId
  - name: kms_key_arn
    value: arn:aws:kms:us-east-1:xxxxxx:key/xxxxx-xxxx-xxxx-xxxxx
  - name: code_repository
    value: https://git-codecommit.us-east-1.amazonaws.com/v1/repos/xxxxxxxxx
  - name: additional_code_repositories
    value: '["https://github.com/aws/amazon-sagemaker-examples.git"]'
  - name: tags
    value: '{"test": "True"}'
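The settings class added later in this PR reads parameters from environment variables prefixed with `SEEDFARMER_PARAMETER_`, which suggests that each manifest parameter reaches the module under an upper-cased, prefixed name. The sketch below illustrates that naming for a few of the parameters above, using only the standard library. The helper `to_env_vars` is hypothetical, and the exact conversion Seed-Farmer performs may differ in detail.

```python
import json
from typing import Any, Dict, List

# Prefix taken from the module's settings (env_prefix="SEEDFARMER_PARAMETER_").
ENV_PREFIX = "SEEDFARMER_PARAMETER_"


def to_env_vars(parameters: List[Dict[str, Any]]) -> Dict[str, str]:
    """Map manifest-style parameters to prefixed environment variable names.

    Hypothetical helper for illustration only; not part of Seed-Farmer.
    """
    env: Dict[str, str] = {}
    for param in parameters:
        value = param["value"]
        # Environment variables are strings; non-string values are JSON-encoded.
        text = value if isinstance(value, str) else json.dumps(value)
        env[ENV_PREFIX + param["name"].upper()] = text
    return env


manifest_parameters = [
    {"name": "notebook_name", "value": "dummy"},
    {"name": "volume_size_in_gb", "value": 20},
    {"name": "tags", "value": '{"test": "True"}'},
]

env_vars = to_env_vars(manifest_parameters)
for key, value in env_vars.items():
    print(f"{key}={value}")
```

List and dictionary parameters such as `tags` are already written as JSON strings in the manifest, so they pass through unchanged.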
80 changes: 80 additions & 0 deletions modules/sagemaker/sagemaker-notebook/README.md
@@ -0,0 +1,80 @@
# SageMaker Notebook Instance

## Description

This module creates a SageMaker Notebook instance.

### Architecture

![SageMaker Notebook Instance Architecture](docs/_static/architecture.drawio.png "SageMaker Notebook Instance Architecture")

## Inputs/Outputs

### Input Parameters

#### Required

- `notebook_name`: The name of the new notebook instance
- `instance_type`: The type of ML compute instance to launch for the notebook instance

#### Optional

- `direct_internet_access`: Whether SageMaker provides internet access to the notebook instance, by default `Enabled`
- `root_access`: Whether root access is enabled or disabled for users of the notebook instance, by default `Disabled`
- `volume_size_in_gb`: The size, in GB, of the ML storage volume to attach to the notebook instance, by default None
- `imds_version`: The Instance Metadata Service (IMDS) version, by default `2`
- `subnet_ids`: A list of subnet IDs in the VPC to connect the notebook instance to, by default None. Only the first subnet ID is used.
- `vpc_id`: The ID of the VPC to connect the notebook instance to, by default None
- `kms_key_arn`: The ARN of an AWS KMS key that SageMaker uses to encrypt data on the attached storage volume, by default None
- `code_repository`: The Git repository associated with the notebook instance as its default code repository, by default None
- `additional_code_repositories`: An array of up to three Git repositories associated with the notebook instance, by default None
- `role_arn`: An IAM Role ARN that SageMaker assumes to perform tasks on your behalf, by default None
- `tags`: Extra tags to apply to the SageMaker notebook instance, by default None
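As a rough illustration of how optional parameters might be resolved, the sketch below reads a few of them from an environment-style mapping and falls back to the defaults declared in this PR's settings class (`Enabled`, `Disabled`, `2`, or `None`). It is a standard-library stand-in for the pydantic-settings model, not the module's implementation; `load_optional_parameters` and the subnet IDs are hypothetical. List and dictionary values are assumed to arrive as JSON strings, matching the way the example manifest writes them.

```python
import json
from typing import Any, Dict, Mapping

PREFIX = "SEEDFARMER_PARAMETER_"

# Defaults as declared in the module's settings class (subset of fields).
DEFAULTS: Dict[str, Any] = {
    "direct_internet_access": "Enabled",
    "root_access": "Disabled",
    "imds_version": "2",
    "subnet_ids": None,
    "tags": None,
}
# List/dict fields are assumed to be passed as JSON strings.
JSON_FIELDS = {"subnet_ids", "tags"}


def load_optional_parameters(environ: Mapping[str, str]) -> Dict[str, Any]:
    """Resolve optional parameters from an environment mapping (hypothetical helper)."""
    resolved = dict(DEFAULTS)
    for field in DEFAULTS:
        raw = environ.get(PREFIX + field.upper())
        if raw is None:
            continue  # parameter not supplied: keep the default
        resolved[field] = json.loads(raw) if field in JSON_FIELDS else raw
    return resolved


example_env = {
    "SEEDFARMER_PARAMETER_ROOT_ACCESS": "Enabled",
    "SEEDFARMER_PARAMETER_SUBNET_IDS": '["subnet-aaa", "subnet-bbb"]',
}
resolved = load_optional_parameters(example_env)
print(resolved["root_access"], resolved["subnet_ids"][0])
```

Only the first entry of `subnet_ids` is read at the end, in line with the note that only the first subnet ID is used.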

### Sample manifest declaration

```yaml
name: notebook
path: modules/sagemaker/sagemaker-notebook
targetAccount: primary
parameters:
  - name: notebook_name
    value: dummy123
  - name: instance_type
    value: ml.t2.xlarge
```

### Module Metadata Outputs

- `SageMakerNotebookArn`: the SageMaker Notebook instance ARN.

#### Output Example

```json
{
  "SageMakerNotebookArn": "arn:aws:sagemaker:xxxxxxx:123412341234:notebook-instance/xxxxx"
}
```
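
A consumer of this metadata may want the notebook name or region rather than the full ARN. The following sketch splits the ARN using the general `arn:partition:service:region:account-id:resource` layout; `parse_notebook_arn` is a hypothetical helper, and the region and name in the sample are placeholders rather than values from a real deployment.

```python
import json
from typing import Dict


def parse_notebook_arn(arn: str) -> Dict[str, str]:
    """Split a notebook instance ARN into its parts (hypothetical helper)."""
    # ARN layout: arn:partition:service:region:account-id:resource-type/resource-name
    fields = arn.split(":", 5)
    if len(fields) != 6 or fields[0] != "arn" or fields[2] != "sagemaker":
        raise ValueError(f"not a SageMaker ARN: {arn!r}")
    _, partition, _, region, account, resource = fields
    resource_type, _, name = resource.partition("/")
    if resource_type != "notebook-instance" or not name:
        raise ValueError(f"not a notebook instance ARN: {arn!r}")
    return {"partition": partition, "region": region, "account": account, "name": name}


# Placeholder values shaped like the output example above.
metadata = json.loads(
    '{"SageMakerNotebookArn": '
    '"arn:aws:sagemaker:us-east-1:123412341234:notebook-instance/dummy-1700000000"}'
)
parts = parse_notebook_arn(metadata["SageMakerNotebookArn"])
print(parts["name"])
```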

## Install Instructions

To start using this module, do the following steps:

1. Create and activate a Python virtual environment (venv).

```bash
python3 -m venv .venv # create venv
source .venv/bin/activate # activate venv
```

2. Update pip to the latest version, which supports installation from a `pyproject.toml` configuration.

```bash
pip install -U pip
```

3. Install the project.

```bash
pip install -e .
```
25 changes: 25 additions & 0 deletions modules/sagemaker/sagemaker-notebook/app.py
@@ -0,0 +1,25 @@
#!/usr/bin/env python3
"""Create a SageMaker Notebook stack."""
import aws_cdk as cdk

from sagemaker_notebook.settings import ApplicationSettings
from sagemaker_notebook.stack import SagemakerNotebookStack

# Load application settings from env vars.
app_settings = ApplicationSettings()

env = cdk.Environment(
    account=app_settings.default.account,
    region=app_settings.default.region,
)

app = cdk.App()

stack = SagemakerNotebookStack(
    scope=app,
    construct_id=app_settings.settings.app_prefix,
    env=env,
    **app_settings.parameters.model_dump(),
)

app.synth()
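
`app.py` forwards every parsed parameter to the stack as keyword arguments through `**app_settings.parameters.model_dump()`. The pattern can be sketched with standard-library dataclasses standing in for the pydantic model. `Parameters` and `build_stack` are hypothetical names, `dataclasses.asdict` plays the role of `model_dump()`, and only a subset of the fields is shown.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class Parameters:
    """Stand-in for the module's parameter settings (subset of fields)."""

    notebook_name: str
    instance_type: str
    root_access: str = "Disabled"
    volume_size_in_gb: Optional[int] = None


def build_stack(construct_id: str, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for a stack constructor; records what it receives."""
    return {"construct_id": construct_id, **kwargs}


params = Parameters(notebook_name="dummy", instance_type="ml.t2.xlarge")
# Every field becomes a keyword argument, so the constructor's parameter
# names have to match the settings' field names.
stack_args = build_stack("project-deployment-module", **asdict(params))
print(stack_args)
```

One consequence of this design is that adding a field to the settings also requires a matching keyword argument on the stack, unless the constructor accepts arbitrary keyword arguments.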
3 changes: 3 additions & 0 deletions modules/sagemaker/sagemaker-notebook/coverage.ini
@@ -0,0 +1,3 @@
[run]
omit =
    tests/*
28 changes: 28 additions & 0 deletions modules/sagemaker/sagemaker-notebook/deployspec.yaml
@@ -0,0 +1,28 @@
publishGenericEnvVariables: true
deploy:
  phases:
    install:
      commands:
        - env
        # Install whatever additional build libraries
        - npm install -g [email protected]
        - pip install -U pip
        - pip install .
    build:
      commands:
        # execute the CDK
        - cdk deploy --require-approval never --progress events --app "python app.py" --outputs-file ./cdk-exports.json
        # Export metadata
        - seedfarmer metadata convert -f cdk-exports.json || true
destroy:
  phases:
    install:
      commands:
        # Install whatever additional build libraries
        - npm install -g [email protected]
        - pip install -U pip
        - pip install .
    build:
      commands:
        # execute the CDK
        - cdk destroy --force --app "python app.py"
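
The deploy phase writes the stack outputs to `cdk-exports.json` through `--outputs-file`, and `seedfarmer metadata convert` then turns that file into module metadata. As a rough sketch, assuming the file maps each stack name to a dictionary of that stack's outputs, the outputs could be inspected as shown below. The stack name and output key are placeholders, `read_outputs` is a hypothetical helper, and this does not reproduce what `seedfarmer` does internally.

```python
import json
import tempfile
from pathlib import Path
from typing import Dict


def read_outputs(path: Path, stack_name: str) -> Dict[str, str]:
    """Return the outputs recorded for one stack (hypothetical helper)."""
    exports = json.loads(path.read_text())
    if stack_name not in exports:
        raise KeyError(f"no outputs recorded for stack {stack_name!r}")
    return exports[stack_name]


# Placeholder content; the real keys depend on the outputs the stack defines.
sample = {"project-deployment-module": {"ExampleOutput": "example-value"}}

with tempfile.TemporaryDirectory() as tmp:
    exports_file = Path(tmp) / "cdk-exports.json"
    exports_file.write_text(json.dumps(sample))
    outputs = read_outputs(exports_file, "project-deployment-module")

print(outputs)
```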
@@ -0,0 +1 @@
[draw.io diagram source (`mxfile`, Page-1) for the SageMaker Notebook Instance architecture; compressed diagram data not reproduced]
100 changes: 100 additions & 0 deletions modules/sagemaker/sagemaker-notebook/pyproject.toml
@@ -0,0 +1,100 @@
[build-system]
requires = ["setuptools", "setuptools-scm"]
build-backend = "setuptools.build_meta"

[project]
name = "seedfarmer-module-sagemaker-notebook"
authors = [
{name = "AWS"}
]
description = "SageMaker Notebook Infrastructure as Code (IaC) Seed-Farmer Module."
requires-python = ">=3.9"
license = {file = "LICENSE"}
classifiers = [
"Development Status :: 4 - Beta",
"Intended Audience :: Developers",
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3 :: Only",
"Programming Language :: Python :: 3.9",
"Typing :: Typed",
]
dependencies = [
"aws-cdk-lib~=2.126.0",
"constructs~=10.3.0",
"pydantic~=2.5.3",
"pydantic-settings~=2.0.3",
"configobj~=5.0.8",
"validators~=0.22.0"

]

dynamic = ["version", "readme"]

[tool.setuptools.dynamic]
version = {attr = "sagemaker_notebook.__version__.VERSION"}
readme = {file = ["README.md"]}

[tool.setuptools.packages.find]
where = ["."] # list of folders that contain the packages (["."] by default)
include = ["sagemaker_notebook"] # package names should match these glob patterns (["*"] by default)


[tool.ruff]
line-length = 120
#src = ["src"]

# Enable pycodestyle (`E`), Pyflakes (`F`), pydocstyle (`D`), isort (`I`), pep8-naming (`N`),
# flake8-quotes (`Q`), mccabe (`C90`), flake8-bandit (`S`), and Ruff-specific (`RUF`) rules.
lint.select = ["E", "F", "D", "I", "N", "Q", "C90", "S", "RUF"]
lint.ignore = []


# Allow autofix for all enabled rules (when `--fix` is provided).
lint.fixable = ["ALL"]
lint.unfixable = []

# Exclude a variety of commonly ignored directories.
exclude = [
".bzr",
".direnv",
".eggs",
".git",
".git-rewrite",
".hg",
".mypy_cache",
".nox",
".pants.d",
".pytype",
".ruff_cache",
".svn",
".tox",
".venv",
".env",
".vscode",
".aws*",
"__pypackages__",
"_build",
"buck-out",
"build",
"dist",
"node_modules",
"venv",
"shared",
"tests",
"codeseeder.out"
]

lint.dummy-variable-rgx = "^(_+|(_+[a-zA-Z0-9_]*[a-zA-Z0-9]+?))$" # Allow unused variables when underscore-prefixed.
target-version = "py39" # Assume Python 3.9

[tool.ruff.lint.per-file-ignores]
"__init__.py" = ["F401", "D104"]

[tool.ruff.lint.mccabe]
# Unlike Flake8, default to a complexity level of 10.
max-complexity = 10

[tool.ruff.lint.flake8-quotes]
docstring-quotes = "double"

[tool.ruff.lint.pydocstyle]
convention = "numpy"
Empty file.
@@ -0,0 +1,3 @@
"""Defines package version."""

VERSION = "0.1.0"
@@ -0,0 +1,93 @@
"""Defines the stack settings."""

import time
from abc import ABC
from typing import Dict, List, Optional

from pydantic import Field, computed_field, field_validator
from pydantic_settings import BaseSettings, SettingsConfigDict


class CdkBaseSettings(BaseSettings, ABC):
    """Defines common configuration for settings."""

    model_config = SettingsConfigDict(
        case_sensitive=False,
        env_nested_delimiter="__",
        protected_namespaces=(),
        extra="ignore",
        populate_by_name=True,
    )


class SeedFarmerParameters(CdkBaseSettings):
    # Review comment (Contributor): I love this!
    """Seedfarmer Parameters.

    These parameters are required for the module stack.
    """

    model_config = SettingsConfigDict(env_prefix="SEEDFARMER_PARAMETER_")

    notebook_name: str
    instance_type: str

    direct_internet_access: str = Field(default="Enabled")
    root_access: str = Field(default="Disabled")
    volume_size_in_gb: Optional[int] = Field(default=None)
    imds_version: str = Field(default="2")
    subnet_ids: Optional[List[str]] = Field(default=None)
    code_repository: Optional[str] = Field(default=None)
    additional_code_repositories: Optional[List[str]] = Field(default=None)
    vpc_id: Optional[str] = Field(default=None)
    kms_key_arn: Optional[str] = Field(default=None)
    role_arn: Optional[str] = Field(default=None)
    tags: Optional[Dict[str, str]] = Field(default=None)

    @field_validator("notebook_name")
    @classmethod
    def validate_name_length(cls, v: str) -> str:
        """Validate if notebook_name length is valid."""
        if len(v) <= 50:
            # Append the current Unix timestamp to the name.
            return f"{v}-{int(time.time())}"

        raise ValueError(f"'name' length must be <= 50, got '{len(v)}'")


class SeedFarmerSettings(CdkBaseSettings):
    """Seedfarmer Settings.

    These parameters come from seedfarmer by default.
    """

    model_config = SettingsConfigDict(env_prefix="SEEDFARMER_")

    project_name: str = Field(default="")
    deployment_name: str = Field(default="")
    module_name: str = Field(default="")

    @computed_field
    @property
    def app_prefix(self) -> str:
        """Application prefix."""
        prefix = "-".join([self.project_name, self.deployment_name, self.module_name])
        return prefix


class CdkDefaultSettings(CdkBaseSettings):
    """CDK Default Settings.

    These parameters come from AWS CDK by default.
    """

    model_config = SettingsConfigDict(env_prefix="CDK_DEFAULT_")

    account: str
    region: str


class ApplicationSettings(CdkBaseSettings):
    """Application settings."""

    settings: SeedFarmerSettings = Field(default_factory=SeedFarmerSettings)
    parameters: SeedFarmerParameters = Field(default_factory=SeedFarmerParameters)
    default: CdkDefaultSettings = Field(default_factory=CdkDefaultSettings)
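
The `notebook_name` validator above both checks the length and rewrites the value by appending a Unix timestamp. A minimal standard-library sketch of that behaviour follows; `suffix_notebook_name` and its `clock` argument are hypothetical additions so the result can be made reproducible, and the fixed timestamp is an arbitrary example value.

```python
import time
from typing import Callable

MAX_NAME_LENGTH = 50  # limit checked by the validator in the settings


def suffix_notebook_name(name: str, clock: Callable[[], float] = time.time) -> str:
    """Append a Unix timestamp to the name, mirroring the validator's logic.

    The `clock` argument is an illustrative addition, not part of the module.
    """
    if len(name) > MAX_NAME_LENGTH:
        raise ValueError(f"'name' length must be <= {MAX_NAME_LENGTH}, got '{len(name)}'")
    return f"{name}-{int(clock())}"


def fixed_clock() -> float:
    """Return a fixed instant so the example is deterministic."""
    return 1709164800.0


print(suffix_notebook_name("dummy", clock=fixed_clock))  # dummy-1709164800
```

Because the suffix comes from the current time, each load of the settings produces a different final name for the same `notebook_name` input.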