feat: update mlflow version to support LLM tracing & update manifests #239

Merged: 7 commits, Sep 12, 2024
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -11,6 +11,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### **Changed**

+- updated mlflow version to 2.16.0 to support LLM tracing
+- remove CDK overhead from `mlflow-image` module
+- renamed mlflow manifests and updated README.MD
+
 ## v1.5.0

 ### **Added**
17 changes: 8 additions & 9 deletions README.md
@@ -23,15 +23,14 @@ See deployment steps in the [Deployment Guide](DEPLOYMENT.md).

 End-to-end example use-cases built using modules in this repository.

-| Type | Description |
-|-----------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| [MLOps with Amazon SageMaker](manifests/mlops-sagemaker/) | Set up environment for MLOps with Amazon SageMaker. Deploy secure Amazon SageMaker Studio Domain, and provisions SageMaker Project Templates using Service Catalog, including model training and deployment. |
-| [Mlflow experiments tracking with Amazon SageMaker](manifests/mlflow-experiments-tracking/) | An example using Mlflow experiments tracking with Amazon SageMaker. Deploy self-hosted Mlflow instance on AWS Fargate, and Amazon SageMaker Studio Domain environment. |
-| [Managed Workflows with Apache Airflow (MWAA) for Machine Learning Training](manifests/mwaa-ml-training/) | An example orchestrating ML training jobs with Managed Workflows for Apache Airflow (MWAA). Deploys MWAA and an example ML training DAG. |
-| [Q&A on PDF documents with RAG](manifests/fmops-qna-rag/) | Deploy AppSync GraphQL endpoint for Q&A chatbot with RAG based on OpenSearch, and data ingestion infrastructure. |
-| [Ray on Amazon Elastic Kubernetes Service (EKS)](manifests/ray-on-eks/) | Run Ray on AWS EKS. Deploys an AWS EKS cluster, KubeRay Ray Operator, and a Ray Cluster with autoscaling enabled. |
-| [Bedrock Fine-Tuning with Step Functions](manifests/bedrock-finetuning-sfn/) | Continuously Fine-tune a Foundation Model with Bedrock Fine-Tuning jobs and AWS Step Functions. |
-| [MLOps with Step Functions](manifests/mlops-stepfunctions/) | Automate machine learning lifecycle using Amazon SageMaker and AWS Step Functions. |
+| Type | Description |
+|-----------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| [MLOps with Amazon SageMaker](manifests/mlops-sagemaker/) | Set up environment for MLOps with Amazon SageMaker. Deploy secure Amazon SageMaker Studio Domain, and provisions SageMaker Project Templates using Service Catalog, including model training and deployment. |
+| [Mlflow tracking server and model registry with Amazon SageMaker](manifests/mlflow-tracking/) | An example using Mlflow experiments tracking, model registry, and LLM tracing with Amazon SageMaker. Deploy self-hosted Mlflow tracking server and model registry on AWS Fargate, and Amazon SageMaker Studio Domain environment. |
+| [Managed Workflows with Apache Airflow (MWAA) for Machine Learning Training](manifests/mwaa-ml-training/) | An example orchestrating ML training jobs with Managed Workflows for Apache Airflow (MWAA). Deploys MWAA and an example ML training DAG. |
+| [Ray on Amazon Elastic Kubernetes Service (EKS)](manifests/ray-on-eks/) | Run Ray on AWS EKS. Deploys an AWS EKS cluster, KubeRay Ray Operator, and a Ray Cluster with autoscaling enabled. |
+| [Bedrock Fine-Tuning with Step Functions](manifests/bedrock-finetuning-sfn/) | Continuously Fine-tune a Foundation Model with Bedrock Fine-Tuning jobs and AWS Step Functions. |
+| [MLOps with Step Functions](manifests/mlops-stepfunctions/) | Automate machine learning lifecycle using Amazon SageMaker and AWS Step Functions. |


 ## Modules
23 changes: 0 additions & 23 deletions manifests/mlflow-experiments-tracking/deployment.yaml

This file was deleted.

23 changes: 23 additions & 0 deletions manifests/mlflow-tracking/deployment.yaml
@@ -0,0 +1,23 @@
+name: mlflow-tracking
+toolchainRegion: us-east-1
+forceDependencyRedeploy: true
+groups:
+  - name: networking
+    path: manifests/mlflow-tracking/networking-modules.yaml
+  - name: storage
+    path: manifests/mlflow-tracking/storage-modules.yaml
+  - name: sagemaker-studio
+    path: manifests/mlflow-tracking/sagemaker-studio-modules.yaml
+  - name: images
+    path: manifests/mlflow-tracking/images-modules.yaml
+  - name: mlflow
+    path: manifests/mlflow-tracking/mlflow-modules.yaml
+targetAccountMappings:
+  - alias: primary
+    accountId:
+      valueFrom:
+        envVariable: PRIMARY_ACCOUNT
+    default: true
+    regionMappings:
+      - region: us-east-1
+        default: true
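
Reviewer note: the `accountId` in this manifest is not hard-coded; it is read from the `PRIMARY_ACCOUNT` environment variable at deploy time. The following is a minimal Python sketch of that kind of lookup, assuming the mapping has been loaded into a plain dict. `resolve_value` is a hypothetical helper written for illustration, not Seed-Farmer's actual implementation, and the account number is made up.

```python
import os
from typing import Any, Mapping


def resolve_value(field: Any, env: Mapping[str, str] = os.environ) -> Any:
    """Return a literal as-is, or look it up for a valueFrom.envVariable mapping."""
    if isinstance(field, dict) and "valueFrom" in field:
        var_name = field["valueFrom"]["envVariable"]
        if var_name not in env:
            # Fail loudly rather than deploying into an unknown account.
            raise KeyError(f"environment variable {var_name} is not set")
        return env[var_name]
    return field


# Mirrors the accountId entry of the manifest; the value is a placeholder.
account_mapping = {
    "alias": "primary",
    "accountId": {"valueFrom": {"envVariable": "PRIMARY_ACCOUNT"}},
}
resolved = resolve_value(
    account_mapping["accountId"], env={"PRIMARY_ACCOUNT": "123456789012"}
)
print(resolved)
```

In practice this means `PRIMARY_ACCOUNT` has to be exported in the shell before the manifest is deployed.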
@@ -1,5 +1,5 @@
 name: mlflow-image
-path: git::https://github.com/awslabs/aiops-modules.git//modules/mlflow/mlflow-image?ref=release/1.5.0&depth=1
+path: modules/mlflow/mlflow-image
 targetAccount: primary
 parameters:
   - name: ecr-repository-name
2 changes: 1 addition & 1 deletion modules/mlflow/mlflow-image/README.md
@@ -2,7 +2,7 @@

 ## Description

-This module creates an mlflow container image and pushes to the specified Elastic Container Repository.
+This module creates an mlflow tracking server container image and pushes to the specified Elastic Container Repository.

 ## Inputs/Outputs

45 changes: 0 additions & 45 deletions modules/mlflow/mlflow-image/app.py

This file was deleted.

33 changes: 12 additions & 21 deletions modules/mlflow/mlflow-image/deployspec.yaml
@@ -1,30 +1,21 @@
+build_type: BUILD_GENERAL1_SMALL
 publishGenericEnvVariables: true
-
 deploy:
   phases:
-    install:
-      commands:
-        - npm install -g [email protected]
-        - pip install -r requirements.txt
     build:
       commands:
-        - cdk deploy --require-approval never --progress events --app "python app.py" --outputs-file ./cdk-exports.json
-        # Export metadata
-        - seedfarmer metadata convert -f cdk-exports.json || true
-    post_build:
-      commands:
-        - echo "Build successful"
-
+        - aws ecr describe-repositories --repository-names ${SEEDFARMER_PARAMETER_ECR_REPOSITORY_NAME} || aws ecr create-repository --repository-name ${SEEDFARMER_PARAMETER_ECR_REPOSITORY_NAME} --image-scanning-configuration scanOnPush=true
+        - export COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
+        - export IMAGE_TAG=${COMMIT_HASH:=latest}
+        - export REPOSITORY_URI=$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/${SEEDFARMER_PARAMETER_ECR_REPOSITORY_NAME}
+        - aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
+        - echo Building the Docker image...
+        - cd src/ && docker build -t $REPOSITORY_URI:latest .
+        - docker tag $REPOSITORY_URI:latest $REPOSITORY_URI:$IMAGE_TAG
+        - docker push $REPOSITORY_URI:latest && docker push $REPOSITORY_URI:$IMAGE_TAG
+        - seedfarmer metadata add -k ImageUri -v $REPOSITORY_URI:latest
 destroy:
   phases:
-    install:
-      commands:
-        - npm install -g [email protected]
-        - pip install -r requirements.txt
     build:
       commands:
-        - cdk destroy --force --app "python app.py"
-    post_build:
-      commands:
-        - echo "Destroy successful"
+        - aws ecr delete-repository --repository-name ${SEEDFARMER_PARAMETER_ECR_REPOSITORY_NAME} --force
-# build_type: BUILD_GENERAL1_LARGE
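
Reviewer note: the tagging logic in the new deploy commands is compact, so here is a rough Python equivalent of what `cut -c 1-7` plus `${COMMIT_HASH:=latest}` compute. The function names, account ID, and repository name are illustrative assumptions; only the string shapes are taken from the deployspec.

```python
from typing import Optional


def image_tag(resolved_source_version: Optional[str]) -> str:
    """First 7 characters of the commit hash, or "latest" when it is unset or empty."""
    commit_hash = (resolved_source_version or "")[:7]
    # ${COMMIT_HASH:=latest} substitutes the default for both unset and empty values.
    return commit_hash or "latest"


def repository_uri(account_id: str, region: str, repository_name: str) -> str:
    """Build the ECR repository URI in the same shape as REPOSITORY_URI."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository_name}"


# Placeholder values, not taken from any real deployment.
uri = repository_uri("123456789012", "us-east-1", "mlflow")
tags = [f"{uri}:latest", f"{uri}:{image_tag('0123456789abcdef')}"]
print(tags)
```

When `CODEBUILD_RESOLVED_SOURCE_VERSION` is empty, both tags resolve to `latest`, so the second `docker push` just repeats the first. The exported `ImageUri` metadata always points at the `latest` tag, not at the commit-specific one.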
37 changes: 0 additions & 37 deletions modules/mlflow/mlflow-image/integ/integ_image.py

This file was deleted.

42 changes: 42 additions & 0 deletions modules/mlflow/mlflow-image/modulestack.yaml
@@ -0,0 +1,42 @@
+AWSTemplateFormatVersion: 2010-09-09
+Description: This stack deploys a Module specific IAM permissions
+
+Parameters:
+  # DeploymentName:
+  #   Type: String
+  #   Description: The name of the deployment
+  # ModuleName:
+  #   Type: String
+  #   Description: The name of the Module
+  RoleName:
+    Type: String
+    Description: The name of the IAM Role
+  ECRRepositoryName:
+    Type: String
+    Description: The name of the ECR repository
+
+Resources:
+  Policy:
+    Type: "AWS::IAM::Policy"
+    Properties:
+      PolicyDocument:
+        Statement:
+          - Effect: Allow
+            Action:
+              - "ecr:Describe*"
+              - "ecr:Get*"
+              - "ecr:List*"
+            Resource: "*"
+          - Action:
+              - "ecr:Create*"
+              - "ecr:Delete*"
+              - "ecr:*LayerUpload"
+              - "ecr:UploadLayerPart"
+              - "ecr:Batch*"
+              - "ecr:Put*"
+            Effect: Allow
+            Resource:
+              - !Sub "arn:${AWS::Partition}:ecr:${AWS::Region}:${AWS::AccountId}:repository/${ECRRepositoryName}"
+        Version: 2012-10-17
+      PolicyName: "modulespecific-policy"
+      Roles: [!Ref RoleName]
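
Reviewer note: to sanity-check that the wildcard actions in this policy cover what the new deployspec commands call, a simplified matcher can be sketched in Python. This is an approximation under stated assumptions: it only models `*` wildcards with case-insensitive comparison, ignores resources and conditions, and the `NEEDED_ACTIONS` list is my own mapping of the CLI and `docker push` calls to ECR actions, not something stated in the PR.

```python
from fnmatch import fnmatchcase
from typing import Iterable

# Action patterns copied from the two statements in modulestack.yaml.
READ_ACTIONS = ["ecr:Describe*", "ecr:Get*", "ecr:List*"]  # Resource: "*"
REPO_ACTIONS = [  # scoped to the named repository
    "ecr:Create*",
    "ecr:Delete*",
    "ecr:*LayerUpload",
    "ecr:UploadLayerPart",
    "ecr:Batch*",
    "ecr:Put*",
]

# Assumed mapping of the deployspec commands to ECR actions.
NEEDED_ACTIONS = [
    "ecr:DescribeRepositories",         # aws ecr describe-repositories
    "ecr:CreateRepository",             # aws ecr create-repository
    "ecr:GetAuthorizationToken",        # aws ecr get-login-password
    "ecr:BatchCheckLayerAvailability",  # docker push
    "ecr:InitiateLayerUpload",          # docker push
    "ecr:UploadLayerPart",              # docker push
    "ecr:CompleteLayerUpload",          # docker push
    "ecr:PutImage",                     # docker push
    "ecr:DeleteRepository",             # aws ecr delete-repository
]


def is_allowed(action: str, patterns: Iterable[str]) -> bool:
    """True if the action matches any pattern, compared case-insensitively."""
    return any(fnmatchcase(action.lower(), p.lower()) for p in patterns)


uncovered = [
    a for a in NEEDED_ACTIONS if not is_allowed(a, READ_ACTIONS + REPO_ACTIONS)
]
print(uncovered)
```

Under these assumptions every needed action is matched. `ecr:GetAuthorizationToken` falls under the statement with `Resource: "*"`, which fits because that action does not support resource-level permissions.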
45 changes: 0 additions & 45 deletions modules/mlflow/mlflow-image/pyproject.toml

This file was deleted.

7 changes: 0 additions & 7 deletions modules/mlflow/mlflow-image/requirements.in

This file was deleted.
