Commit

Merge remote-tracking branch 'origin' into feat/fine-tuning-6b
JunjieTang-D1 committed Sep 22, 2024
2 parents 2742a68 + e320a4e commit 62aa417
Showing 23 changed files with 93 additions and 518 deletions.
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -12,6 +12,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

### **Changed**

- updated mlflow version to 2.16.0 to support LLM tracing
- removed CDK overhead from `mlflow-image` module
- renamed mlflow manifests and updated README.MD

## v1.5.0

### **Added**
17 changes: 8 additions & 9 deletions README.md
@@ -23,15 +23,14 @@ See deployment steps in the [Deployment Guide](DEPLOYMENT.md).

End-to-end example use-cases built using modules in this repository.

| Type | Description |
|-----------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [MLOps with Amazon SageMaker](manifests/mlops-sagemaker/) | Set up environment for MLOps with Amazon SageMaker. Deploy secure Amazon SageMaker Studio Domain, and provisions SageMaker Project Templates using Service Catalog, including model training and deployment. |
| [Mlflow experiments tracking with Amazon SageMaker](manifests/mlflow-experiments-tracking/) | An example using Mlflow experiments tracking with Amazon SageMaker. Deploy self-hosted Mlflow instance on AWS Fargate, and Amazon SageMaker Studio Domain environment. |
| [Managed Workflows with Apache Airflow (MWAA) for Machine Learning Training](manifests/mwaa-ml-training/) | An example orchestrating ML training jobs with Managed Workflows for Apache Airflow (MWAA). Deploys MWAA and an example ML training DAG. |
| [Q&A on PDF documents with RAG](manifests/fmops-qna-rag/) | Deploy AppSync GraphQL endpoint for Q&A chatbot with RAG based on OpenSearch, and data ingestion infrastructure. |
| [Ray on Amazon Elastic Kubernetes Service (EKS)](manifests/ray-on-eks/) | Run Ray on AWS EKS. Deploys an AWS EKS cluster, KubeRay Ray Operator, and a Ray Cluster with autoscaling enabled. |
| [Bedrock Fine-Tuning with Step Functions](manifests/bedrock-finetuning-sfn/) | Continuously Fine-tune a Foundation Model with Bedrock Fine-Tuning jobs and AWS Step Functions. |
| [MLOps with Step Functions](manifests/mlops-stepfunctions/) | Automate machine learning lifecycle using Amazon SageMaker and AWS Step Functions. |
| Type | Description |
|-----------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [MLOps with Amazon SageMaker](manifests/mlops-sagemaker/) | Set up environment for MLOps with Amazon SageMaker. Deploy secure Amazon SageMaker Studio Domain, and provisions SageMaker Project Templates using Service Catalog, including model training and deployment. |
| [Mlflow tracking server and model registry with Amazon SageMaker](manifests/mlflow-tracking/) | An example using Mlflow experiments tracking, model registry, and LLM tracing with Amazon SageMaker. Deploy self-hosted Mlflow tracking server and model registry on AWS Fargate, and Amazon SageMaker Studio Domain environment. |
| [Managed Workflows with Apache Airflow (MWAA) for Machine Learning Training](manifests/mwaa-ml-training/) | An example orchestrating ML training jobs with Managed Workflows for Apache Airflow (MWAA). Deploys MWAA and an example ML training DAG. |
| [Ray on Amazon Elastic Kubernetes Service (EKS)](manifests/ray-on-eks/) | Run Ray on AWS EKS. Deploys an AWS EKS cluster, KubeRay Ray Operator, and a Ray Cluster with autoscaling enabled. |
| [Bedrock Fine-Tuning with Step Functions](manifests/bedrock-finetuning-sfn/) | Continuously Fine-tune a Foundation Model with Bedrock Fine-Tuning jobs and AWS Step Functions. |
| [MLOps with Step Functions](manifests/mlops-stepfunctions/) | Automate machine learning lifecycle using Amazon SageMaker and AWS Step Functions. |


## Modules
23 changes: 0 additions & 23 deletions manifests/mlflow-experiments-tracking/deployment.yaml

This file was deleted.

23 changes: 23 additions & 0 deletions manifests/mlflow-tracking/deployment.yaml
@@ -0,0 +1,23 @@
name: mlflow-tracking
toolchainRegion: us-east-1
forceDependencyRedeploy: true
groups:
  - name: networking
    path: manifests/mlflow-tracking/networking-modules.yaml
  - name: storage
    path: manifests/mlflow-tracking/storage-modules.yaml
  - name: sagemaker-studio
    path: manifests/mlflow-tracking/sagemaker-studio-modules.yaml
  - name: images
    path: manifests/mlflow-tracking/images-modules.yaml
  - name: mlflow
    path: manifests/mlflow-tracking/mlflow-modules.yaml
targetAccountMappings:
  - alias: primary
    accountId:
      valueFrom:
        envVariable: PRIMARY_ACCOUNT
    default: true
    regionMappings:
      - region: us-east-1
        default: true
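
The manifest above resolves `accountId` from the `PRIMARY_ACCOUNT` environment variable, so that variable has to be set before the deployment is applied. A minimal sketch of a pre-flight check follows, assuming the Seed-Farmer CLI is installed and the command is run from the repository root. The `is_account_id` helper is hypothetical and not part of this commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): succeeds only when the
# argument looks like a 12-digit AWS account ID.
is_account_id() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
  esac
  [ "${#1}" -eq 12 ]
}

# Deploy only when PRIMARY_ACCOUNT is set to a plausible account ID.
if is_account_id "${PRIMARY_ACCOUNT:-}"; then
  echo "PRIMARY_ACCOUNT looks valid"
  # Uncomment to deploy the manifest added in this commit:
  # seedfarmer apply manifests/mlflow-tracking/deployment.yaml
else
  echo "PRIMARY_ACCOUNT is missing or not a 12-digit account ID" >&2
fi
```

The check only validates the shape of the value; it does not confirm that the account exists or that credentials for it are available.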
@@ -1,5 +1,5 @@
name: mlflow-image
-path: git::https://github.com/awslabs/aiops-modules.git//modules/mlflow/mlflow-image?ref=release/1.5.0&depth=1
+path: modules/mlflow/mlflow-image
targetAccount: primary
parameters:
- name: ecr-repository-name
2 changes: 1 addition & 1 deletion modules/mlflow/mlflow-image/README.md
@@ -2,7 +2,7 @@

## Description

-This module creates an mlflow container image and pushes to the specified Elastic Container Repository.
+This module creates an mlflow tracking server container image and pushes to the specified Elastic Container Repository.

## Inputs/Outputs

45 changes: 0 additions & 45 deletions modules/mlflow/mlflow-image/app.py

This file was deleted.

33 changes: 12 additions & 21 deletions modules/mlflow/mlflow-image/deployspec.yaml
@@ -1,30 +1,21 @@
build_type: BUILD_GENERAL1_SMALL
publishGenericEnvVariables: true

deploy:
phases:
install:
commands:
- npm install -g aws-cdk@<version lost to e-mail obfuscation in the page extraction>
- pip install -r requirements.txt
build:
commands:
- cdk deploy --require-approval never --progress events --app "python app.py" --outputs-file ./cdk-exports.json
# Export metadata
- seedfarmer metadata convert -f cdk-exports.json || true
post_build:
commands:
- echo "Build successful"

- aws ecr describe-repositories --repository-names ${SEEDFARMER_PARAMETER_ECR_REPOSITORY_NAME} || aws ecr create-repository --repository-name ${SEEDFARMER_PARAMETER_ECR_REPOSITORY_NAME} --image-scanning-configuration scanOnPush=true
- export COMMIT_HASH=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-7)
- export IMAGE_TAG=${COMMIT_HASH:=latest}
- export REPOSITORY_URI=$AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com/${SEEDFARMER_PARAMETER_ECR_REPOSITORY_NAME}
- aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $AWS_ACCOUNT_ID.dkr.ecr.$AWS_DEFAULT_REGION.amazonaws.com
- echo Building the Docker image...
- cd src/ && docker build -t $REPOSITORY_URI:latest .
- docker tag $REPOSITORY_URI:latest $REPOSITORY_URI:$IMAGE_TAG
- docker push $REPOSITORY_URI:latest && docker push $REPOSITORY_URI:$IMAGE_TAG
- seedfarmer metadata add -k ImageUri -v $REPOSITORY_URI:latest
destroy:
phases:
install:
commands:
- npm install -g aws-cdk@<version lost to e-mail obfuscation in the page extraction>
- pip install -r requirements.txt
build:
commands:
- cdk destroy --force --app "python app.py"
post_build:
commands:
- echo "Destroy successful"
- aws ecr delete-repository --repository-name ${SEEDFARMER_PARAMETER_ECR_REPOSITORY_NAME} --force
# build_type: BUILD_GENERAL1_LARGE
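
The tagging commands in the deploy phase rely on two shell idioms: `cut -c 1-7` shortens `CODEBUILD_RESOLVED_SOURCE_VERSION` to a 7-character hash, and `${COMMIT_HASH:=latest}` falls back to `latest` when that hash is empty. A minimal sketch of the same logic in isolation; the `derive_image_tag` function and the sample input are hypothetical and used only for illustration:

```shell
#!/bin/sh
# Hypothetical helper: derive an image tag the same way the deploy commands do.
derive_image_tag() {
  # Keep the first 7 characters of the source version (empty if none given).
  COMMIT_HASH=$(echo "$1" | cut -c 1-7)
  # ${VAR:=default} assigns the default when VAR is unset or empty,
  # so COMMIT_HASH itself also becomes "latest" in that case.
  IMAGE_TAG=${COMMIT_HASH:=latest}
  echo "$IMAGE_TAG"
}

derive_image_tag "0123456789abcdef"   # prints 0123456
derive_image_tag ""                   # prints latest
```

When the fallback applies, both tags pushed by the deploy phase are `latest`, so the second `docker push` adds no new tag.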
37 changes: 0 additions & 37 deletions modules/mlflow/mlflow-image/integ/integ_image.py

This file was deleted.

42 changes: 42 additions & 0 deletions modules/mlflow/mlflow-image/modulestack.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,42 @@
AWSTemplateFormatVersion: 2010-09-09
Description: This stack deploys a Module specific IAM permissions

Parameters:
  # DeploymentName:
  #   Type: String
  #   Description: The name of the deployment
  # ModuleName:
  #   Type: String
  #   Description: The name of the Module
  RoleName:
    Type: String
    Description: The name of the IAM Role
  ECRRepositoryName:
    Type: String
    Description: The name of the ECR repository

Resources:
  Policy:
    Type: "AWS::IAM::Policy"
    Properties:
      PolicyDocument:
        Statement:
          - Effect: Allow
            Action:
              - "ecr:Describe*"
              - "ecr:Get*"
              - "ecr:List*"
            Resource: "*"
          - Action:
              - "ecr:Create*"
              - "ecr:Delete*"
              - "ecr:*LayerUpload"
              - "ecr:UploadLayerPart"
              - "ecr:Batch*"
              - "ecr:Put*"
            Effect: Allow
            Resource:
              - !Sub "arn:${AWS::Partition}:ecr:${AWS::Region}:${AWS::AccountId}:repository/${ECRRepositoryName}"
        Version: 2012-10-17
      PolicyName: "modulespecific-policy"
      Roles: [!Ref RoleName]
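
The second statement lists `ecr:UploadLayerPart` next to the wildcard `ecr:*LayerUpload` because the wildcard only matches action names that end in `LayerUpload`. A rough sketch of that matching using shell globs follows. It is a simplification rather than IAM's actual evaluation: shell patterns are case-sensitive while IAM action names are not, and the `action_matches` helper is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the action ($2) matches the wildcard
# pattern ($1). Simplification: case-sensitive, unlike IAM action matching.
action_matches() {
  case "$2" in
    $1) return 0 ;;   # unquoted $1 is interpreted as a glob pattern by `case`
    *)  return 1 ;;
  esac
}

for action in ecr:InitiateLayerUpload ecr:CompleteLayerUpload ecr:UploadLayerPart; do
  if action_matches "ecr:*LayerUpload" "$action"; then
    echo "$action: covered by ecr:*LayerUpload"
  else
    echo "$action: needs its own entry"
  fi
done
```

Only the first two actions end in `LayerUpload`, so the loop reports `ecr:UploadLayerPart` as needing its own entry, which matches how the policy is written.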
45 changes: 0 additions & 45 deletions modules/mlflow/mlflow-image/pyproject.toml

This file was deleted.

7 changes: 0 additions & 7 deletions modules/mlflow/mlflow-image/requirements.in

This file was deleted.
