feat(mlflow): Add s3 artifact type #116
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #116 +/- ##
===========================================
- Coverage 56.39% 37.87% -18.53%
===========================================
Files 48 68 +20
Lines 2477 3976 +1499
===========================================
+ Hits 1397 1506 +109
- Misses 884 2263 +1379
- Partials 196 207 +11
Flags with carried forward coverage won't be shown.
It seems like the Codecov report is showing a lot of unexpected coverage drops, and I think it might be due to the way we are using Codecov: https://docs.codecov.com/docs/unexpected-coverage-changes. I don't think I'd be able to identify and resolve the root cause within the scope of this PR though; I'll log this as a card and then potentially remove the Codecov CI/CD job temporarily so that it doesn't make the entire CI/CD pipeline appear to have failed (when in reality it's due to an unexpected Codecov job failure).
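As a quick sanity check on the report above, the percentages can be recomputed from the hit and line counts. This is only a sketch: the helper name `basisPoints` is made up for illustration, and truncating to two decimals is an assumption that happens to match the figures shown.

```go
package main

import "fmt"

// basisPoints returns coverage in hundredths of a percent, truncated
// (not rounded). Truncation is an assumption that matches the report.
func basisPoints(hits, lines int) int {
	return hits * 10000 / lines
}

func main() {
	base := basisPoints(1397, 2477) // main
	head := basisPoints(1506, 3976) // this PR
	// Coverage of only the lines that are new in the head report.
	added := basisPoints(1506-1397, 3976-2477)

	fmt.Printf("base:      %d.%02d%%\n", base/100, base%100)
	fmt.Printf("head:      %d.%02d%%\n", head/100, head%100)
	fmt.Printf("new lines: %d.%02d%%\n", added/100, added%100)
}
```

The 1499 newly counted lines contribute only 109 hits, which would be consistent with the drop coming from additional files being counted rather than from lost tests, though that is not confirmed here.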
Adding two more comments/questions, the rest is LGTM! Thanks @deadlycoconuts
Thanks for the review @tiopramayudi! I'll be merging this PR shortly!
…registries and s3 bucket support (#605)

## Note 🚨
~~This PR should not be merged without the changes in caraml-dev/mlp#116 being merged, published and imported as dependencies first. This PR is currently only using a branch of a fork (the source of the quoted PR) of the MLP repository.~~

The dependent PR has been merged.

# Description
In order to provide support for using image registries that use [Docker registry credentials](https://docs.docker.com/reference/cli/docker/login/#configure-the-credential-store) as well as [AWS S3-based blob storage services](https://docs.aws.amazon.com/code-library/latest/ug/go_2_s3_code_examples.html), this PR refactors the API server to support these additional image registry and blob storage options.

More concretely, the following changes were made:
- Set up the workflow needed to allow platform maintainers to configure the image registry that the API server will push images to (Docker or Google Cloud/Artifact Registry) as well as the blob storage service that it should read model artifacts from/write files to (S3-based store/Google Cloud Storage)
- Allow the API server to access a configured Docker registry to check if an image is available
- Allow the API server to check and hash model dependencies in a configured S3-based store
- Allow Kaniko jobs spun up by the API server to load model artifacts from a configured S3-based store when building model images
- Allow Kaniko jobs spun up by the API server to push images to the configured Docker registry

# Modifications
- `api/cmd/api/setup.go` - Make the initialisation of the image builder set up the artifact service type and docker registry correctly depending on which one is configured
- `api/config/config.go` - Introduce new configs for platform maintainers to specify the `KanikoPushRegistryType` and the `KanikoDockerCredentialSecretName`
- `api/pkg/imagebuilder/imagebuilder.go` - Make changes to the image builder to configure the Kaniko job spec correctly depending on the selected registry type and blob storage type
- `python/batch-predictor/docker/app.Dockerfile` - Add steps to the batch predictor docker image to authenticate and pull model artifacts correctly depending on the configured blob storage type
- `python/batch-predictor/docker/base.Dockerfile` - Add steps to the base batch predictor image to install the AWS CLI
- `python/pyfunc-server/docker/Dockerfile` - Add steps to the pyfunc server docker image to authenticate and pull model artifacts correctly depending on the configured blob storage type
- `python/pyfunc-server/docker/base.Dockerfile` - Add steps to the base pyfunc server image to install the AWS CLI

# Tests
<!-- Besides the existing / updated automated tests, what specific scenarios should be tested? Consider the backward compatibility of the changes, whether corner cases are covered, etc. Please describe the tests and check the ones that have been completed. Eg:
- [x] Deploying new and existing standard models
- [ ] Deploying PyFunc models
-->

# Checklist
- [x] Added PR label
- [x] Added unit test, integration, and/or e2e tests
- [x] Tested locally
- [ ] Updated documentation
- [ ] Update Swagger spec if the PR introduces API changes
- [ ] Regenerated Golang and Python client if the PR introduces API changes

# Release Notes
<!-- Does this PR introduce a user-facing change? If no, just write "NONE" in the release-note block below. If yes, a release note is required. Enter your extended release note in the block below. If the PR requires additional action from users switching to the new release, include the string "action required". For more information about release notes, see kubernetes' guide here: http://git.k8s.io/community/contributors/guide/release-notes.md -->

```release-note
NONE
```
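For readers unfamiliar with the two config fields named under Modifications above, here is a rough sketch of how such settings might be validated at startup. Only the field names `KanikoPushRegistryType` and `KanikoDockerCredentialSecretName` come from the description; the struct name, the string values `"docker"` and `"gcr"`, and the rule that the Docker option requires a secret name are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical registry type values; the real ones may differ.
const (
	registryDocker = "docker"
	registryGCR    = "gcr"
)

// ImageBuilderConfig is a hypothetical container for the two fields
// mentioned in the description.
type ImageBuilderConfig struct {
	KanikoPushRegistryType           string
	KanikoDockerCredentialSecretName string
}

// Validate checks that the registry type is known and, as an assumed
// rule, that a credential secret is named when pushing to Docker.
func (c ImageBuilderConfig) Validate() error {
	switch c.KanikoPushRegistryType {
	case registryGCR:
		return nil
	case registryDocker:
		if c.KanikoDockerCredentialSecretName == "" {
			return errors.New("docker registry requires a credential secret name")
		}
		return nil
	default:
		return fmt.Errorf("unsupported registry type %q", c.KanikoPushRegistryType)
	}
}

func main() {
	cfg := ImageBuilderConfig{
		KanikoPushRegistryType:           registryDocker,
		KanikoDockerCredentialSecretName: "docker-creds", // placeholder name
	}
	if err := cfg.Validate(); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("config ok")
}
```

Failing fast on an unknown registry type keeps a misconfiguration from surfacing only later, when a Kaniko job tries to push an image.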
Context
Given the need to support MLflow artifact registries other than Google Cloud Storage, this PR introduces support for S3-based registries to the `artifact` package, allowing other CaraML components such as Merlin and Turing to work with MLflow deployments that use S3-based artifact registries.
Main Changes
- `api/pkg/artifact/artifact.go` - Created a new interface `SchemeInterface` and added a new implementation of the `Service` interface (`S3ArtifactClient`)
Other Modifications
To introduce support for S3, the AWS Golang SDK (version 2) will be introduced as a new dependency of MLP. This change requires updating the version of Go from 1.20 to 1.22, which in turn requires upgrades to our linter versions (which now report more errors than before), GitHub actions, etc. This PR therefore includes those changes, which may seem unrelated to the scope described in its title.
Here's a list of some of those changes:
_
Random comment:
If anyone finds the naming of client/service a little confusing in this artifact package, we can rename them in this PR also (I'm open to any suggestions 😅).
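To make the role of `SchemeInterface` from Main Changes a bit more concrete, here is a minimal sketch of what a scheme-aware URL parser could look like. The method set, the `URL` struct, and the `urlScheme` helper are assumptions for illustration and may not match what `artifact.go` actually defines; the sketch also leaves out the AWS SDK entirely and covers only URL handling.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// URL is a hypothetical type holding the parts of an artifact location.
type URL struct {
	Bucket string
	Object string
}

// SchemeInterface is a guess at the interface's shape: anything that
// knows its URL scheme and can parse URLs of that scheme.
type SchemeInterface interface {
	GetURLScheme() string
	ParseURL(raw string) (*URL, error)
}

// urlScheme is a hypothetical helper implementing SchemeInterface for
// any "<scheme>://bucket/object" style URL (e.g. "s3" or "gs").
type urlScheme string

func (s urlScheme) GetURLScheme() string { return string(s) }

func (s urlScheme) ParseURL(raw string) (*URL, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	if u.Scheme != string(s) {
		return nil, fmt.Errorf("expected scheme %q, got %q", string(s), u.Scheme)
	}
	if u.Host == "" {
		return nil, fmt.Errorf("missing bucket in %q", raw)
	}
	return &URL{Bucket: u.Host, Object: strings.TrimPrefix(u.Path, "/")}, nil
}

func main() {
	var s3 SchemeInterface = urlScheme("s3")
	u, err := s3.ParseURL("s3://my-bucket/mlflow/1/abc/artifacts/model")
	if err != nil {
		panic(err)
	}
	fmt.Println(u.Bucket, u.Object) // my-bucket mlflow/1/abc/artifacts/model
}
```

Keeping scheme handling behind its own small interface means a GCS-backed and an S3-backed client could share the same parsing logic and differ only in how they read and write objects.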