feat(mlflow): Add s3 artifact type #116

Merged
22 commits merged on Oct 10, 2024

@deadlycoconuts deadlycoconuts commented Sep 9, 2024

Context

To support MLflow artifact registries other than Google Cloud Storage, this PR adds support for S3-based registries to the artifact package, allowing other CaraML components such as Merlin and Turing to work with MLflow deployments that use S3-based artifact registries.

Main Changes

  • api/pkg/artifact/artifact.go - Created a new interface SchemeInterface and added a new implementation of the Service interface (S3ArtifactClient)
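
As a rough illustration of the kind of per-backend URL handling such a scheme abstraction might cover, here is a minimal sketch. The names `urlScheme` and `parseURL` and the example bucket are hypothetical and not taken from `artifact.go`; the actual `SchemeInterface` may look quite different.

```go
package main

import (
	"fmt"
	"strings"
)

// urlScheme is a hypothetical stand-in for a per-backend URL scheme,
// e.g. "gs://" for Google Cloud Storage or "s3://" for S3.
type urlScheme string

const (
	gcsScheme urlScheme = "gs://"
	s3Scheme  urlScheme = "s3://"
)

// parseURL splits "<scheme><bucket>/<object path>" into its bucket and
// object path, returning an error if the URL does not use this scheme.
func (s urlScheme) parseURL(url string) (bucket, object string, err error) {
	rest, ok := strings.CutPrefix(url, string(s))
	if !ok {
		return "", "", fmt.Errorf("url %q does not start with %q", url, string(s))
	}
	bucket, object, _ = strings.Cut(rest, "/")
	if bucket == "" {
		return "", "", fmt.Errorf("url %q has no bucket", url)
	}
	return bucket, object, nil
}

func main() {
	bucket, object, err := s3Scheme.parseURL("s3://my-bucket/mlflow/1/model")
	fmt.Println(bucket, object, err)
}
```

Keeping the scheme as a small value type means a GCS-backed and an S3-backed client could share the same parsing logic and differ only in the prefix they accept.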

Other Modifications

To support S3, this PR adds the AWS SDK for Go (version 2) as a new dependency of MLP. This requires updating Go from 1.20 to 1.22, which in turn requires upgrading our linter versions (which now report more errors than before), GitHub Actions, etc. This PR therefore includes changes that may seem unrelated to the scope described in its title.

Here's a list of some of those changes:

  • Updating the version of Go to 1.22
  • Updating the version of golangci-lint-action to v6 and golangci-lint to v1.58.1
  • Removing deprecated linters
  • Naming unused arguments as _
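
For readers unfamiliar with the last item, here is a minimal sketch of what naming an unused argument `_` looks like. The `handler` type and `greet` function are hypothetical, and which lint rule flagged the original code is an assumption (revive's `unused-parameter` is one common candidate).

```go
package main

import "fmt"

// handler is a hypothetical callback type; some implementations
// ignore the second argument.
type handler func(name string, attempt int) string

// greet does not use its second argument, so the argument is named _
// instead of being given an identifier that a linter would flag as unused.
func greet(name string, _ int) string {
	return "hello " + name
}

func main() {
	var h handler = greet
	fmt.Println(h("mlp", 1))
}
```

The function still satisfies the `handler` signature; only the unused name is dropped.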

Random comment:
If anyone finds the naming of client/service a little confusing in this artifact package, we can also rename them in this PR (I'm open to any suggestions 😅).

@deadlycoconuts deadlycoconuts self-assigned this Sep 9, 2024

codecov bot commented Sep 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 37.87%. Comparing base (9b92012) to head (f6e5f3e).

❗ There is a different number of reports uploaded between BASE (9b92012) and HEAD (f6e5f3e): HEAD has 1 upload less than BASE.

Flag      BASE (9b92012)  HEAD (f6e5f3e)
api-test  2               1

@@             Coverage Diff             @@
##             main     #116       +/-   ##
===========================================
- Coverage   56.39%   37.87%   -18.53%     
===========================================
  Files          48       68       +20     
  Lines        2477     3976     +1499     
===========================================
+ Hits         1397     1506      +109     
- Misses        884     2263     +1379     
- Partials      196      207       +11     
Flag Coverage Δ
api-test 37.87% <ø> (-18.53%) ⬇️

@deadlycoconuts deadlycoconuts added the enhancement New feature or request label Sep 18, 2024

@deadlycoconuts deadlycoconuts commented Sep 25, 2024

The Codecov report seems to be showing a lot of unexpected coverage drops, which I think might be due to the way we are using Codecov: https://docs.codecov.com/docs/unexpected-coverage-changes. I don't think I can identify and resolve the root cause within the scope of this PR, though; I'll log this as a card and potentially remove the Codecov CI/CD job temporarily so that it doesn't make the entire CI/CD pipeline look like it failed (when in reality the failure comes from an unexpected Codecov job failure).

@tiopramayudi tiopramayudi left a comment

Adding two more comments/questions, the rest is LGTM! Thanks @deadlycoconuts

@deadlycoconuts

Thanks for the review @tiopramayudi ! I'll be merging this PR shortly!

@deadlycoconuts deadlycoconuts merged commit a98c895 into caraml-dev:main Oct 10, 2024
8 checks passed
@deadlycoconuts deadlycoconuts deleted the add_s3_artifact_type branch October 10, 2024 04:31
deadlycoconuts added a commit to caraml-dev/merlin that referenced this pull request Nov 8, 2024
…registries and s3 bucket support (#605)

## Note 🚨
~~This PR should not be merged without the changes in
caraml-dev/mlp#116 being merged, published and
imported as dependencies first. This PR is currently only using a branch
of a fork (the source of the quoted PR) of the MLP repository.~~ The
dependent PR has been merged.

# Description
In order to provide support for using image registries that use [Docker
registry
credentials](https://docs.docker.com/reference/cli/docker/login/#configure-the-credential-store)
as well as [AWS S3-based blob storage
services](https://docs.aws.amazon.com/code-library/latest/ug/go_2_s3_code_examples.html),
this PR refactors the API server to support these additional image
registry and blob storage options. More concretely, the following
changes were made:

- Set up the workflow needed to allow platform maintainers to configure
the image registry that the API server will push images to (Docker or
Google Cloud/Artifact Registry) as well as the blob storage service that
it should read model artifacts from/write files to (S3-based
store/Google Cloud Storage)
- Allow the API server to access a configured Docker registry to check
if an image is available
- Allow the API server to check and hash model dependencies in a
configured S3-based store
- Allow Kaniko jobs spun up by the API server to load model
artifacts from a configured S3-based store when building model images
- Allow Kaniko jobs spun up by the API server to push images to the
configured Docker registry

# Modifications
- `api/cmd/api/setup.go` - Make the initialisation of the image builder
set up the artifact service type and Docker registry correctly depending
on the configured types
- `api/config/config.go` - Introduce new configs for platform
maintainers to specify the `KanikoPushRegistryType` and the
`KanikoDockerCredentialSecretName`
- `api/pkg/imagebuilder/imagebuilder.go` - Make changes to the image
builder to configure the Kaniko job spec correctly depending on the
selected registry type and blob storage type
- `python/batch-predictor/docker/app.Dockerfile` - Add steps to the
batch predictor docker image to authenticate and pull model artifacts
correctly depending on the configured blob storage type
- `python/batch-predictor/docker/base.Dockerfile` - Add steps to the
base batch predictor image to install the AWS CLI
- `python/pyfunc-server/docker/Dockerfile` - Add steps to the pyfunc
server docker image to authenticate and pull model artifacts correctly
depending on the configured blob storage type
- `python/pyfunc-server/docker/base.Dockerfile` - Add steps to the base
pyfunc server image to install the AWS CLI
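
As a rough sketch of how the two new config values might be validated together: only the field names `KanikoPushRegistryType` and `KanikoDockerCredentialSecretName` come from the description above. The `imageBuilderConfig` struct, the accepted values `"docker"` and `"gcr"`, and the rule that a Docker registry requires a secret name are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// imageBuilderConfig is a hypothetical struct holding the two config
// values named in the PR description.
type imageBuilderConfig struct {
	KanikoPushRegistryType           string
	KanikoDockerCredentialSecretName string
}

// validate checks that the registry type is known and that a Docker
// registry comes with a credential secret name. The accepted values
// and the secret requirement are assumptions, not Merlin's actual rules.
func (c imageBuilderConfig) validate() error {
	switch c.KanikoPushRegistryType {
	case "gcr":
		return nil
	case "docker":
		if c.KanikoDockerCredentialSecretName == "" {
			return errors.New("docker registry requires a credential secret name")
		}
		return nil
	default:
		return fmt.Errorf("unsupported registry type %q", c.KanikoPushRegistryType)
	}
}

func main() {
	cfg := imageBuilderConfig{KanikoPushRegistryType: "docker"}
	fmt.Println(cfg.validate())
}
```

Failing fast at startup on an inconsistent combination would surface misconfiguration before any Kaniko job is created.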

# Tests

# Checklist
- [x] Added PR label
- [x] Added unit test, integration, and/or e2e tests
- [x] Tested locally
- [ ] Updated documentation
- [ ] Updated Swagger spec if the PR introduces API changes
- [ ] Regenerated Golang and Python client if the PR introduces API
changes

# Release Notes

```release-note
NONE
```
Labels
enhancement New feature or request

2 participants