
Add Checkpoint Loading from MLflow Model Registry #17618

Closed
sam-h-bean opened this issue Jun 9, 2022 · 5 comments

Comments

@sam-h-bean (Contributor)

Feature request

I would like the ability to pass a model URI that points to a model in an MLflow model registry and then load a HuggingFace transformer directly from the registry.
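To make the request concrete, here is a hypothetical sketch of the desired usage; this is not an existing Transformers API, and the registry URI and model name are placeholders:

from transformers import AutoModelForSequenceClassification

# Hypothetical: from_pretrained would resolve an MLflow registry URI directly.
model = AutoModelForSequenceClassification.from_pretrained("models:/my-model/1")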

Motivation

Model versioning and lifecycle management are common practices in MLOps, so I think it makes sense for this to be a first-class feature in HuggingFace.

Your contribution

I already have this functional but would like to contribute it back to the project so others can leverage the MLflow model registry to maintain their model lifecycles from development to production!

sam-h-bean (Contributor, Author) commented Jun 17, 2022

@sgugger this is linked to #17686

The code I have working currently looks like

import glob
import os

import mlflow
from transformers import AutoModelForSequenceClassification


def download_from_registry(src_path, dst_path):
    # Make sure the local destination directory exists.
    if not os.path.isdir(dst_path):
        os.mkdir(dst_path)

    # Downloading the registered model also pulls its logged artifacts,
    # including the Transformers checkpoint directory.
    mlflow.pyfunc.load_model(src_path, dst_path=dst_path)

    # Return the checkpoint directory that was logged as an artifact.
    return glob.glob(os.path.join(dst_path, "artifacts", "checkpoint-*"))[0]


model_path = download_from_registry("models:/my-model/1", "./my-model/")
model = AutoModelForSequenceClassification.from_pretrained(model_path)

and I'm wondering what you think this would look like contributed to the open-source project. The code above only works once you have logged the checkpoint as an artifact and registered that model in the MLflow registry. However, I think it could be refactored to also load the model from an MLflow run. wdyt?
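As a rough sketch of that idea: mlflow.pyfunc.load_model resolves runs:/ URIs the same way it resolves models:/ URIs, so the same helper could be reused; the run id and artifact path below are placeholders, and this assumes the checkpoint was logged under the run's "model" artifact path.

model_path = download_from_registry("runs:/<run_id>/model", "./my-model/")
model = AutoModelForSequenceClassification.from_pretrained(model_path)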

sgugger (Collaborator) commented Jun 17, 2022

Hi @sam-h-bean !

We do not plan on supporting model repositories other than our model Hub for the from_pretrained method in Transformers. You should build a bridge to upload those checkpoints from MLflow to the Hub and benefit from all the goodies we have, such as the inference widget, model cards, community PRs, etc. 😃

Your solution also works since we support local checkpoints, and only takes three lines of code as you demonstrated 😉
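A minimal sketch of such a bridge, reusing download_from_registry from the earlier snippet and assuming you are authenticated via huggingface-cli login; the repo id "my-org/my-model" is a placeholder:

from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Pull the checkpoint out of the MLflow registry, then re-upload it to the Hub.
model_path = download_from_registry("models:/my-model/1", "./my-model/")
model = AutoModelForSequenceClassification.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

# private=True keeps the uploaded checkpoint restricted to your organization.
model.push_to_hub("my-org/my-model", private=True)
tokenizer.push_to_hub("my-org/my-model", private=True)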

@sam-h-bean (Contributor, Author)

Hey @sgugger, does the Hub have support for private models? These are proprietary models and thus cannot be made public. What is the suggested method for cases such as this? It seems like this is a case that will become more prevalent as more companies adopt large language models.

sgugger (Collaborator) commented Jun 20, 2022

Yes, you can have private models/datasets/spaces on the Hub. See the doc!
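A short sketch of loading such a private model, assuming you have authenticated with huggingface-cli login (or pass a token explicitly); the repo id is a placeholder:

from transformers import AutoModelForSequenceClassification

# use_auth_token=True reads the token stored by `huggingface-cli login`.
model = AutoModelForSequenceClassification.from_pretrained(
    "my-org/my-model", use_auth_token=True
)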

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
