Huggingface optimum prepackaged server #4081

Merged (4 commits) on May 11, 2022

@axsaucedo (Contributor) commented May 7, 2022

Which issue(s) this PR fixes:

Fixes #4082

Introduces MLServer Runtime: SeldonIO/MLServer#573

HuggingFace Server

Thanks to our collaboration with the HuggingFace team, you can now easily deploy your models from the HuggingFace Hub with Seldon Core.

We also support the high-performance optimizations provided by the Transformer Optimum framework.

Pipeline parameters

The following parameters are available:

| Name | Description |
| --- | --- |
| task | The transformer pipeline task |
| pretrained_model | The name of the pretrained model in the Hub |
| pretrained_tokenizer | The name of the tokenizer in the Hub, if different from the one provided with the model |
| optimum_model | Boolean that enables loading the model with the Optimum framework |

Simple Example

You can deploy a HuggingFace model by providing parameters to your pipeline.

apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: gpt2-model
spec:
  protocol: v2
  predictors:
  - graph:
      name: transformer
      implementation: HUGGINGFACE_SERVER
      parameters:
      - name: task
        type: STRING
        value: text-generation
      - name: pretrained_model
        type: STRING
        value: distilgpt2
    name: default
    replicas: 1
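
Once a deployment like the one above is running, it could be queried over the V2 inference protocol that `protocol: v2` selects. The sketch below only builds the request and does not send it. The ingress path layout (`/seldon/<namespace>/<deployment>/v2/models/<model>/infer`), the host `localhost:8080`, the `default` namespace, and the input name `args` are assumptions for illustration and are not stated in this PR; adjust them to your cluster and runtime.

```python
import json
from urllib.parse import quote


def build_infer_request(host, namespace, deployment, model, texts):
    """Return (url, body) for a V2 inference call; nothing is sent."""
    # Assumed ingress path layout; not taken from this PR.
    url = "http://{}/seldon/{}/{}/v2/models/{}/infer".format(
        host, quote(namespace), quote(deployment), quote(model)
    )
    payload = {
        "inputs": [
            {
                "name": "args",           # assumed input name
                "shape": [len(texts)],    # one element per input string
                "datatype": "BYTES",      # V2 datatype used for strings
                "data": list(texts),
            }
        ]
    }
    return url, json.dumps(payload).encode("utf-8")


# Deployment and graph node names come from the manifest above;
# host and namespace are hypothetical.
url, body = build_infer_request(
    "localhost:8080", "default", "gpt2-model", "transformer", ["this is a test"]
)
print(url)
print(body.decode("utf-8"))
```

The returned pair can be passed to any HTTP client, for example `urllib.request.Request(url, data=body, headers={"Content-Type": "application/json"})`.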

Quantized & Optimized Models with Optimum

You can deploy a HuggingFace model loaded with the Optimum library by setting the optimum_model parameter:

apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: gpt2-model
spec:
  protocol: v2
  predictors:
  - graph:
      name: transformer
      implementation: HUGGINGFACE_SERVER
      parameters:
      - name: task
        type: STRING
        value: text-generation
      - name: pretrained_model
        type: STRING
        value: distilgpt2
      - name: optimum_model
        type: BOOL
        value: true
    name: default
    replicas: 1
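
The two manifests above differ only in the `optimum_model` parameter, so they can be generated from one helper. This is a minimal sketch, not part of the PR: the function name `huggingface_deployment` is hypothetical, and the output is JSON rather than YAML so that only the standard library is needed.

```python
import json


def huggingface_deployment(name, task, pretrained_model,
                           pretrained_tokenizer=None, optimum_model=False):
    """Build a SeldonDeployment dict mirroring the manifests above."""
    params = [
        {"name": "task", "type": "STRING", "value": task},
        {"name": "pretrained_model", "type": "STRING", "value": pretrained_model},
    ]
    if pretrained_tokenizer is not None:
        # Only needed when the tokenizer differs from the model's own.
        params.append({"name": "pretrained_tokenizer", "type": "STRING",
                       "value": pretrained_tokenizer})
    if optimum_model:
        # Mirrors the unquoted `true` used in the YAML above.
        params.append({"name": "optimum_model", "type": "BOOL", "value": True})
    return {
        "apiVersion": "machinelearning.seldon.io/v1alpha2",
        "kind": "SeldonDeployment",
        "metadata": {"name": name},
        "spec": {
            "protocol": "v2",
            "predictors": [
                {
                    "graph": {
                        "name": "transformer",
                        "implementation": "HUGGINGFACE_SERVER",
                        "parameters": params,
                    },
                    "name": "default",
                    "replicas": 1,
                }
            ],
        },
    }


manifest = huggingface_deployment(
    "gpt2-model", "text-generation", "distilgpt2", optimum_model=True
)
print(json.dumps(manifest, indent=2))
```

Calling the helper without `optimum_model=True` yields the equivalent of the simple example; with it, the Optimum variant.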

@axsaucedo changed the title from "WIP: Huggingface prepackaged server with optimum optimization" to "WIP: Huggingface optimum prepackaged server" on May 7, 2022
@axsaucedo changed the title from "WIP: Huggingface optimum prepackaged server" to "Huggingface optimum prepackaged server" on May 9, 2022
@axsaucedo requested review from ukclivecox and adriangonz on May 9, 2022 08:59
@axsaucedo (Contributor Author)

/test integration

@axsaucedo (Contributor Author)

/test integration

@axsaucedo (Contributor Author)

/test integration

@axsaucedo (Contributor Author)

It seems the failed test is test_label_update, which is known to be flaky.

/test integration

@axsaucedo (Contributor Author)

/test integration

@axsaucedo requested a review from adriangonz on May 10, 2022 13:24
@axsaucedo (Contributor Author)

All unit and integration tests pass, so this should be good to merge.

@adriangonz (Contributor) left a comment

This looks great! I think it should be ready to go @axsaucedo 👍

@seldondev (Collaborator)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: adriangonz

@seldondev seldondev merged commit 3910a89 into SeldonIO:master May 11, 2022
Successfully merging this pull request may close these issues.

Create huggingface optimum transformer prepackaged server