Inference params support #9068

Merged
merged 11 commits into from
Jul 28, 2023


serena-ruan (Collaborator) commented Jul 13, 2023

Related Issues/PRs

Inference params are parameters passed to the model at inference time. They do not need to be specified when training the model but can be useful at inference. In some cases, especially with popular LLMs, the same model may need different parameter configurations for different samples.

With params support, you can now specify a dictionary of inference params during model inference, which gives finer control over the generated results, particularly for LLM use cases. By passing params such as temperature or max_length at inference time, you can control the output of the model.
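As a rough illustration of the idea above, the following is a minimal sketch in plain Python, not MLflow code. `ToyTextModel` and its handling of `max_length` are hypothetical stand-ins; the only thing taken from the description is that `predict` receives an optional dictionary of params alongside the input.

```python
from typing import Any, Dict, List, Optional


class ToyTextModel:
    """Hypothetical stand-in for a model; this is not an MLflow class."""

    def predict(
        self, model_input: List[str], params: Optional[Dict[str, Any]] = None
    ) -> List[str]:
        # Params are optional: fall back to a default when none are given.
        params = params or {}
        max_length = params.get("max_length", 20)
        # "Generation" here is just truncation, to keep the sketch runnable.
        return [text[:max_length] for text in model_input]


model = ToyTextModel()
prompts = ["MLflow is an open source platform"]

# Same model, different params per call.
short = model.predict(prompts, params={"max_length": 6})
default = model.predict(prompts)
```

The same model object serves both calls; only the params dictionary changes between them.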

What changes are proposed in this pull request?

(Please fill in changes proposed in this fix)

How is this patch tested?

  • Existing unit/integration tests
  • New unit/integration tests
  • Manual tests (describe details, including test results, below)

Does this PR change the documentation?

  • No. You can skip the rest of this section.
  • Yes. Make sure the changed pages / sections render correctly in the documentation preview.

Release Notes

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

Support extra params for model inference.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/gateway: AI Gateway service, Gateway client APIs, third-party Gateway integrations
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/recipes: Recipes, Recipe APIs, Recipe configs, Recipe Templates
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Language

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

serena-ruan and others added 7 commits July 13, 2023 23:30
…gnature (#8972)

* add ParamSchema and ParamSpec in model signature

Signed-off-by: Serena Ruan <[email protected]>

* update DataType and add tests

Signed-off-by: Serena Ruan <[email protected]>

* convert all params to python native types

Signed-off-by: Serena Ruan <[email protected]>

* remove useless variable

Signed-off-by: Serena Ruan <[email protected]>

* add INVALID_PARAMETER_VALUE

Signed-off-by: Serena Ruan <[email protected]>

* fix _infer_param_schema

Signed-off-by: Serena Ruan <[email protected]>

* address comments

Signed-off-by: Serena Ruan <[email protected]>

* add test case for _find_duplicates

Signed-off-by: Serena Ruan <[email protected]>

* rename type to dtype

Signed-off-by: Serena Ruan <[email protected]>

* fix test failure in windows

Signed-off-by: Serena Ruan <[email protected]>

* remove pylint disable and fix windows test

Signed-off-by: Serena Ruan <[email protected]>

---------

Signed-off-by: Serena Ruan <[email protected]>
* add predict params for pyfunc and python model

Signed-off-by: Serena Ruan <[email protected]>

* fix

Signed-off-by: Serena Ruan <[email protected]>

* add more tests and fix a small bug

Signed-off-by: Serena Ruan <[email protected]>

* address comments

Signed-off-by: Serena Ruan <[email protected]>

* add warnings for params missing case

Signed-off-by: Serena Ruan <[email protected]>

* reformat

Signed-off-by: Serena Ruan <[email protected]>

---------

Signed-off-by: Serena Ruan <[email protected]>
* add inference params for all flavors

Signed-off-by: Serena Ruan <[email protected]>

* fix and update tests

Signed-off-by: Serena Ruan <[email protected]>

* update sklearn test

Signed-off-by: Serena Ruan <[email protected]>

* update sklearn test

Signed-off-by: Serena Ruan <[email protected]>

* address comments

Signed-off-by: Serena Ruan <[email protected]>

* add unused for pylint

Signed-off-by: Serena Ruan <[email protected]>

* update pylint

Signed-off-by: Serena Ruan <[email protected]>

---------

Signed-off-by: Serena Ruan <[email protected]>
…ng (#8976)

Signed-off-by: Serena Ruan <[email protected]>
Signed-off-by: Serena Ruan <[email protected]>
Co-authored-by: Harutaka Kawamura <[email protected]>
@serena-ruan serena-ruan marked this pull request as ready for review July 21, 2023 06:18
mlflow-automation (Collaborator) commented Jul 27, 2023

Documentation preview for 01eeba2 will be available here when this CircleCI job completes successfully.


dbczumar (Collaborator) left a comment

LGTM based on extensive bug bashing and the fact that all of the individual PRs that went into this feature branch were reviewed, tested, and approved. Thanks so much, @serena-ruan !

Signed-off-by: Serena Ruan <[email protected]>
@github-actions github-actions bot added the area/models, area/scoring, and rn/feature labels Jul 28, 2023
@serena-ruan serena-ruan added the only-latest label Jul 28, 2023
@serena-ruan serena-ruan merged commit 916764d into master Jul 28, 2023
outputs: '[{"name": "output", "type": "string"}]'
params: '[{"name": "temperature", "type": "float", "default": 0.5, "shape": null},
{"name": "top_k", "type": "integer", "default": 1, "shape": null},
{"name": "suppress_tokens", "type": "integer", "default": [101, 102], "shape": [-1]}]'
Collaborator

@serena-ruan Does it mean the "shape" is only supposed to be used with tensor values?
If "shape" also applies to lists, then for a list like [101, 102], shouldn't the "shape" of it be (2,)?
Maybe this can be clarified by providing an example of a value whose shape is (2, 3), so that people understand the regular case.

Collaborator Author

shape is only meant to be used for list/array values. Currently params support only scalar values or 1D arrays, so shape should be either None or (-1,). A list like [101, 102] in this case is a 1-dimensional array, so its shape is (-1,); in fact, shape (-1,) works for any 1D array, whatever its length.
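The rule in this reply can be sketched as a small helper. This is a plain-Python illustration under stated assumptions: `infer_param_shape` is a hypothetical name, not MLflow's own inference code, and rejecting nested lists with `ValueError` is an assumption about how unsupported values might be handled.

```python
from typing import Any, Optional, Tuple


def infer_param_shape(value: Any) -> Optional[Tuple[int, ...]]:
    """Return None for a scalar and (-1,) for any 1D list, per the reply above."""
    if isinstance(value, (list, tuple)):
        # Assumption: anything deeper than 1D is rejected outright.
        if any(isinstance(item, (list, tuple)) for item in value):
            raise ValueError("only scalar values and 1D lists are supported")
        # -1 means "any length", so [101, 102] and [1, 2, 3] share a shape.
        return (-1,)
    return None


scalar_shape = infer_param_shape(0.5)
list_shape = infer_param_shape([101, 102])
```

Because the length is not recorded, a caller can later pass a list of a different length for the same param without the shape changing.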

params=inference_config,
)

# Saving model without inference_config
Collaborator

What if I also save the model with inference_config? Is that not allowed? If it is allowed and I run inference without params, would the inference_config or the default params be used?

Collaborator Author

That's a great point!! I should add a test case for this. It is allowed. If you run inference without params, both inference_config and the default params in the ModelSignature are applied; where they overlap, params take priority.
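The precedence described in this reply can be sketched as a dictionary merge. `resolve_params` is a hypothetical helper, not an MLflow function. The first two layers follow the reply; the third layer, where params passed explicitly at call time override the signature defaults, is an assumption added for completeness.

```python
from typing import Any, Dict, Optional


def resolve_params(
    inference_config: Dict[str, Any],
    signature_defaults: Dict[str, Any],
    call_params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Merge the layers so that later layers win on overlapping keys."""
    resolved = dict(inference_config)      # lowest priority
    resolved.update(signature_defaults)    # params defaults beat inference_config
    resolved.update(call_params or {})     # assumption: explicit params win
    return resolved


inference_config = {"temperature": 0.9, "do_sample": True}
signature_defaults = {"temperature": 0.5, "top_k": 1}

# Inference without params: both sources apply, defaults win on "temperature".
without_params = resolve_params(inference_config, signature_defaults)
with_params = resolve_params(inference_config, signature_defaults, {"top_k": 5})
```

Keys that appear in only one source, such as `do_sample` here, pass through unchanged.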

BenWilson2 pushed a commit to BenWilson2/mlflow that referenced this pull request Jul 31, 2023
Signed-off-by: Serena Ruan <[email protected]>
Signed-off-by: Serena Ruan <[email protected]>
Co-authored-by: Harutaka Kawamura <[email protected]>
santiagxf pushed a commit to santiagxf/mlflow that referenced this pull request Aug 7, 2023
Signed-off-by: Serena Ruan <[email protected]>
Signed-off-by: Serena Ruan <[email protected]>
Co-authored-by: Harutaka Kawamura <[email protected]>
clarkh-ncino pushed a commit to ncino/mlflow that referenced this pull request Aug 23, 2023
Signed-off-by: Serena Ruan <[email protected]>
Signed-off-by: Serena Ruan <[email protected]>
Co-authored-by: Harutaka Kawamura <[email protected]>
Signed-off-by: Clark Hollar <[email protected]>
Labels

  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • only-latest: If applied, only test the latest version of each group in cross-version tests.
  • rn/feature: Mention under Features in Changelogs.

4 participants