Inference params support #9068
Conversation
…gnature (#8972)
* add ParamSchema and ParamSpec in model signature
* update DataType and add tests
* convert all params to python native types
* remove useless variable
* add INVALID_PARAMETER_VALUE
* fix _infer_param_schema
* address comments
* add test case for _find_duplicates
* rename type to dtype
* fix test failure in windows
* remove pylint disable and fix windows test
Signed-off-by: Serena Ruan <[email protected]>

* add predict params for pyfunc and python model
* fix
* add more tests and fix a small bug
* address comments
* add warnings for params missing case
* reformat
Signed-off-by: Serena Ruan <[email protected]>

* add inference params for all flavors
* fix and update tests
* update sklearn test
* update sklearn test
* address comments
* add unused for pylint
* update pylint
Signed-off-by: Serena Ruan <[email protected]>

…ng (#8976)
Signed-off-by: Serena Ruan <[email protected]>
Co-authored-by: Harutaka Kawamura <[email protected]>
Documentation preview for 01eeba2 will be available here when this CircleCI job completes successfully. More info
LGTM based on extensive bug bashing and the fact that all of the individual PRs that went into this feature branch were reviewed, tested, and approved. Thanks so much, @serena-ruan !
outputs: '[{"name": "output", "type": "string"}]'
params: '[{"name": "temperature", "type": "float", "default": 0.5, "shape": null},
  {"name": "top_k", "type": "integer", "default": 1, "shape": null},
  {"name": "suppress_tokens", "type": "integer", "default": [101, 102], "shape": [-1]}]'
@serena-ruan Does it mean the "shape" is only supposed to be used with tensor values?
If "shape" also applies to lists, then for a list like [101, 102]
, shouldn't the "shape" of it be (2,)
?
Maybe this can be clarified by providing an example of a value whose shape is (2, 3)
, so that people understand the regular case.
shape is supposed to be used only for list/array values, and currently params only supports scalar values or 1D array values, so shape should be either None or (-1,). A list like [101, 102] in this case is a 1-dimensional array, so its shape is (-1,); in fact, (-1,) works for any 1D array.
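For illustration, a minimal sketch of how param shapes are inferred (assuming MLflow 2.6+, where infer_signature accepts a params argument; the input/output values below are placeholders):

```python
import mlflow
from mlflow.models import infer_signature

# Scalar params are inferred with shape None; 1D lists with shape (-1,).
signature = infer_signature(
    model_input=["a prompt"],
    model_output=["a response"],
    params={
        "temperature": 0.5,             # scalar -> shape: None
        "top_k": 1,                     # scalar -> shape: None
        "suppress_tokens": [101, 102],  # 1D list -> shape: (-1,)
    },
)
print(signature.params)
```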
params=inference_config,
)

# Saving model without inference_config
What if I also save the model with inference_config? Is that not allowed? If it is allowed and I run inference without params, would the inference_config or the default params be used?
That's a great point!! I should add a test case for this. It is allowed, and if you run inference without params, then both inference_config and the default params in the ModelSignature are applied; if there are overlaps, params take priority.
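To make that merge behavior concrete, here is a rough sketch (the gpt2 pipeline, run setup, and param values are illustrative only; it assumes the transformers flavor's inference_config argument together with the new params support in MLflow 2.6+):

```python
import mlflow
from mlflow.models import infer_signature
from transformers import pipeline

text_generation_pipeline = pipeline("text-generation", model="gpt2")

signature = infer_signature(
    model_input=["a prompt"],
    model_output=["a response"],
    params={"temperature": 0.5, "top_k": 1},  # defaults stored in the ModelSignature
)

with mlflow.start_run():
    info = mlflow.transformers.log_model(
        transformers_model=text_generation_pipeline,
        artifact_path="model",
        signature=signature,
        inference_config={"temperature": 0.1, "max_length": 20},
    )

model = mlflow.pyfunc.load_model(info.model_uri)

# No params at inference time: inference_config and the signature defaults are
# merged; per the reply above, params take priority for overlapping keys.
print(model.predict(["hello"]))

# Params passed explicitly at inference time override both.
print(model.predict(["hello"], params={"temperature": 0.9}))
```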
Related Issues/PRs

What changes are proposed in this pull request?

Inference params are parameters that are passed to the model at inference time. They do not need to be specified when training the model, but can be useful at inference time. In some cases, especially with popular LLMs, the same model may require different parameter configurations for different samples at inference time.

With params support, you can now specify a dictionary of inference params during model inference, providing broader utility and improved control over the generated inference results, particularly for LLM use cases. By passing different params such as temperature, max_length, etc. to the model at inference time, you can easily control the output of the model.
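For illustration, a minimal sketch of the user-facing API (the model URI and parameter values are hypothetical, and the model is assumed to have been logged with a params schema in its signature):

```python
import mlflow

model = mlflow.pyfunc.load_model("models:/my_llm/1")

# Per-call inference params override the defaults declared in the signature.
predictions = model.predict(
    ["Tell me a short story"],
    params={"temperature": 0.7, "max_length": 100},
)
```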
How is this patch tested?
Does this PR change the documentation?
Release Notes
Is this a user-facing change?
Support extra params for model inference.
What component(s), interfaces, languages, and integrations does this PR affect?
Components
area/artifacts
: Artifact stores and artifact loggingarea/build
: Build and test infrastructure for MLflowarea/docs
: MLflow documentation pagesarea/examples
: Example codearea/gateway
: AI Gateway service, Gateway client APIs, third-party Gateway integrationsarea/model-registry
: Model Registry service, APIs, and the fluent client calls for Model Registryarea/models
: MLmodel format, model serialization/deserialization, flavorsarea/recipes
: Recipes, Recipe APIs, Recipe configs, Recipe Templatesarea/projects
: MLproject format, project running backendsarea/scoring
: MLflow Model server, model deployment tools, Spark UDFsarea/server-infra
: MLflow Tracking server backendarea/tracking
: Tracking Service, tracking client APIs, autologgingInterface
area/uiux
: Front-end, user experience, plotting, JavaScript, JavaScript dev serverarea/docker
: Docker use across MLflow's components, such as MLflow Projects and MLflow Modelsarea/sqlalchemy
: Use of SQLAlchemy in the Tracking Service or Model Registryarea/windows
: Windows supportLanguage
language/r
: R APIs and clientslanguage/java
: Java APIs and clientslanguage/new
: Proposals for new client languagesIntegrations
integrations/azure
: Azure and Azure ML integrationsintegrations/sagemaker
: SageMaker integrationsintegrations/databricks
: Databricks integrationsHow should the PR be classified in the release notes? Choose one:
rn/breaking-change
- The PR will be mentioned in the "Breaking Changes" sectionrn/none
- No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" sectionrn/feature
- A new user-facing feature worth mentioning in the release notesrn/bug-fix
- A user-facing bug fix worth mentioning in the release notesrn/documentation
- A user-facing documentation change worth mentioning in the release notes