Commit
[DOCS] Adds link to tutorial and API docs to trained model autoscalin…
szabosteve authored Oct 16, 2024
1 parent 0cd306f commit ccf6ab9
Showing 2 changed files with 15 additions and 13 deletions.
20 changes: 10 additions & 10 deletions docs/reference/inference/service-elser.asciidoc
@@ -80,12 +80,13 @@ Must be a power of 2. Max allowed value is 32.
[[inference-example-elser]]
==== ELSER service example

The following example shows how to create an {infer} endpoint called
`my-elser-model` to perform a `sparse_embedding` task type.
The following example shows how to create an {infer} endpoint called `my-elser-model` to perform a `sparse_embedding` task type.
Refer to the {ml-docs}/ml-nlp-elser.html[ELSER model documentation] for more info.

The request below will automatically download the ELSER model if it isn't
already downloaded and then deploy the model.
NOTE: If you want to optimize your ELSER endpoint for ingest, set the number of threads to `1` (`"num_threads": 1`).
If you want to optimize your ELSER endpoint for search, set the number of threads to greater than `1`.
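
As a rough illustration of the thread settings described in the note above, the Python sketch below builds request bodies for an ingest-optimized and a search-optimized endpoint. The helper name `elser_endpoint_body` is hypothetical, and the `service`, `service_settings`, and `num_allocations` fields are assumed from the ELSER service request format rather than taken from this diff; only `num_threads` and the endpoint path appear here. The power-of-2 and maximum-of-32 checks follow the parameter description quoted in the hunk header.

```python
import json


def elser_endpoint_body(num_threads: int, num_allocations: int = 1) -> dict:
    """Build a request body for an ELSER inference endpoint (hypothetical helper).

    The field names other than `num_threads` are assumptions about the
    request format, not something this diff shows.
    """
    # Per the parameter description: must be a power of 2, max allowed value is 32.
    is_power_of_two = num_threads >= 1 and (num_threads & (num_threads - 1)) == 0
    if not is_power_of_two or num_threads > 32:
        raise ValueError("num_threads must be a power of 2 and at most 32")
    return {
        "service": "elser",
        "service_settings": {
            "num_allocations": num_allocations,
            "num_threads": num_threads,
        },
    }


# Ingest-optimized: one thread per allocation.
ingest_body = elser_endpoint_body(num_threads=1)
# Search-optimized: more than one thread per allocation (4 is an arbitrary choice).
search_body = elser_endpoint_body(num_threads=4)

print("PUT _inference/sparse_embedding/my-elser-model")
print(json.dumps(ingest_body, indent=2))
```

The printed JSON can be pasted into a console request such as the one that follows; sending it with a client library is left out because the client call would be a further assumption.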

The request below will automatically download the ELSER model if it isn't already downloaded and then deploy the model.

[source,console]
------------------------------------------------------------
@@ -100,7 +101,6 @@ PUT _inference/sparse_embedding/my-elser-model
------------------------------------------------------------
// TEST[skip:TBD]


Example response:

[source,console-result]
@@ -130,12 +130,12 @@ If using the Python client, you can set the `timeout` parameter to a higher valu
[[inference-example-elser-adaptive-allocation]]
==== Setting adaptive allocation for the ELSER service

The following example shows how to create an {infer} endpoint called
`my-elser-model` to perform a `sparse_embedding` task type and configure
adaptive allocations.
NOTE: For more information on how to optimize your ELSER endpoints, refer to {ml-docs}/ml-nlp-elser.html#elser-recommendations[the ELSER recommendations] section in the model documentation.
To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.

The following example shows how to create an {infer} endpoint called `my-elser-model` to perform a `sparse_embedding` task type and configure adaptive allocations.

The request below will automatically download the ELSER model if it isn't
already downloaded and then deploy the model.
The request below will automatically download the ELSER model if it isn't already downloaded and then deploy the model.

[source,console]
------------------------------------------------------------
@@ -50,7 +50,7 @@ PUT _inference/sparse_embedding/my-elser-endpoint <1>
be used and ELSER creates sparse vectors. The `inference_id` is
`my-elser-endpoint`.
<2> The `elser` service is used in this example.
<3> This setting enables and configures adaptive allocations.
<3> This setting enables and configures {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[adaptive allocations].
Adaptive allocations make it possible for ELSER to automatically scale up or down resources based on the current load on the process.
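
The request body that these callouts annotate is collapsed in this diff, so the following Python sketch only approximates what an adaptive allocations configuration might look like. The helper name `adaptive_elser_body` is hypothetical, and the `adaptive_allocations`, `enabled`, `min_number_of_allocations`, and `max_number_of_allocations` field names are assumptions about the ELSER service settings; verify them against the linked adaptive allocations documentation before use.

```python
import json


def adaptive_elser_body(min_allocations: int, max_allocations: int,
                        num_threads: int = 1) -> dict:
    """Build an ELSER endpoint body with adaptive allocations (hypothetical helper).

    All field names here are assumptions; the diff does not show the body.
    """
    # Local sanity check only, not a documented Elasticsearch rule.
    if max_allocations < min_allocations:
        raise ValueError("max_allocations must be >= min_allocations")
    return {
        "service": "elser",
        "service_settings": {
            "adaptive_allocations": {
                "enabled": True,
                "min_number_of_allocations": min_allocations,
                "max_number_of_allocations": max_allocations,
            },
            "num_threads": num_threads,
        },
    }


# The bounds 3 and 10 are illustrative values, not recommendations.
adaptive_body = adaptive_elser_body(min_allocations=3, max_allocations=10)

print("PUT _inference/sparse_embedding/my-elser-endpoint")
print(json.dumps(adaptive_body, indent=2))
```

With bounds like these, the number of allocations would be expected to scale between the minimum and maximum as load changes, which is the behavior callout <3> describes.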

[NOTE]
@@ -284,6 +284,8 @@ query from the `semantic-embedding` index:

[discrete]
[[semantic-text-further-examples]]
==== Further examples
==== Further examples and reading

If you want to use `semantic_text` in hybrid search, refer to https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[this notebook] for a step-by-step guide.
* If you want to use `semantic_text` in hybrid search, refer to https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[this notebook] for a step-by-step guide.
* For more information on how to optimize your ELSER endpoints, refer to {ml-docs}/ml-nlp-elser.html#elser-recommendations[the ELSER recommendations] section in the model documentation.
* To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.
