diff --git a/docs/reference/inference/service-elser.asciidoc b/docs/reference/inference/service-elser.asciidoc
index c7217f38d459b..6afc2a2e3ef65 100644
--- a/docs/reference/inference/service-elser.asciidoc
+++ b/docs/reference/inference/service-elser.asciidoc
@@ -80,12 +80,13 @@ Must be a power of 2. Max allowed value is 32.
 
 [[inference-example-elser]]
 ==== ELSER service example
 
-The following example shows how to create an {infer} endpoint called
-`my-elser-model` to perform a `sparse_embedding` task type.
+The following example shows how to create an {infer} endpoint called `my-elser-model` to perform a `sparse_embedding` task type. Refer to the {ml-docs}/ml-nlp-elser.html[ELSER model documentation] for more information.
 
-The request below will automatically download the ELSER model if it isn't
-already downloaded and then deploy the model.
+NOTE: If you want to optimize your ELSER endpoint for ingest, set the number of threads to `1` (`"num_threads": 1`).
+If you want to optimize your ELSER endpoint for search, set the number of threads to a value greater than `1`.
+
+The request below will automatically download the ELSER model if it isn't already downloaded and then deploy the model.
 
 [source,console]
 ------------------------------------------------------------
@@ -100,7 +101,6 @@ PUT _inference/sparse_embedding/my-elser-model
 ------------------------------------------------------------
 // TEST[skip:TBD]
 
-
 Example response:
 
 [source,console-result]
@@ -130,12 +130,12 @@ If using the Python client, you can set the `timeout` parameter to a higher valu
 
 [[inference-example-elser-adaptive-allocation]]
 ==== Setting adaptive allocation for the ELSER service
 
-The following example shows how to create an {infer} endpoint called
-`my-elser-model` to perform a `sparse_embedding` task type and configure
-adaptive allocations.
+NOTE: For more information on how to optimize your ELSER endpoints, refer to {ml-docs}/ml-nlp-elser.html#elser-recommendations[the ELSER recommendations] section in the model documentation.
+To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.
+
+The following example shows how to create an {infer} endpoint called `my-elser-model` to perform a `sparse_embedding` task type and configure adaptive allocations.
 
-The request below will automatically download the ELSER model if it isn't
-already downloaded and then deploy the model.
+The request below will automatically download the ELSER model if it isn't already downloaded and then deploy the model.
 
 [source,console]
 ------------------------------------------------------------
diff --git a/docs/reference/search/search-your-data/semantic-search-semantic-text.asciidoc b/docs/reference/search/search-your-data/semantic-search-semantic-text.asciidoc
index dbcfbb1b615f9..60692c19c184a 100644
--- a/docs/reference/search/search-your-data/semantic-search-semantic-text.asciidoc
+++ b/docs/reference/search/search-your-data/semantic-search-semantic-text.asciidoc
@@ -50,7 +50,7 @@ PUT _inference/sparse_embedding/my-elser-endpoint <1>
 be used and ELSER creates sparse vectors. The `inference_id` is `my-elser-endpoint`.
 <2> The `elser` service is used in this example.
-<3> This setting enables and configures adaptive allocations.
+<3> This setting enables and configures {ml-docs}/ml-nlp-auto-scale.html#nlp-model-adaptive-allocations[adaptive allocations].
 Adaptive allocations make it possible for ELSER to automatically scale up or
 down resources based on the current load on the process.
 
 [NOTE]
@@ -284,6 +284,8 @@ query from the `semantic-embedding` index:
 
 [discrete]
 [[semantic-text-further-examples]]
-==== Further examples
+==== Further examples and reading
 
-If you want to use `semantic_text` in hybrid search, refer to https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[this notebook] for a step-by-step guide.
\ No newline at end of file
+* If you want to use `semantic_text` in hybrid search, refer to https://colab.research.google.com/github/elastic/elasticsearch-labs/blob/main/notebooks/search/09-semantic-text.ipynb[this notebook] for a step-by-step guide.
+* For more information on how to optimize your ELSER endpoints, refer to {ml-docs}/ml-nlp-elser.html#elser-recommendations[the ELSER recommendations] section in the model documentation.
+* To learn more about model autoscaling, refer to the {ml-docs}/ml-nlp-auto-scale.html[trained model autoscaling] page.
\ No newline at end of file
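The docs changed in this patch describe creating an ELSER `sparse_embedding` endpoint with `"num_threads": 1` for ingest (or more threads for search) and an `adaptive_allocations` block so the deployment scales with load. As a rough Python sketch of assembling that request body (the helper name, default allocation bounds, and printed layout are illustrative assumptions; the field names follow the `elser` service settings referenced in the diff):

```python
import json

def elser_endpoint_body(min_allocations=1, max_allocations=4, num_threads=1):
    """Body for `PUT _inference/sparse_embedding/<endpoint-name>`.

    Per the recommendations above, num_threads=1 optimizes the endpoint
    for ingest, while a value greater than 1 optimizes it for search.
    The default allocation bounds here are illustrative, not prescribed.
    """
    return {
        "service": "elser",
        "service_settings": {
            "adaptive_allocations": {
                "enabled": True,  # let the deployment scale with load
                "min_number_of_allocations": min_allocations,
                "max_number_of_allocations": max_allocations,
            },
            "num_threads": num_threads,
        },
    }

body = elser_endpoint_body()
print(json.dumps(body, indent=2))
```

Sending this body to an endpoint path such as `_inference/sparse_embedding/my-elser-model` (the name used in the docs above) would match the console examples in the patched pages; the client used to send it is up to the caller.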