diff --git a/_search-plugins/neural-sparse-search.md b/_search-plugins/neural-sparse-search.md
index b2b4fc33d6..8aa2ff7dbf 100644
--- a/_search-plugins/neural-sparse-search.md
+++ b/_search-plugins/neural-sparse-search.md
@@ -16,8 +16,8 @@ Introduced 2.11
 
 When selecting a model, choose one of the following options:
 
-- Use a sparse encoding model at both ingestion time and search time (high performance, relatively high latency).
-- Use a sparse encoding model at ingestion time and a tokenizer at search time for relatively low performance and low latency. The tokenism doesn't conduct model inference, so you can deploy and invoke a tokenizer using the ML Commons Model API for a more consistent experience.
+- Use a sparse encoding model at both ingestion time and search time for better search relevance at the expense of relatively high latency.
+- Use a sparse encoding model at ingestion time and a tokenizer at search time for lower search latency at the expense of relatively lower search relevance. Tokenization doesn't involve model inference, so you can deploy and invoke a tokenizer using the ML Commons Model API for a more streamlined experience.
 
 **PREREQUISITE**
 Before using neural sparse search, make sure to set up a [pretrained sparse embedding model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models) or your own sparse embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model).