From 0a57d9437a14e3e1a7198c5c95a0cc7fbc08544a Mon Sep 17 00:00:00 2001
From: zhichao-aws
Date: Tue, 16 Jul 2024 18:06:32 +0800
Subject: [PATCH 1/2] improve wording for ns

Signed-off-by: zhichao-aws
---
 _search-plugins/neural-sparse-search.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/_search-plugins/neural-sparse-search.md b/_search-plugins/neural-sparse-search.md
index b2b4fc33d6..c540b95884 100644
--- a/_search-plugins/neural-sparse-search.md
+++ b/_search-plugins/neural-sparse-search.md
@@ -16,8 +16,8 @@ Introduced 2.11
 
 When selecting a model, choose one of the following options:
 
-- Use a sparse encoding model at both ingestion time and search time (high performance, relatively high latency).
-- Use a sparse encoding model at ingestion time and a tokenizer at search time for relatively low performance and low latency. The tokenism doesn't conduct model inference, so you can deploy and invoke a tokenizer using the ML Commons Model API for a more consistent experience.
+- Use a sparse encoding model at both ingestion time and search time for high search relevance with relatively high latency.
+- Use a sparse encoding model at ingestion time and a tokenizer at search time for low search latency with relatively low search relevance. The tokenism doesn't conduct model inference, so you can deploy and invoke a tokenizer using the ML Commons Model API for a more consistent experience.
 
 **PREREQUISITE**<br>
 Before using neural sparse search, make sure to set up a [pretrained sparse embedding model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models) or your own sparse embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model).

From 00cef50da96f6112d3f2977bdbd40f4aa8b4244e Mon Sep 17 00:00:00 2001
From: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
Date: Tue, 16 Jul 2024 10:34:18 -0400
Subject: [PATCH 2/2] Apply suggestions from code review

Signed-off-by: kolchfa-aws <105444904+kolchfa-aws@users.noreply.github.com>
---
 _search-plugins/neural-sparse-search.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/_search-plugins/neural-sparse-search.md b/_search-plugins/neural-sparse-search.md
index c540b95884..8aa2ff7dbf 100644
--- a/_search-plugins/neural-sparse-search.md
+++ b/_search-plugins/neural-sparse-search.md
@@ -16,8 +16,8 @@ Introduced 2.11
 
 When selecting a model, choose one of the following options:
 
-- Use a sparse encoding model at both ingestion time and search time for high search relevance with relatively high latency.
-- Use a sparse encoding model at ingestion time and a tokenizer at search time for low search latency with relatively low search relevance. The tokenism doesn't conduct model inference, so you can deploy and invoke a tokenizer using the ML Commons Model API for a more consistent experience.
+- Use a sparse encoding model at both ingestion time and search time for better search relevance at the expense of relatively high latency.
+- Use a sparse encoding model at ingestion time and a tokenizer at search time for lower search latency at the expense of relatively lower search relevance. Tokenization doesn't involve model inference, so you can deploy and invoke a tokenizer using the ML Commons Model API for a more streamlined experience.
 
 **PREREQUISITE**<br>
 Before using neural sparse search, make sure to set up a [pretrained sparse embedding model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/pretrained-models/#sparse-encoding-models) or your own sparse embedding model. For more information, see [Choosing a model]({{site.url}}{{site.baseurl}}/ml-commons-plugin/integrating-ml-models/#choosing-a-model).
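
As context for the tokenizer option described in both patches: a tokenizer registered and deployed through the ML Commons Model API is invoked at search time by passing its model ID in a `neural_sparse` query, the same way a sparse encoding model would be. A minimal sketch follows; the index name (`my-nlp-index`), field name (`passage_embedding`), and the angle-bracket model ID are placeholders, not values from the patches:

```json
GET my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "what is neural sparse search",
        "model_id": "<deployed tokenizer or sparse encoding model ID>"
      }
    }
  }
}
```

Because a tokenizer performs no model inference, a query routed through it only splits and weights the query terms, which is the source of the lower-latency, lower-relevance trade-off named in the second bullet of the patched text.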