What/Why
What problems are you trying to solve?
Currently, for neural sparse query, users need to register a sparse_encoding/sparse_tokenize model in advance and provide the model ID in the query body. For bi-encoder mode we do need the ml-commons suite to manage the lifecycle of sparse encoding models, but for doc-only mode we only use a tokenizer at query time, and managing it with the ml-commons suite is rather heavyweight. This brings several drawbacks:
users need to configure the only_run_on_ml_node setting to enable the tokenizer on data nodes (see the settings example after this list)
users need to register the model and manage model groups, and even keep track of the model_id
the tokenizer predict requests are dispatched among cluster nodes, which brings extra traffic cost
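Enabling the tokenizer on data nodes today means changing the ml-commons cluster setting, roughly like the following (a sketch of the current workflow, not part of the proposal):

```
PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.only_run_on_ml_node": false
  }
}
```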
What are you proposing?
Build an analyzer-based neural sparse query. The sparse_tokenize model will be wrapped as a Lucene Analyzer. Users bind the analyzer to the index field, and the neural sparse query calls the analyzer to encode the query.
The pretrained amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1 will be supported as a pre-defined tokenizer. The token weights are encoded in the payload attribute.
Besides being used for the neural sparse query, the analyzer can also be invoked like any other analyzer, e.g. via the analyze API or the chunking processor.
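For example, the pre-defined analyzer could be exercised directly through the analyze API (a sketch; bert_tokenizer is the analyzer name used in the examples and discussion below, and the final name may differ):

```
GET /_analyze
{
  "analyzer": "bert_tokenizer",
  "text": "What is the weather like today?"
}
```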
What is the developer experience going to be?
We will alter the model_id verification logic in the neural sparse query builder and add the pre-defined bert analyzer.
Are there any security considerations?
N/A
Are there any breaking changes to the API?
We'll support a new query syntax for neural sparse query, i.e. users can bind the analyzer to the index field instead of providing the model ID in the query body.
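What is the user experience going to be?
create index
A minimal sketch of binding the analyzer in the index mapping. The index name, field names, and the exact mapping parameter used to attach the analyzer (shown here as search_analyzer on the rank_features field) are illustrative assumptions; the final syntax may differ:

```
PUT /my-nlp-index
{
  "mappings": {
    "properties": {
      "passage_text": { "type": "text" },
      "passage_embedding": {
        "type": "rank_features",
        "search_analyzer": "bert_tokenizer"
      }
    }
  }
}
```

search
The query then only needs the query text; no model_id is provided (again a sketch under the same assumptions):

```
GET /my-nlp-index/_search
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "What is the weather like today?"
      }
    }
  }
}
```

What will it take to execute?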
Will modify the neural sparse query logic: the model ID is no longer required; instead the query reads the analyzer from the shard context and uses it to encode the query text.
Will use the HuggingFaceTokenizer implementation from the DJL library. DJL is already a dependency of ml-commons.
Will put the config file of amazon/neural-sparse/opensearch-neural-sparse-tokenizer-v1 into the plugin resource directory.
By "analyzer": "bert_tokenizer", do you mean that bert_tokenizer is a built-in tokenizer? What are other supported tokenzer?
You mention that the query will then use the analyzer to encode the query text. Can you elaborate more? For example, does the user need to register the sparse encoding model first, and how does the analyzer locate the model for encoding?
The RFC is targeted for neural sparse query. Is there any blocker for the neural dense query? Perhaps the RFC should consider both queries.
> You mention that the query will then use the analyzer to encode the query text. Can you elaborate more? For example, does the user need to register the sparse encoding model first, and how does the analyzer locate the model for encoding?
Users only need to configure the analyzer in the index mappings. There is no need to register a model.
> The RFC is targeted for neural sparse query. Is there any blocker for the neural dense query? Perhaps the RFC should consider both queries.
I don't see the overlap between the tokenizer and the neural dense query. The tokenizer can't work alone for dense retrieval, and the text embedding model already contains its own tokenizer.