What do you want to do?
Tell us about your request. Provide a summary of the request and all versions that are affected.
I'm requesting that we document additional details about the API / data exchange contract between Neural Search and AI connectors.
Provide a high-level picture that communicates the data exchange/flow between a neural search pipeline and an AI connector to an external model/AI service (neural search pipeline <-> AI connector). Set the context that the documentation describes the interface for neural search pipelines, so that an AI connector builder understands how to implement pre-/post-processors that satisfy the expected request/response format.
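For orientation, here is a minimal sketch of where those pre-/post-processing hooks live in an ML Commons connector action. The field names (action_type, method, url, request_body, pre_process_function, post_process_function) follow the connector structure; the URL, the ${parameters.input} placeholder, and the angle-bracket values are illustrative only and are not part of any confirmed contract.
{
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://example-ai-service.com/v1/embeddings",
      "request_body": "{ \"input\": ${parameters.input} }",
      "pre_process_function": "<maps the neural search request into the external service's request body>",
      "post_process_function": "<maps the external service's response back into the format neural search expects>"
    }
  ]
}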
The request/response format varies by pipeline, so we need to document the request/response format for each:
Neural search pipeline for text-embedding (bi-encoder) based semantic search
Neural search pipeline for sparse encoder based semantic search
Neural search pipeline for conversational search
Please document the expected request/response formats not as examples, but as APIs and/or data schemas. This is an API / data schema and can't be properly documented as an example output.
Here are a couple of examples of how to properly document APIs: https://docs.anthropic.com/claude/reference/messages_post, https://docs.cohere.com/reference/embed. They include both the contract/spec and examples for clarity.
Please contact @ylwu-amzn and his team for details. The following are some details, but the engineering team needs to provide clarity so that we can produce documentation like the aforementioned examples.
Neural Search pipeline: text-embedding-based semantic search pipeline
Request: the format of the request sent by neural search, and the enumerations for the "data_type" field.
{
  Input: String Array
}
Input: a list of Strings containing the text provided by the user through the neural search query API.
Can you clarify what each item in the list represents?
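For illustration only, a request matching this schema could look like the following. The "text_docs" field name is taken from the engineering samples further down; whether that is the canonical field name is one of the details engineering should confirm.
{
  "text_docs": [
    "what is the capital of France?",
    "how do I reset my password?"
  ]
}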
Response: the format of the response expected by neural search
{
  name: String
  data_type: String Const ("FLOAT32" | "INT8" | ...)
  shape: Int Array
  data: Float Array
}
data: the data for one or more dense vectors returned by the integrated embedding model.
shape: the shape (tensor dimensions) of "data".
data_type: granular data type information for "data", used for data type checking.
name: engineering needs to clarify the purpose of this field for the user.
Here are some examples provided by engineering. Again, these are examples and do not define the API contracts.
1. For text embedding, the sample input is
POST /_plugins/_ml/_predict/text_embedding/zwla5YUB1qmVrJFlwzXJ
{
  "text_docs": [ "today is sunny" ]
}
Sample response
{
  "inference_results": [
    {
      "output": [
        {
          "name": "sentence_embedding", # the name must be sentence_embedding
          "data_type": "FLOAT32",
          "shape": [
            384
          ],
          "data": [
            -0.023314998,
            0.08975688,
            0.07847973,
            ...
          ]
        }
      ]
    }
  ]
}
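In practical terms, an AI connector for this pipeline must hand back exactly this sentence_embedding tensor structure. As a rough sketch, assuming an OpenAI-style embeddings endpoint and the built-in OpenAI embedding processors from the published connector blueprints (verify the function names against your OpenSearch version), the connector action could look like:
{
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://api.openai.com/v1/embeddings",
      "headers": {
        "Authorization": "Bearer ${credential.openAI_key}"
      },
      "request_body": "{ \"input\": ${parameters.input}, \"model\": \"${parameters.model}\" }",
      "pre_process_function": "connector.pre_process.openai.embedding",
      "post_process_function": "connector.post_process.openai.embedding"
    }
  ]
}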
2. For neural sparse, the sample input is the same
POST /_plugins/_ml/_predict/sparse_encoding/zwla5YUB1qmVrJFlwzXJ
{
  "text_docs": [ "today is sunny" ]
}
Sample response
{
  "inference_results": [
    {
      "output": [
        {
          "name": "output",
          "dataAsMap": {
            "response": [
              {
                "tonight": 0.15685688,
                "usa": 0.0455468,
                "rating": 0.1603944,
                ...
              }
            ]
          }
        }
      ]
    }
  ]
}
3. For cross-encoder, the input is different; see opensearch-project/ml-commons#1615
{
  "query_text": "today is sunny",
  "text_docs": [
    "today is sunny",
    "today is july fifth",
    "it is winter"
  ]
}
Sample response
{
  "inference_results": [
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          "data": [
            10.223609
          ],
          "byte_buffer": {
            "array": "Un3CwA==",
            "order": "LITTLE_ENDIAN"
          }
        }
      ]
    },
    {
      "output": [
        {
          "name": "similarity",
          "data_type": "FLOAT32",
          "shape": [
            1
          ],
          ...
        ]
    }
What other resources are available? Provide links to related issues, POCs, steps for testing, etc.
Refer to above.