
[FEATURE] Support conversational search in ML Inference Search Response Processor with memory #3242

Open
mingshl opened this issue Nov 27, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@mingshl
Collaborator

mingshl commented Nov 27, 2024

Is your feature request related to a problem?
To support conversational search, the request sent to the remote model needs to carry not only the question but also the historical context.

For example,
OpenAI API:

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing well, thank you. How can I assist you today?"},
    {"role": "user", "content": "What's the weather like?"}
]

Bedrock converse API:

"messages": [
        {
            "role": "user",
            "content": [
                {
                    "text": "Write an article about impact of high inflation to GDP of a country"
                }
            ]
        }
    ]
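
In both formats, the prior turns and the new question travel in a single messages array. For illustration, a multi-turn Bedrock Converse request carrying history plus a follow-up question might look like this (the example content is hypothetical, not taken from the Bedrock docs):

"messages": [
        {
            "role": "user",
            "content": [
                {
                    "text": "Write an article about impact of high inflation to GDP of a country"
                }
            ]
        },
        {
            "role": "assistant",
            "content": [
                {
                    "text": "High inflation tends to erode purchasing power and dampen GDP growth ..."
                }
            ]
        },
        {
            "role": "user",
            "content": [
                {
                    "text": "Summarize that article in two sentences"
                }
            ]
        }
    ]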

In the ML Inference Search Response Processor, introduce a new parameter, "conversational_search", which can be true or false. When it is true and the input_map is configured to read the memory ID from the query extension, the ML inference processor reads the memory via the GetConversationsRequest action and sends the message list together with the new question to the remote model API.

{
 "ml_inference": {
   "model_id": "<model_id>",
   "conversational_search": true,
   "function_name": "<function_name>",
   "full_response_path": "<full_response_path>",
   "conversation_search": {
     "memory_input": """{"message": {"role": "${parameters.role}", "content": "${parameters.content}"}}""",  // optional
     "memory_output": """{"message": {"role": "assistant", "content": "${DataAsMap.response}"}}"""
   },
   "model_config": {
     "<model_config_field>": "<config_value>"
   },
   "model_input": "<model_input>",
   "input_map": [
     {
       "memory_id": "$._query.ext.ml_inference.memory_id",
       "content": "$._query.ext.ml_inference.question"
     }
   ],
   "output_map": [
     {
       "<new_document_field>": "<model_output_field>"
     }
   ],
   "override": "<override>",
   "one_to_one": false
 }
}
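
After the model responds, memory_output is intended to append the new turn back into the same memory, so the stored message would look roughly like this (a sketch derived from the memory_output template above; the exact persisted shape is part of this proposal):

{"message": {"role": "assistant", "content": "Lincoln is generally ranked among the most effective U.S. presidents."}}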

When searching, users can use the ML inference search extension in the query to ask a question and pass the memory ID.

GET /my_rag_test_data/_search?search_pipeline=rag_pipeline
{
  "query": {
    "match": {
      "text": "Abraham Lincoln"
    }
  },
  "ext": {
    "ml_inference": {
      "llm_question": "Was Abraham Lincoln a good politician",
      "memory_id": "iXC4bI0BfUsSoeNTjS30"
    }
  }
}
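
The memory_id passed in the ext block is assumed to come from the existing ML Commons memory APIs, e.g. a memory created ahead of time whose messages the processor then reads back. A sketch of that flow using the documented memory endpoints (the memory ID below is just the example value from above):

POST /_plugins/_ml/memory
{
  "name": "Conversation about Abraham Lincoln"
}

GET /_plugins/_ml/memory/iXC4bI0BfUsSoeNTjS30/messages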

To reuse the current memory and message APIs, the proposal is to add a new field to the interaction and message APIs that allows a custom message.

Proposed new interface for the message/interaction:

@input 
structure CreateInteractionInput  {
    @required
    @httpLabel
    conversationId: ConversationId
    input: String
    prompt: String
    response: String
    agent: String
    customMessage: Object
    attributes: InteractionAttributes
}
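
For example, a Create Message call could then carry a model-specific message body in the proposed custom message field (hypothetical request: input and response are existing message fields, while custom_message is the new field proposed here):

POST /_plugins/_ml/memory/iXC4bI0BfUsSoeNTjS30/messages
{
  "input": "Was Abraham Lincoln a good politician",
  "response": "Lincoln is generally ranked among the most effective U.S. presidents.",
  "custom_message": {
    "role": "assistant",
    "content": "Lincoln is generally ranked among the most effective U.S. presidents."
  }
}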


mingshl added the enhancement and untriaged labels on Nov 27, 2024
@Distorted-Dundar

Hey,

Is the memory ID an index? If so, what are the responsibilities of this index? It looks like the ML inference processor will load the index and store info in it on return. Is there a restriction on this memory ID, or can any other index be used?

How do we know we should clean up these memories after some time?

@austintlee
Collaborator

I worry that this might blur the line between the ML response processor and the existing RAG processor. We may be adding too much to the ML inference processor interface.

@mingshl
Collaborator Author

mingshl commented Nov 29, 2024

I worry that this might blur the line between the ML response processor and the existing RAG processor. We may be adding too much to the ML inference processor interface.

Hi @austintlee, we are working on the OpenSearch Flow project; you can refer to the tutorial here: https://github.com/opensearch-project/dashboards-flow-framework/blob/main/documentation/tutorial.md

OpenSearch Flow aims to use the ML Inference Processors (ingest/search) as generic processors to run inference during ingest and search in a workflow, simplifying setup and configuration. Of course, if users are familiar with the RAG processor or other existing processors, they can use those as well; there are drop-down options in the processor selection that users can pick and bundle, so it is up to users to choose what fits their use cases.

@mingshl
Collaborator Author

mingshl commented Nov 29, 2024

Hey,

Is the memory ID an index? If so, what are the responsibilities of this index? It looks like the ML inference processor will load the index and store info in it on return. Is there a restriction on this memory ID, or can any other index be used?

How do we know we should clean up these memories after some time?

Yes, the memory would be stored in an index. Indeed, the memory and message APIs have already been released; check out these docs: https://opensearch.org/docs/latest/ml-commons-plugin/api/memory-apis/get-memory/ and https://opensearch.org/docs/latest/ml-commons-plugin/api/memory-apis/get-message/
