
[RFC] Knowledge base in OpenSearch #2832

Open
yuye-aws opened this issue Aug 15, 2024 · 7 comments
yuye-aws commented Aug 15, 2024

Problem statement

Users in ml-commons can use the RAG (Retrieval-Augmented Generation) service with either an OpenSearch index or a Bedrock knowledge base. The RAG service consists of two main steps: first, retrieve documents from the knowledge base; second, generate an answer from the retrieved documents.
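The retrieve-then-generate flow can be sketched in a few lines of Python. This is purely illustrative: the term-overlap retriever and the string-formatting "generator" are stand-ins for an OpenSearch query and an LLM call, not part of any actual implementation.

```python
# Minimal, illustrative RAG flow: retrieve, then generate.
# The corpus and the "LLM" below are stand-ins, not real services.
def retrieve(query, corpus, top_k=2):
    """Step 1: rank documents by naive term overlap with the query."""
    terms = set(query.lower().split())
    scored = sorted(corpus, key=lambda d: -len(terms & set(d.lower().split())))
    return scored[:top_k]

def generate(query, documents):
    """Step 2: produce an answer from the retrieved documents.
    A real deployment would call an LLM here."""
    context = " ".join(documents)
    return f"Answer to '{query}' based on: {context}"

corpus = [
    "OpenSearch supports dense and sparse vector search.",
    "Bedrock knowledge bases manage retrieval for you.",
    "BM25 is a classic lexical ranking function.",
]
docs = retrieve("dense vector search in OpenSearch", corpus)
print(generate("dense vector search in OpenSearch", docs))
```

Everything discussed below concerns how users configure these two steps, not the steps themselves.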

Although both an OpenSearch index and a Bedrock knowledge base are accessible, their configuration steps are quite different. The only way to support a Bedrock knowledge base today is the MLModel tool. Configuring a Bedrock RAG service is not straightforward for users because they must set up specialized functions and parameters in the connector. We are considering how to give users a more convenient way to configure the knowledge base workflow. Ideally, users would have a unified experience across different knowledge bases.

In scope

  • Users can customize the prompt when querying the RAG service
  • Users can choose whether the knowledge base is an OpenSearch index or hosted on Bedrock
  • For an OpenSearch index, users can choose to search with dense or sparse embeddings

Out of scope

  • Register or deploy the LLM
  • Register or deploy the dense or sparse embedding model
  • Retrieve documents alone without generating an answer
  • Configure the OpenSearch index or the Bedrock knowledge base
  • Search an OpenSearch index with a BM25 query
  • Search a Bedrock knowledge base with a non-Bedrock embedding model
  • Allow users to specify how to concatenate retrieved documents

Options

This RFC lists a few options along with their pros and cons. We are soliciting feedback on these options, as well as any better alternatives you can think of.

Option 1

Instruct users to access the Bedrock knowledge base service via the MLModel tool or the Connector tool.

Pros

  1. No coding effort. All we need to do is publish some tutorials and documentation.

Cons

  1. It is awkward for users to access a RAG service through the MLModel tool.
  2. More configuration effort for users: they need to configure post-processing steps through a painless script in the connector.
  3. The painless script in OpenSearch may be affected by the throttling mechanism.
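To make cons 2 and 3 concrete, a connector for the Bedrock RetrieveAndGenerate API might look roughly like the sketch below. The field names follow the ml-commons connector format, but the region, credentials, request body, and the painless post_process_function body are all placeholders, and the exact request/response shapes should be checked against the Bedrock API reference:

```json
POST /_plugins/_ml/connectors/_create
{
  "name": "Bedrock knowledge base connector (sketch)",
  "protocol": "aws_sigv4",
  "parameters": { "region": "us-east-1", "service_name": "bedrock" },
  "credential": { "access_key": "...", "secret_key": "..." },
  "actions": [
    {
      "action_type": "predict",
      "method": "POST",
      "url": "https://bedrock-agent-runtime.us-east-1.amazonaws.com/retrieveAndGenerate",
      "request_body": "{ \"input\": { \"text\": \"${parameters.question}\" }, \"retrieveAndGenerateConfiguration\": { ... } }",
      "post_process_function": "// painless script that reshapes the Bedrock response; writing this per connector is the effort cons 2-3 refer to"
    }
  ]
}
```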

Option 2

Integrate bedrock knowledge base into RAGTool.

Pros

  1. Unified experience. Users can query an index or a Bedrock knowledge base with a single tool.
  2. No post-processing effort for users.

Cons

  1. The parameter configuration in the RAG tool would be quite different, which would be a breaking change relative to the RAG tool in previous versions.
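For context on the breaking-change concern, today's RAGTool is configured with index-oriented parameters roughly like the sketch below. The parameter names are illustrative and should be checked against the current RAGTool documentation; a Bedrock-backed knowledge base would need a different set (knowledge base ID, model ARN) in place of the index and embedding model fields:

```json
{
  "type": "RAGTool",
  "parameters": {
    "index": "my_knowledge_index",
    "embedding_model_id": "<dense-or-sparse-embedding-model-id>",
    "inference_model_id": "<llm-model-id>",
    "source_field": ["text"],
    "input": "${parameters.question}"
  }
}
```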

Option 3

Implement a new tool, such as a RemoteRAG tool (feel free to suggest a better name).

Pros

  1. Easy for users to distinguish between the RemoteRAG tool and the LocalRAG tool.
  2. No post-processing effort for users.

Cons

  1. Code duplication between the RemoteRAG tool and the LocalRAG tool.
@model-collapse model-collapse self-assigned this Aug 15, 2024
@yuye-aws
Member Author

Let us have option 1 as the default option.

@dhrubo-os
Collaborator

> It is not straightforward for users to configure their bedrock RAG service because they are required to configure some specialized functions or parameters in connector.

Can you explain this in more detail? Maybe add an example as well, so that readers can understand the current problem?

> It is weird for user to use RAG service with MLModel tool.

What do you mean by weird here? Is it a bad CX experience, an anti-design-pattern, or something else?

> The painless script in OpenSearch may be impacted by throttling mechanism.

Why is the painless script a concern only for option 1?

@ylwu-amzn
Collaborator

Why can't we use the ConnectorTool?

@yuye-aws
Member Author

> Why can't use ConnectorTool?

Option 1 can also be implemented with the ConnectorTool. In that case, let's consider it the same as the MLModel tool.

@yuye-aws
Member Author

> Can you explain this with more details. May be add example as well so that reader can understand what is the current problem?

There are many parameters to specify within the API config for Retrieve and RetrieveAndGenerate. We cannot assume that users will only use some of the parameters. It is tedious to configure all of them in the request_body field of the connector.

Besides, users may wish to configure the post_process step within the connector. Taking the RetrieveAndGenerate API as an example, a user may wish to see the reference URL alongside the output text. It would be better if we could provide several preconfigured post-processing functions.
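To illustrate the kind of preconfigured post-processing meant here, the sketch below reshapes a simplified RetrieveAndGenerate response so the reference URIs appear alongside the output text. The response shape (output.text, citations, retrievedReferences, s3Location.uri) is an assumption based on the Bedrock agent-runtime API and should be verified against its reference documentation:

```python
# Reshape a (simplified, assumed) RetrieveAndGenerate response so that
# reference URIs are returned alongside the generated text.
def attach_references(response):
    text = response["output"]["text"]
    uris = [
        ref["location"]["s3Location"]["uri"]
        for citation in response.get("citations", [])
        for ref in citation.get("retrievedReferences", [])
    ]
    return {"answer": text, "references": uris}

sample = {
    "output": {"text": "OpenSearch supports vector search."},
    "citations": [
        {"retrievedReferences": [
            {"location": {"s3Location": {"uri": "s3://docs/opensearch.txt"}}}
        ]}
    ],
}
print(attach_references(sample))
```

Today each user would have to express this logic as a painless script in their own connector; a preconfigured function would ship it once.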

@yuye-aws
Member Author

> What do you mean by weird here? It not a good CX experience or anti design patter or what?

The customer may ask: since we already have the RAG tool, why must I use the Bedrock RAG service through the MLModel tool or the Connector tool? The Connector tool makes some sense, but the MLModel tool does not.

@yuye-aws
Member Author

> Why painless script only concerns for option 1?

Java is compiled before execution, whereas painless scripts are interpreted, so implementing this in Java performs a bit better.

Status: On-deck
No branches or pull requests
5 participants