Guidance for Q&A System with Similarity Search-based Retrieval Augmented Generation (RAG) on AWS

  1. Overview
  2. Prerequisites
  3. Deployment Steps
  4. Running the Guidance
  5. Next Steps
  6. Cleanup
  7. Notices
  8. Authors

Overview

Amazon DocumentDB offers native vector search capabilities, enabling you to perform similarity searches with ease. In this guidance, we provide a step-by-step walkthrough that covers all the essential building blocks required to create an enterprise-ready Retrieval Augmented Generation (RAG) application, such as a question-answering (Q&A) system. Our approach leverages a combination of AWS services, including Amazon Bedrock, an easy way to build and scale generative AI applications with foundation models. We use Titan Text for text embeddings and Anthropic's Claude on Amazon Bedrock as our Large Language Model (LLM), while Amazon DocumentDB (with MongoDB compatibility) serves as our vector database. Additionally, we demonstrate integration with an open-source RAG framework, LlamaIndex, which facilitates seamless interfacing with all the components involved.

Architecture

The architecture diagram outlines an approach to effectively handle user queries and provide responses. It uses a foundation model available on Amazon Bedrock by employing a Retrieval Augmented Generation (RAG) technique. This approach leverages the vector search capabilities of Amazon DocumentDB and the LlamaIndex framework to retrieve relevant data, thereby enhancing the model's ability to generate contextually appropriate responses.

Architecture

How It Works

The Q&A application follows these steps to provide responses to your questions:

  1. User uploads enterprise or external data that lies outside the large language model's (LLM) training data to augment the trained model. This data can come from various sources, including APIs, databases, or document repositories.
  2. The application hosted on Amazon EC2 preprocesses data by removing inconsistencies and errors, splitting large documents into manageable sections, and chunking the text into smaller, coherent pieces for easier processing.
  3. Application generates text embeddings for relevant data using the Titan text embedding models on Amazon Bedrock.
  4. Application fetches credentials from AWS Secrets Manager to connect to Amazon DocumentDB.
  5. Application creates a vector search index in Amazon DocumentDB and uses LlamaIndex to load the generated text embeddings along with other relevant information into a DocumentDB collection.
  6. User submits a natural language query to the web application to find relevant answers.
  7. Application fetches credentials from AWS Secrets Manager to connect to Amazon DocumentDB.
  8. The user's question is transformed into a vector embedding using the same embedding model that was used during the data ingestion workflow.
  9. Application passes the query to LlamaIndex query engine. LlamaIndex is a data orchestration tool that helps with data indexing and querying. LlamaIndex performs a similarity search in the DocumentDB collection using the query embedding. The search retrieves the most relevant documents based on their proximity to the query vector.
  10. LlamaIndex query engine augments the retrieved information with the user's question and passes it as a prompt to the LLM on Amazon Bedrock to generate a more accurate and informed response.
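The retrieval step (steps 8-9 above) boils down to comparing the query embedding against stored document embeddings and keeping the closest matches. In the actual application, Titan generates the embeddings and Amazon DocumentDB performs the vector search; the sketch below illustrates the same idea in plain Python with tiny hypothetical 3-dimensional vectors and cosine similarity, purely to show the mechanics.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, docs, top_k=2):
    """Return the top_k document texts closest to the query vector."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]

# Toy corpus: (chunk text, hypothetical embedding). Real embeddings from
# Titan have far more dimensions and are stored in DocumentDB.
docs = [
    ("Q3 revenue grew 12% year over year.", [0.9, 0.1, 0.0]),
    ("The company opened a new office.", [0.1, 0.8, 0.2]),
    ("Earnings per share beat expectations.", [0.7, 0.2, 0.1]),
]
query = [0.8, 0.15, 0.05]  # stand-in for the embedded user question
print(retrieve(query, docs, top_k=2))
```

The chunks returned by `retrieve` are what the LlamaIndex query engine would then pass to the LLM as context (step 10).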

Cost

You are responsible for the cost of the AWS services used while running this Guidance. We recommend creating a Budget through AWS Cost Explorer to help manage costs. Prices are subject to change. For full details, refer to the pricing webpage for each AWS service used in this Guidance.

Sample Cost Table

The following table provides a sample cost breakdown for deploying this Guidance with the default parameters in the US East (N. Virginia) Region for one month.

| AWS service | Dimensions | Cost [USD/month] |
| --- | --- | --- |
| Amazon DocumentDB instance-based cluster | Standard cluster configuration, 1 × db.r6g.large instance, 10 GB storage, 2 million I/Os, 1-day backup | $193.50 |
| Amazon Bedrock - Titan Text Embeddings model | 100 million input tokens per month | $10.00 |
| Amazon Bedrock - Anthropic Claude | 100 million input tokens per month | $10.00 |
| Amazon EC2 | Linux, consistent workload, 1 × m5.large instance, 8 GB EBS storage | $36.41 |
| AWS Secrets Manager | 1 secret, 1 million requests/month | $5.40 |
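As a sanity check on the token-based line items, the arithmetic is simply tokens divided by the pricing unit times the unit price. The per-1,000-token price below is an assumption chosen to match the table's $10.00 figure; always verify against the current Amazon Bedrock pricing page.

```python
# Hypothetical per-1K-input-token price; verify against current Bedrock pricing.
PRICE_PER_1K_TOKENS = 0.0001
tokens_per_month = 100_000_000  # 100 million input tokens, as in the table

cost = tokens_per_month / 1000 * PRICE_PER_1K_TOKENS
print(f"${cost:.2f}")  # matches the $10.00 line items above
```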

Prerequisites

  1. To use Amazon Bedrock's foundation models, you must first request access to them. This step is a prerequisite before you can start using the Amazon Bedrock APIs to invoke the models. In the following steps, we will configure model access in Amazon Bedrock, enabling you to build and run generative AI applications. Amazon Bedrock offers a diverse range of foundation models from various providers, including AI21 Labs, Anthropic, Cohere, Meta, Stability AI, and Amazon itself.

Amazon Bedrock Setup Instructions

  • In the AWS Console, select the Region from which you want to access Amazon Bedrock.

  • For this guidance, we will be using the us-east-1 Region.

  • Search for Amazon Bedrock by typing in the search bar on the AWS console.

  • Expand the side menu (three horizontal lines), select Model access, and click the Enable specific models button.

  • For this guidance, we'll be using Anthropic's Claude 3 models as LLMs and Amazon Titan family of embedding models. Click Next in the bottom right corner to review and submit.

  • You will be granted access to Amazon Titan models instantly. The Access status column will change to In progress for Anthropic Claude 3 momentarily. Keep reviewing the Access status column. You may need to refresh the page periodically. You should see Access granted shortly (wait time is typically 1-3 mins).

Note

Now you have successfully configured Amazon Bedrock.
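Once model access is granted, the application invokes these models through the Bedrock runtime API. The sketch below shows the request-body shapes for the Titan text embeddings model and for Claude 3 via the Bedrock Messages API, as documented at the time of writing; it only constructs the payloads (no AWS call is made), and the model ID and prompt text are illustrative.

```python
import json

# Request body for the Titan text embeddings model
# (e.g. modelId "amazon.titan-embed-text-v1").
embed_body = json.dumps({"inputText": "How much was AnyCompany revenue in Q3?"})

# Request body for a Claude 3 model via the Bedrock Messages API.
claude_body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [
        {"role": "user", "content": "Summarize the Q3 earnings call."}
    ],
})

# With credentials and model access in place, these bodies would be passed
# to boto3's bedrock-runtime client, for example:
#   client = boto3.client("bedrock-runtime", region_name="us-east-1")
#   client.invoke_model(modelId="amazon.titan-embed-text-v1", body=embed_body)
```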

  2. To deploy this guidance, ensure that the user has permissions to create, list, and modify the following resources:
    • A VPC and the required networking components
    • Amazon DocumentDB
    • Amazon SageMaker
    • AWS Secrets Manager

Operating System

These deployment instructions are optimized for the Amazon Linux 2 AMI or macOS. Deployment on another OS may require additional steps.

Deployment Steps

The CloudFormation stack can be deployed using either the AWS Console or the AWS CLI; steps for both are below.

Using AWS Console

Below are the steps to deploy the CloudFormation template using the AWS Console.

  1. Download the data-rag-aws-llama-DocumentDB.yaml
  2. Navigate to AWS CloudFormation service on your AWS Console
  3. Choose Create stack and select with new resources (standard)
  4. On the Specify template page, choose Upload a template file and select the downloaded template
  5. Enter the Stack name for your CloudFormation stack.
  6. For DB cluster username, enter the name of your administrator user in the Amazon DocumentDB cluster.
  7. For DB cluster password, enter the administrator password for your Amazon DocumentDB cluster (minimum 8 characters).
  8. Choose Next.
  9. Select the check box in the Capabilities section to allow the stack to create an IAM role, then choose Submit.

Using AWS CLI

  1. Clone the repo using the following command

    gh repo clone aws-solutions-library-samples/guidance-for-similarity-search-based-retrieval-augmented-generation-on-aws

  2. Change directory to the deployment folder

    cd guidance-for-similarity-search-based-retrieval-augmented-generation-on-aws/deployment

  3. Create the stack. Here is an example command to deploy the stack

    aws cloudformation create-stack --template-body file://data-rag-aws-llama-DocumentDB.yaml --stack-name <StackName> --parameters ParameterKey=DBUsername,ParameterValue=<DocumentDB_Username> ParameterKey=DBPassword,ParameterValue=<DocumentDB_Password> --capabilities CAPABILITY_NAMED_IAM

Deployment Validation

Deployment validation can be done using the AWS Console or the AWS CLI.

Using AWS Console

  1. Open the CloudFormation console and verify the status of the stack with the name you provided earlier. The stack creation status should be CREATE_COMPLETE.
  2. If your deployment is successful, you should see an active Amazon DocumentDB cluster and an Amazon EC2 instance running in your account.
  3. You can locate the Q&A application public URL on the Outputs tab of the stack.

Using AWS CLI

  1. Open the CloudFormation console and verify the status of the stack with the name starting with your stack name.
  2. If your deployment is successful, you should see an active Amazon DocumentDB cluster and an Amazon EC2 instance running in your account.
  3. Run the following CLI command to validate the deployment: aws cloudformation describe-stacks --stack-name <StackName>

Running the Guidance

  1. Open the Q&A application public URL from the CloudFormation Outputs tab.
  2. Click Upload sample document in the left navigation bar.

upload image

  3. Download the Q3 earnings call transcript of AnyCompany. You can also use any other document for Q&A.

  4. Upload the document. The document will be processed, and its embeddings will be stored in Amazon DocumentDB.

upload & Process image

  5. Click Q&A System in the left navigation bar.

  6. Enter a question about your document and get the response from the application.

    A few sample questions:

    • "When were these results announced?"
    • "How much was AnyCompany revenue in Q3?"
    • "What is the AnyCompany outlook for next year?"

Q&A1 image

Next Steps

You can explore additional sample datasets as well as other Retrieval Augmented Generation (RAG) frameworks for developing question-and-answer systems. You can also experiment with varying the chunk size and LLM hyperparameters to fine-tune the responses generated by the Large Language Model (LLM).
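Chunk size is one of the easiest levers to experiment with: larger chunks carry more context per retrieved passage, while smaller chunks give finer-grained matches. The sketch below is a minimal fixed-size character chunker with overlap (frameworks like LlamaIndex provide more sophisticated, token-aware splitters); the document text and sizes are illustrative.

```python
def chunk_text(text, chunk_size=200, overlap=50):
    """Split text into fixed-size character chunks with overlapping windows."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # slide the window, keeping some overlap
    return chunks

doc = "word " * 500  # 2500-character stand-in for an earnings call transcript
print(len(chunk_text(doc, chunk_size=500, overlap=100)))  # fewer, larger chunks
print(len(chunk_text(doc, chunk_size=200, overlap=50)))   # more, smaller chunks
```

Each chunk would then be embedded and stored separately, so the chunking choice directly changes what a similarity search can retrieve.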

Cleanup

Using AWS Console

  1. Navigate to the CloudFormation console and locate the stack with the name you provided while creating the stack
  2. Select the stack and choose Delete

Using AWS CLI

To delete the stack, run the following command (replace <StackName> with your stack name)

aws cloudformation delete-stack --stack-name <StackName>

Notices

Customers are responsible for making their own independent assessment of the information in this Guidance. This Guidance: (a) is for informational purposes only, (b) represents AWS current product offerings and practices, which are subject to change without notice, and (c) does not create any commitments or assurances from AWS and its affiliates, suppliers or licensors. AWS products or services are provided “as is” without warranties, representations, or conditions of any kind, whether express or implied. AWS responsibilities and liabilities to its customers are controlled by AWS agreements, and this Guidance is not part of, nor does it modify, any agreement between AWS and its customers.

License

The Q&A System with Similarity Search-based Retrieval Augmented Generation is released under the MIT-0 License.

Authors

  • Gururaj Bayari
  • Anshu Vajpayee

Contribution

This repository is intended for educational purposes and does not accept further contributions. Feel free to utilize and enhance the app based on your own requirements.
