Updated read me to include ingestion construct and add existing S3 bucket as optional input to RAG module
saikatak committed May 22, 2024
1 parent 87c7b65 commit 301206e
Showing 5 changed files with 213 additions and 6 deletions.
10 changes: 5 additions & 5 deletions README.md
@@ -42,11 +42,11 @@ See deployment steps in the [Deployment Guide](DEPLOYMENT.md).

### FMOps Modules

| Type | Description |
|-----------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------|
| [SageMaker JumpStart Foundation Model Endpoint Module](modules/fmops/sagemaker-jumpstart-fm-endpoint/README.md) | Creates an endpoint for a SageMaker JumpStart Foundation Model. |
| [SageMaker Hugging Face Foundation Model Endpoint Module](modules/fmops/sagemaker-hugging-face-endpoint/README.md) | Creates an endpoint for a SageMaker Hugging Face Foundation Model. |
| [AppSync Question and Answering RAG Model Endpoint Module](modules/fmops/qna-rag/README.md) | Creates an Graphql endpoint for a Question and answering RAG model using custom input knowledge base
| Type | Description |
|--------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------|
| [SageMaker JumpStart Foundation Model Endpoint Module](modules/fmops/sagemaker-jumpstart-fm-endpoint/README.md) | Creates an endpoint for a SageMaker JumpStart Foundation Model. |
| [SageMaker Hugging Face Foundation Model Endpoint Module](modules/fmops/sagemaker-hugging-face-endpoint/README.md) | Creates an endpoint for a SageMaker Hugging Face Foundation Model. |
| [AppSync Knowledge Base Ingestion and Question and Answering RAG Module](modules/fmops/qna-rag/README.md)           | Creates a GraphQL endpoint to ingest data and use the ingested content as the knowledge base for a Question and Answering model using RAG. |


### MWAA Modules
195 changes: 195 additions & 0 deletions modules/fmops/qna-rag/README.md
@@ -24,6 +24,10 @@ Question and Answering using RAG Architecture
- `os-security-group-id` - Security group of open search cluster
- `vpc-id` - VPC id

#### Optional

- `input-asset-bucket` - Input asset bucket that is used to store input documents

### Module Metadata Outputs

- `IngestionGraphqlApiId` - Ingestion Graphql API ID.
@@ -64,3 +68,194 @@ parameters:
key: VpcId

```
After deploying the Seedfarmer stack, upload the file to be ingested into the input S3 bucket. If no input S3 bucket is provided in the manifest, the construct creates a bucket named 'input-assets-bucket-dev-<AWSAccountNumber>'.
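As a minimal sketch of this upload step (not part of the module; assumes `boto3` is installed and AWS credentials are configured, and uses an example account id):

```python
def default_bucket_name(account_id: str) -> str:
    # Reproduces the naming pattern used when no input bucket is supplied
    # in the manifest: input-assets-bucket-dev-<AWSAccountNumber>
    return f"input-assets-bucket-dev-{account_id}"

def upload_document(bucket_name: str, local_path: str) -> None:
    import boto3  # imported here so the helper above stays dependency-free
    s3 = boto3.client("s3")
    # Store the object under its bare file name, e.g. "a.pdf"
    key = local_path.rsplit("/", 1)[-1]
    s3.upload_file(local_path, bucket_name, key)
```

For example, `upload_document(default_bucket_name("111111111111"), "./a.pdf")` would place `a.pdf` in the construct-created bucket.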

The ingestion workflow can be invoked using GraphQL APIs. Invoke the subscription call first, followed by the mutation call.

The code below provides an example of a mutation call and the associated subscription used to trigger an ingestion pipeline run and receive status notifications:

Subscription call to get notifications about the ingestion process:

```
subscription MySubscription {
updateIngestionJobStatus(ingestionjobid: "123") {
files {
name
status
imageurl
}
}
}
_________________________________________________
Expected response:
{
"data": {
"updateIngestionJobStatus": {
"files": [
{
"name": "a.pdf",
"status": "succeed",
"imageurl":"s3presignedurl"
}
]
}
}
}
```
Where:
- ingestionjobid: id which can be used to filter subscriptions on the client side; the subscription will report the status and name of each file
- files.status: status update of the ingestion for the file specified
- files.name: name of the file stored in the input S3 bucket

Mutation call to trigger the ingestion process:

```
mutation MyMutation {
ingestDocuments(ingestioninput: {
embeddings_model:
{
provider: Bedrock,
modelId: "amazon.titan-embed-text-v1",
streaming: true
},
files: [{status: "", name: "a.pdf"}],
ingestionjobid: "123",
ignore_existing: true}) {
files {
imageurl
status
}
ingestionjobid
}
}
_________________________________________________
Expected response:
{
"data": {
"ingestDocuments": {
"ingestionjobid": null
}
}
}
```
Where:
- files.status: this field will be used by the subscription to update the status of the ingestion for the file specified
- files.name: name of the file stored in the input S3 bucket
- ingestionjobid: id which can be used to filter subscriptions on client side
- embeddings_model: the model provider and model id to use, chosen based on the modality of the input (text or image)
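As a hedged illustration (none of these client helpers ship with the module), the mutation above can be posted to the Ingestion GraphQL endpoint from Python with only the standard library. The endpoint URL and the Cognito-derived auth token passed to `post_mutation` are placeholders you must supply:

```python
import json
import urllib.request

# Mirrors the example mutation above; adjust files and job id as needed.
INGEST_MUTATION = """
mutation MyMutation {
  ingestDocuments(ingestioninput: {
    embeddings_model: {provider: Bedrock, modelId: "amazon.titan-embed-text-v1", streaming: true},
    files: [{status: "", name: "a.pdf"}],
    ingestionjobid: "123",
    ignore_existing: true}) {
    files { imageurl status }
    ingestionjobid
  }
}
"""

def graphql_payload(query: str) -> bytes:
    """Serialize a GraphQL query into the JSON body AppSync expects."""
    return json.dumps({"query": query}).encode("utf-8")

def post_mutation(endpoint: str, auth_token: str, query: str) -> dict:
    """POST the mutation; the Authorization header shape is an assumption."""
    req = urllib.request.Request(
        endpoint,
        data=graphql_payload(query),
        headers={"Content-Type": "application/json", "Authorization": auth_token},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

In practice you would watch the matching subscription (same `ingestionjobid`) for per-file status updates while this mutation runs.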



After ingesting the input files, the QA process can be invoked using GraphQL APIs. Invoke the subscription call first, followed by the mutation call.

The code below provides an example of a mutation call and the associated subscription used to ask a question and receive response notifications. The subscription call waits for the mutation request before sending notifications.

Subscription call to get notifications about the question answering process:

```
subscription MySubscription {
updateQAJobStatus(jobid: "123") {
sources
question
answer
jobstatus
}
}
____________________________________________________________________
Expected response:
{
"data": {
"updateQAJobStatus": {
"sources": [
""
],
"question": "<base 64 encoded question>",
"answer": "<base 64 encoded answer>",
"jobstatus": "Succeed"
}
}
}
```

Where:

- jobid: id which can be used to filter subscriptions on client side
- answer: response to the question from the large language model as a base64 encoded string
- sources: sources from the knowledge base used as context to answer the question
- jobstatus: status update of the question answering process for the file specified

Mutation call to trigger the question:

```
mutation MyMutation {
postQuestion(
embeddings_model:
{
modality: "Text",
modelId: "amazon.titan-embed-text-v1",
provider: Bedrock,
streaming: false
},
filename: "<file_name>",
jobid: "123",
jobstatus: "",
qa_model:
{
provider: Bedrock,
modality: "Text",
modelId: "anthropic.claude-v2:1",
streaming: false,
model_kwargs: "{\"temperature\":0.5,\"top_p\":0.9,\"max_tokens_to_sample\":250}"
},
question:"<base 64 encoded question>",
responseGenerationMethod: RAG,
retrieval:{
max_docs:10
},
verbose:false
) {
jobid
question
verbose
filename
answer
jobstatus
responseGenerationMethod
}
}
____________________________________________________________________
Expected response:
{
"data": {
"postQuestion": {
"jobid": null,
"question": null,
"verbose": null,
"filename": null,
"answer": null,
"jobstatus": null,
"responseGenerationMethod": null
}
}
}
```

Where:

- jobid: id which can be used to filter subscriptions on client side
- jobstatus: this field will be used by the subscription to update the status of the question answering process for the file specified
- qa_model.modality/embeddings_model.modality: applicable values are Text or Image
- qa_model.modelId/embeddings_model.modelId: model used to process the question, for example anthropic.claude-v2:1 or anthropic.claude-3-sonnet-20240229-v1:0
- retrieval.max_docs: maximum number of documents (chunks) retrieved from the knowledge base when the Retrieval Augmented Generation (RAG) approach is used
- question: question to ask as a base64 encoded string
- verbose: boolean indicating whether [LangChain chain call verbosity](https://python.langchain.com/docs/guides/debugging#chain-verbosetrue) should be enabled
- streaming: boolean indicating whether the streaming capability of Bedrock is used. If true, tokens are sent back to the subscriber as they are generated; if false, the entire response is sent back once generated.
- filename: optional. Name of the file stored in the input S3 bucket, in txt format.
- responseGenerationMethod: optional. Method used to generate the response; either RAG or LONG_CONTEXT. Defaults to LONG_CONTEXT if not provided.
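Since both the `question` argument and the `answer` field are base64 encoded strings, a small pair of helpers (a sketch, not part of the module) covers the round trip:

```python
import base64

def encode_question(question: str) -> str:
    """Encode a plain-text question for the `question` mutation argument."""
    return base64.b64encode(question.encode("utf-8")).decode("ascii")

def decode_answer(answer_b64: str) -> str:
    """Decode the base64 `answer` field from the subscription payload."""
    return base64.b64decode(answer_b64).decode("utf-8")
```

For example, `encode_question("What is RAG?")` produces the string to place in the `question` argument of `postQuestion`, and `decode_answer` recovers the model's answer from the `updateQAJobStatus` notification.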
3 changes: 2 additions & 1 deletion modules/fmops/qna-rag/app.py
@@ -20,7 +20,7 @@ def _param(name: str) -> str:
cognito_pool_id = os.getenv(_param("COGNITO_POOL_ID"))
os_domain_endpoint = os.getenv(_param("OS_DOMAIN_ENDPOINT"))
os_security_group_id = os.getenv(_param("OS_SECURITY_GROUP_ID"))

input_asset_bucket_name = os.getenv(_param("INPUT_ASSET_BUCKET"))

if not vpc_id:
raise ValueError("Missing input parameter vpc-id")
@@ -44,6 +44,7 @@ def _param(name: str) -> str:
os_domain_endpoint=os_domain_endpoint,
os_security_group_id=os_security_group_id,
os_index_name="rag-index",
input_asset_bucket_name=input_asset_bucket_name,
env=aws_cdk.Environment(
account=os.environ["CDK_DEFAULT_ACCOUNT"],
region=os.environ["CDK_DEFAULT_REGION"],
10 changes: 10 additions & 0 deletions modules/fmops/qna-rag/stack.py
@@ -6,6 +6,7 @@
from cdklabs.generative_ai_cdk_constructs import QaAppsyncOpensearch
from aws_cdk import Stack
from aws_cdk import aws_ec2 as ec2
from aws_cdk import aws_s3 as s3
from aws_cdk import (
aws_opensearchservice as os,
aws_cognito as cognito,
@@ -22,6 +23,7 @@ def __init__(
os_domain_endpoint: str,
os_security_group_id: str,
os_index_name: str,
input_asset_bucket_name: str,
**kwargs,
) -> None:
super().__init__(
@@ -52,6 +54,13 @@ def __init__(
"myuserpool",
user_pool_id=cognito_pool_id,
)

if input_asset_bucket_name:
input_asset_bucket = s3.Bucket.from_bucket_name(
self, "input-assets-bucket", input_asset_bucket_name
)
else:
input_asset_bucket = None
# 1. Create Ingestion pipeline
rag_ingest_resource = RagAppsyncStepfnOpensearch(
self,
@@ -60,6 +69,7 @@
existing_opensearch_domain=os_domain,
open_search_index_name=os_index_name,
cognito_user_pool=user_pool_loaded,
existing_input_assets_bucket_obj=input_asset_bucket,
)

self.security_group_id = rag_ingest_resource.security_group.security_group_id
1 change: 1 addition & 0 deletions modules/fmops/qna-rag/tests/test_stack.py
@@ -40,6 +40,7 @@ def stack_model_package_input() -> cdk.Stack:
os_domain_endpoint=os_domain_endpoint,
os_security_group_id=os_security_group_id,
os_index_name="sample",
input_asset_bucket_name="input-bucket",
env=cdk.Environment(
account="111111111111",
region="us-east-1",
