S3 Event Processing with AWS CDK

This project implements an event-driven architecture with S3, EventBridge, and Lambda, defined and deployed using AWS CDK.

Architecture Overview

The project sets up:

  • An S3 bucket with specific prefixes (samples/, documents/, samples-output/, documents-output/)
  • EventBridge rules to monitor S3 events
  • Three Lambda functions for processing files:
    • Samples Processor: Processes files in samples/
    • Documents Processor: Processes files in documents/
    • Post Processor: Processes files in output directories
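The prefix-to-processor mapping above can be sketched in plain Python. In the deployed stack this routing is done by EventBridge rules rather than by application code, so the `ROUTES` table and `route_key` function below are hypothetical names used only to illustrate which prefix leads to which processor.

```python
from typing import Optional

# Hypothetical routing table (illustrative names): key prefix -> processor label.
ROUTES = {
    "samples/": "samples_processor",
    "documents/": "documents_processor",
    "samples-output/": "post_processor",
    "documents-output/": "post_processor",
}


def route_key(key: str) -> Optional[str]:
    """Return the processor label for an object key, or None if no prefix matches."""
    for prefix, processor in ROUTES.items():
        if key.startswith(prefix):
            return processor
    return None
```

Because every prefix ends with a slash, a key such as `samples-output/result.json` does not match the `samples/` prefix and is routed only to the post processor.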

Prerequisites

  • Python 3.9 or higher
  • AWS CLI configured with appropriate credentials
  • Node.js and npm (for AWS CDK CLI)
  • AWS CDK CLI installed (npm install -g aws-cdk)

Project Structure

deployment/
├── app.py                                   # Main CDK app entry point
├── cdk.json                                 # CDK configuration
├── cdk.context.json                         # CDK context
├── requirements.txt                         # Python dependencies
├── docs/
│   ├── a_lending_01_deployment.md
│   ├── a_lending_02_setup_blueprints.md
│   └── a_lending_03_run_flow.md
├── stacks/
│   └── lending_flow_stack.py                # Main stack
└── lambda/                                  # Lambda function code
    └── lending_flow/
        ├── samples_processor/
        │   └── index.py
        ├── documents_processor/
        │   └── index.py
        ├── documents_post_processor/
        │   └── index.py
        └── samples_post_processor/
            └── index.py

Setup Instructions

  1. Create and activate a virtual environment:
cd guidance-for-multimodal-data-processing-using-amazon-bedrock-data-automation/deployment
python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate
  2. Install the required dependencies:
pip install --upgrade pip
pip install -r requirements.txt
  3. Go to the layer directory and install the Lambda layer dependencies into the python subdirectory:
cd lambda/lending_flow/layer/
pip install -r requirements.txt --target python
cd ../../..
  4. Bootstrap AWS CDK (first time only):
cdk bootstrap

For more details, read the AWS CDK Bootstrap Instructions.

  5. Deploy the lending flow stack:
cdk deploy lending-flow --require-approval never --context data_project_name=my-lending-project

General CDK commands

cdk synth   # Synthesize CloudFormation template
cdk diff    # Review changes
cdk deploy  # Deploy stack

Lambda Functions

The stack deploys three Lambda functions, as described below.

Samples Processor

  • Triggered by files uploaded to samples/ prefix
  • Processes sample files
  • Outputs results to samples-output/ prefix

Documents Processor

  • Triggered by files uploaded to documents/ prefix
  • Processes document files
  • Outputs results to documents-output/ prefix

Post Processor

  • Triggered by files created in either output prefix
  • Performs final processing on output files
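The trigger-and-output flow described above can be illustrated with a minimal sketch. It assumes the functions receive the standard EventBridge "Object Created" event from S3, where the bucket and key sit under `detail`. The helpers `parse_s3_event` and `output_key`, and the scheme of keeping the same relative path under the output prefix, are assumptions for illustration and may differ from the actual `index.py` code.

```python
from typing import Tuple


def parse_s3_event(event: dict) -> Tuple[str, str]:
    """Extract (bucket, key) from an EventBridge "Object Created" event sent by S3."""
    detail = event["detail"]
    return detail["bucket"]["name"], detail["object"]["key"]


def output_key(input_key: str, input_prefix: str, output_prefix: str) -> str:
    """Map a key under input_prefix to the same relative path under output_prefix.

    This naming scheme is an assumption, not confirmed by the project.
    """
    if not input_key.startswith(input_prefix):
        raise ValueError(f"{input_key!r} is not under {input_prefix!r}")
    return output_prefix + input_key[len(input_prefix):]


# Trimmed example event; real events carry more fields (region, time, etag, ...).
sample_event = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {
        "bucket": {"name": "my-bucket"},
        "object": {"key": "documents/loan/app.pdf"},
    },
}

bucket, key = parse_s3_event(sample_event)
result = output_key(key, "documents/", "documents-output/")
```

Keeping the event parsing separate from the key mapping makes both pieces easy to unit test without any AWS access.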

Deployment Validation

  • Open the CloudFormation console and verify that the stack whose name starts with lending-flow shows the status CREATE_COMPLETE (or UPDATE_COMPLETE after a redeployment).

Useful Commands

# CDK Commands
cdk ls          # List all stacks
cdk synth       # Synthesize CloudFormation template
cdk deploy      # Deploy stack
cdk diff        # Compare deployed stack with current state
cdk destroy     # Remove stack



Clean Up

To remove all resources:

cdk destroy

Environment Variables

Each Lambda function uses the following environment variables:

# Samples/Documents Processor
BUCKET_NAME     # S3 bucket name
OUTPUT_PREFIX   # Output directory prefix

# Post Processor
BUCKET_NAME     # S3 bucket name
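One way a function could read these variables is sketched below. Only the variable names `BUCKET_NAME` and `OUTPUT_PREFIX` come from this README; the `ProcessorConfig` class and `load_config` helper are hypothetical. Failing fast on a missing variable tends to produce a clearer CloudWatch log entry than a later `KeyError` deep in the processing code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ProcessorConfig:
    bucket_name: str
    output_prefix: Optional[str]  # None for the Post Processor


def load_config(
    env: Mapping[str, str] = os.environ,
    require_output_prefix: bool = True,
) -> ProcessorConfig:
    """Read the function's settings from the environment, failing fast if any are missing."""
    required = ["BUCKET_NAME"]
    if require_output_prefix:
        required.append("OUTPUT_PREFIX")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    output_prefix = env["OUTPUT_PREFIX"] if require_output_prefix else None
    return ProcessorConfig(bucket_name=env["BUCKET_NAME"], output_prefix=output_prefix)
```

A Samples or Documents Processor would call `load_config()` as is, while the Post Processor would call `load_config(require_output_prefix=False)`.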

Security

  • Lambda functions use least-privilege permissions
  • S3 bucket is configured with appropriate access policies
  • EventBridge rules are scoped to specific prefixes
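As an illustration of prefix scoping, the sketch below builds an EventBridge event pattern that matches only objects created under one prefix of one bucket. The pattern structure follows the standard EventBridge format for S3 events. The function name `s3_created_pattern` and the bucket name are placeholders, and the actual rules in `lending_flow_stack.py` may be defined differently.

```python
import json


def s3_created_pattern(bucket_name: str, prefix: str) -> dict:
    """Build an EventBridge event pattern for objects created under a single prefix."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {
            "bucket": {"name": [bucket_name]},
            # Prefix matching restricts the rule to keys that start with `prefix`.
            "object": {"key": [{"prefix": prefix}]},
        },
    }


pattern = s3_created_pattern("my-bucket", "samples/")
print(json.dumps(pattern, indent=2))
```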

Troubleshooting

  1. Deployment Issues:

    • Verify AWS credentials are configured
    • Ensure CDK is bootstrapped in your account/region
    • Check CloudFormation console for detailed error messages
  2. Runtime Issues:

    • Check CloudWatch Logs for Lambda function errors
    • Verify that EventBridge notifications are enabled on the S3 bucket
    • Ensure IAM permissions are correct
  3. Common Errors:

    • "Resource not found": Ensure resources exist and permissions are correct
    • "Access denied": Check IAM roles and policies
    • "Invalid handler": Verify Lambda function handler names
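For the "Invalid handler" case, a quick local check can confirm that a handler string resolves to a function in the code directory. The sketch below assumes Python handlers written as `module.function`. The value `index.handler` is a guess based on the `index.py` files in the project structure, and `handler_exists` is a hypothetical helper, not part of this project.

```python
import ast
import tempfile
from pathlib import Path


def handler_exists(code_dir: str, handler: str) -> bool:
    """Check that a handler string like "index.handler" names a top-level
    function in a module inside code_dir."""
    module_name, _, function_name = handler.rpartition(".")
    if not module_name or not function_name:
        return False
    module_path = Path(code_dir) / (module_name.replace(".", "/") + ".py")
    if not module_path.is_file():
        return False
    # Parse instead of importing so the check has no side effects.
    tree = ast.parse(module_path.read_text())
    return any(
        isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        and node.name == function_name
        for node in tree.body
    )


# Demo against a throwaway directory that mimics one function folder.
with tempfile.TemporaryDirectory() as code_dir:
    Path(code_dir, "index.py").write_text("def handler(event, context):\n    return None\n")
    found = handler_exists(code_dir, "index.handler")
    wrong_function = handler_exists(code_dir, "index.lambda_handler")
    wrong_module = handler_exists(code_dir, "app.handler")
```

In practice the function would be pointed at a real folder such as `lambda/lending_flow/samples_processor` together with the handler string configured in the stack.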

Development

To modify the stack:

  1. Update lending_flow_stack.py
  2. Update Lambda function code in respective directories
  3. Run cdk diff to review changes
  4. Deploy with cdk deploy

Contributing

  1. Create a new branch for features
  2. Update documentation as needed
  3. Test changes thoroughly
  4. Submit pull request

Useful Links