feat: Add Batch Processing Utility #337
…s-dotnet into develop
Thanks a lot for your first contribution! Please check out our contributing guidelines and don't hesitate to ask whatever you need.
I'm still finishing off a few things, but wanted to get the draft PR going. My goal is to get the feature ready for review during the weekend.
…ttribute' enabling async execution of the batch processing logic.
…H_PROCESSING_MAX_PARALLELISM).
…ables, via the batch processing attribute and via dynamic configuration passed at runtime.
…ing results. We are now down to zero warnings.
The PR is now more or less ready to be reviewed :) Will sync with @hjgraca before submitting.
…a-dotnet into feature/batch-processing
The results were not being cleared, new instance created when processing
Add helper for xunit order tests
I have committed all changes into the PR.
Thanks for reviewing @amirkaws
…a-dotnet into feature/batch-processing
Kudos, SonarCloud Quality Gate passed! 0 Bugs. No Coverage information.
Awesome work, congrats on your first merged pull request and thank you for helping improve everyone's experience!
Issue number: #168
Summary
Changes
This PR adds the Batch Processing Utility to Powertools for AWS Lambda (.NET).
With this, we add straightforward support for the AWS Lambda function response type ReportBatchItemFailures, where partial batch item failures can be reported, thereby reducing the number of items that are re-processed. The utility automatically monitors the processing of each item within a batch and reports which items failed to be processed. This enables developers to focus on writing business logic while the Batch Processing Utility automatically reports partial failures within a batch.
User experience
Add the NuGet package:
Consuming batches from SQS
Function handler using attribute-based configuration:
Implementation of CustomSqsRecordHandler:
With this, the framework automatically creates an instance of a batch processor and uses the provided record handler for the per-record processing logic. All logic around catching exceptions and keeping track of partial batch item failures is handled automatically by the framework. On top of that, there is FIFO-specific logic around handling partial failures - from the docs:
... and this is also handled automatically by the framework.
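To make the bookkeeping described above concrete, here is a minimal, self-contained sketch of per-record exception handling and failure tracking. It is not the utility's implementation: QueueMessage, PartialBatchSketch, and the stopOnFirstFailure flag are hypothetical names used only for illustration. The only part taken from AWS is the response shape (batchItemFailures with itemIdentifier entries) that Lambda expects for partial batch responses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical stand-in for a queue message; not the AWS SDK or Powertools type.
public sealed record QueueMessage(string MessageId, string Body);

public static class PartialBatchSketch
{
    // Runs the handler once per message and returns the IDs of messages whose handler threw.
    // With stopOnFirstFailure (FIFO-style), the failing message and every message after it
    // are reported as failed so that ordering is preserved on retry.
    public static IReadOnlyList<string> Process(
        IReadOnlyList<QueueMessage> messages,
        Action<QueueMessage> handler,
        bool stopOnFirstFailure = false)
    {
        var failedIds = new List<string>();
        for (var i = 0; i < messages.Count; i++)
        {
            try
            {
                handler(messages[i]);
            }
            catch (Exception)
            {
                failedIds.Add(messages[i].MessageId);
                if (stopOnFirstFailure)
                {
                    // Mark all remaining messages as failed and stop processing.
                    failedIds.AddRange(messages.Skip(i + 1).Select(m => m.MessageId));
                    break;
                }
            }
        }
        return failedIds;
    }

    // Serializes the failures in the partial batch response shape used by Lambda.
    public static string ToResponseJson(IEnumerable<string> failedIds)
    {
        var response = new
        {
            batchItemFailures = failedIds.Select(id => new { itemIdentifier = id }).ToArray()
        };
        return JsonSerializer.Serialize(response);
    }
}
```

Calling Process with a handler that throws for one message returns only that message's ID. With stopOnFirstFailure set to true, it also returns the IDs of all later messages.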
Consuming batches from DynamoDB Streams
Consuming batches from Kinesis Data Streams
Consuming batches using the utility
Consuming batches using processor and handler from IoC
Configuration
The Batch Processing Utility supports the following configuration:
MaxDegreeOfParallelism
This is used to control the parallelism of the batch item processing. With a value of 1, the processing is done sequentially (default). Sequential processing is recommended when preserving order is important, e.g. with SQS FIFO queues. With a value > 1, the processing is done in parallel. Parallel processing can complete faster, e.g. when processing makes downstream service calls. With a value of -1, the parallelism is automatically configured to be the vCPU count of the Lambda function. Internally, the Batch Processing Utility uses the Parallel.ForEachAsync method and the ParallelOptions.MaxDegreeOfParallelism property to enable this functionality.
ErrorHandlingPolicy
This is used to control the error handling policy of the batch item processing. With a value of DeriveFromEvent (default), the specific BatchProcessor determines the policy based on the incoming event. For example, the SqsBatchProcessor looks at the EventSourceArn to determine whether the ErrorHandlingPolicy should be StopOnFirstBatchItemFailure (for FIFO queues) or ContinueOnBatchItemFailure (for standard queues). For StopOnFirstBatchItemFailure, the batch processor stops processing and marks any remaining records as batch item failures. For ContinueOnBatchItemFailure, the batch processor continues processing batch items regardless of item failures.
The configuration items can be set in different ways:
POWERTOOLS_ environment variables.
The BatchProcessor attribute.
A ProcessingOptions object passed to BatchProcessor.ProcessAsync().
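The two settings described above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the utility's code: ErrorPolicy and ConfigSketch are hypothetical names, deriving the policy from a ".fifo" suffix on the ARN is an assumption based on the SQS naming rule for FIFO queues, and mapping -1 to Environment.ProcessorCount is one plausible reading of "vCPU count". Parallel.ForEachAsync and ParallelOptions.MaxDegreeOfParallelism are the .NET APIs named in the description.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical enum mirroring the policy names in the description.
public enum ErrorPolicy { DeriveFromEvent, ContinueOnBatchItemFailure, StopOnFirstBatchItemFailure }

public static class ConfigSketch
{
    // Assumption: a FIFO queue is recognised by the ".fifo" suffix of its ARN.
    public static ErrorPolicy ResolvePolicy(ErrorPolicy configured, string eventSourceArn) =>
        configured != ErrorPolicy.DeriveFromEvent
            ? configured
            : eventSourceArn.EndsWith(".fifo", StringComparison.Ordinal)
                ? ErrorPolicy.StopOnFirstBatchItemFailure
                : ErrorPolicy.ContinueOnBatchItemFailure;

    // Assumption: -1 maps to the number of available processors; other values are clamped to at least 1.
    public static int ResolveParallelism(int configured) =>
        configured == -1 ? Environment.ProcessorCount : Math.Max(1, configured);

    // Processes record IDs with bounded parallelism and collects the IDs whose handler threw.
    public static async Task<IReadOnlyCollection<string>> ProcessAsync(
        IEnumerable<string> recordIds,
        Func<string, CancellationToken, ValueTask> handler,
        int maxDegreeOfParallelism)
    {
        var failed = new ConcurrentBag<string>(); // thread-safe, since handlers may run concurrently
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = ResolveParallelism(maxDegreeOfParallelism)
        };

        await Parallel.ForEachAsync(recordIds, options, async (id, ct) =>
        {
            try
            {
                await handler(id, ct);
            }
            catch (Exception)
            {
                failed.Add(id); // continue with the other records regardless of this failure
            }
        });

        return failed;
    }
}
```

The parallel path above only models ContinueOnBatchItemFailure. A stop-on-first-failure policy implies ordered, sequential processing, which matches the recommendation to keep MaxDegreeOfParallelism at 1 for FIFO queues.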
Checklist
Please leave checklist items unchecked if they do not apply to your change.
Is this a breaking change? No
RFC issue number: #168
Checklist:
Acknowledgment
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.