Pipeline DLQ #3857
If I understand correctly, right now only sink failures are sent to the DLQ. That might explain a bug I am seeing in my pipeline.
@jw-amazon , Yes, only sink failures are sent to the DLQ currently.
My proposed approach to this is different: we can define the DLQ as an extension that any pipeline plugin can integrate with:
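A hypothetical sketch of that extension configuration (the `dlq` extension key and its settings here are illustrative, not existing Data Prepper configuration):

```yaml
# data-prepper-config.yaml (hypothetical sketch)
extensions:
  dlq:
    s3:
      bucket: my-dlq-bucket
      key_path_prefix: dlq-files/
```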
In the pipeline definition, users would no longer need to explicitly specify DLQ details within any plugin (except for enabling or disabling the DLQ). The plugin integration with the DLQ extension will be taken care of in the code logic. Alternatively, we can move the DLQ out of the extensions and make it a standalone global config in data-prepper-config.yaml:
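A sketch of that alternative, assuming a hypothetical top-level `dlq` key in data-prepper-config.yaml and a hypothetical `enable_dlq` flag in the plugin:

```yaml
# data-prepper-config.yaml (hypothetical standalone global config)
dlq:
  s3:
    bucket: my-dlq-bucket
    key_path_prefix: dlq-files/
```

```yaml
# pipelines.yaml - the plugin only toggles the DLQ (hypothetical flag)
sink:
  - opensearch:
      hosts: ["https://opensearch:9200"]
      enable_dlq: true
```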
@chenqi0805 , The proposal in this issue creates a specific pipeline for handling failed events. This allows for processing the data before putting it in the DLQ. I think there could be some overlap between your idea of extensions and the pipeline DLQ. For one, this proposal has a default failure pipeline. Perhaps that could be configurable via an extension.
In terms of implementation, I propose that we create a new interface for writing failed events.
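A minimal sketch of what such an interface might look like (the names `FailedEventsWriter` and `FailedEvent` are illustrative, not the actual proposal's code):

```java
import java.util.Collection;

// Hypothetical sketch: a writer that any source, processor, or sink
// could use to hand off failed events to the pipeline-level DLQ.
public interface FailedEventsWriter {
    // Writes a batch of failed events to the failure pipeline / DLQ.
    void writeFailedEvents(Collection<FailedEvent> failedEvents);
}

// Hypothetical carrier for a failed event plus additional failure context.
record FailedEvent(Object event, String pluginName, String failureReason) { }
```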
We may also want some way to include additional information about the failures. Sinks or processors would then call this interface to write failed events. For example:
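A hypothetical call site, building on the sketch above (the sink name and `deliver` method are illustrative):

```java
import java.util.List;

// Hypothetical sink fragment: a delivery failure is handed to the
// pipeline-level DLQ instead of being dropped.
public class ExampleSink {
    private final FailedEventsWriter failedEventsWriter;

    public ExampleSink(final FailedEventsWriter failedEventsWriter) {
        this.failedEventsWriter = failedEventsWriter;
    }

    void output(final Object event) {
        try {
            deliver(event); // a real sink would write to its destination here
        } catch (final Exception e) {
            failedEventsWriter.writeFailedEvents(
                    List.of(new FailedEvent(event, "example-sink", e.getMessage())));
        }
    }

    private void deliver(final Object event) throws Exception {
        // placeholder for the actual delivery logic
    }
}
```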
Within Data Prepper core, failed events written through this interface can then be routed to the failure pipeline. When creating the pipeline, core can supply the writer implementation to each plugin.
Is your feature request related to a problem? Please describe.
Provide a way to send all failed events to a global/pipeline-level DLQ. Failed events anywhere in the pipeline (sources, processors, and sinks) are sent directly to this DLQ. This will eventually replace the sink-level DLQs we have today.
Describe the solution you'd like
Preferred solution (based on @dlvenable's initial thoughts and a discussion meeting)
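A hypothetical sketch of such a failure pipeline in pipelines.yaml (the pipeline name and plugin choices are illustrative; presumably it needs no regular source, since Data Prepper core would feed it failed events):

```yaml
failure-pipeline:
  processor:
    - add_entries:
        entries:
          - key: dlq_tag
            value: failed
  sink:
    - stdout:
```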
Each sub-pipeline in the YAML may then have an entry pointing to this failure pipeline, as follows:
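For example, with a hypothetical `failure_pipeline` key:

```yaml
my-log-pipeline:
  failure_pipeline: failure-pipeline   # hypothetical key
  source:
    http:
  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        index: logs
```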
In addition, there may be an option to have a default failure pipeline, which is used when no failure pipeline is specified in a sub-pipeline, for example:
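One way to express that default, as a hypothetical entry in data-prepper-config.yaml:

```yaml
# data-prepper-config.yaml (hypothetical key)
default_failure_pipeline: failure-pipeline
```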
And finally, an implicit failure pipeline which is created without any entries in the YAML file. The implicit failure pipeline will send all failed events to `stdout`.
This requires changes to code in many places, so it is better to introduce a new API (for example, an `executeWithFailures()` API in processors which returns both output records and failed records; see the sketch below). Data Prepper core code can then take the failed records and send them to the appropriate failure pipeline (the configured failure pipeline, the default failure pipeline, or the implicit failure pipeline). Similar new APIs may be added at the source and sink level. Once the API is added, code may be migrated gradually so that all sources/sinks/processors use the new API. Having a separate pipeline for failures allows the same failure pipeline to be used by multiple pipelines, and also makes it possible to write sub-pipelines under it, do conditional routing, etc.
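A minimal sketch of what that processor-level API could look like (simplified: the real Data Prepper processor contract operates on `Record` objects; the `ProcessorResult` holder and interface name here are illustrative, not a settled design):

```java
import java.util.Collection;
import java.util.List;

// Hypothetical result holder: output records plus records that failed
// processing, so core can route the failures to the failure pipeline.
record ProcessorResult<T>(Collection<T> outputRecords, Collection<T> failedRecords) { }

// Hypothetical extension of the processor contract.
interface ProcessorWithFailures<T> {
    Collection<T> execute(Collection<T> records);

    // Default implementation keeps existing processors working unchanged:
    // they report no failures until they adopt the new API.
    default ProcessorResult<T> executeWithFailures(final Collection<T> records) {
        return new ProcessorResult<>(execute(records), List.of());
    }
}
```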
Describe alternatives you've considered (Optional)
Instead of a new API in processors/sources/sinks, we could have a global singleton for DLQ events, managed by a DLQEventsManager which each source uses in its constructor; failed events are handed over to this DLQEventsManager, which routes them to the failure pipeline. I think this approach is also OK.
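A minimal sketch of this alternative (all names and method shapes are illustrative):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical singleton: plugins enqueue failed events, and Data Prepper
// core drains the queue and routes the events to the failure pipeline.
public final class DLQEventsManager {
    private static final DLQEventsManager INSTANCE = new DLQEventsManager();
    private final ConcurrentLinkedQueue<Object> failedEvents = new ConcurrentLinkedQueue<>();

    private DLQEventsManager() { }

    public static DLQEventsManager getInstance() {
        return INSTANCE;
    }

    // Called by sources/processors/sinks to hand over failed events.
    public void addFailedEvents(final Collection<Object> events) {
        failedEvents.addAll(events);
    }

    // Called by core to collect failures for the failure pipeline.
    public Collection<Object> drainFailedEvents() {
        final Collection<Object> drained = new ArrayList<>();
        Object event;
        while ((event = failedEvents.poll()) != null) {
            drained.add(event);
        }
        return drained;
    }
}
```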
Additional context
Add any other context or screenshots about the feature request here.