Pipeline DLQ #3857
If I understand correctly, right now only sink failures are sent to the DLQ. That might explain a bug I am seeing in my pipeline.
@jw-amazon , Yes, only sink failures are sent to the DLQ currently.
My proposed approach to this is different: we can define the DLQ as an extension that any pipeline plugin can integrate with:
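A hypothetical sketch of that extension configuration (the `dlq` extension key and its settings here are illustrative, not existing Data Prepper configuration):

```yaml
# data-prepper-config.yaml (hypothetical sketch)
extensions:
  dlq:
    s3:
      bucket: my-dlq-bucket
      key_path_prefix: dlq-files/
```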
In the pipeline definition, users would no longer need to explicitly specify DLQ details within any plugin (except for enabling or disabling the DLQ). The plugin integration with the DLQ extension will be taken care of in the code logic. Alternatively, we can move the DLQ out of the extensions and make it a standalone global config in data-prepper-config.yaml:
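A sketch of that alternative, assuming a hypothetical top-level `dlq` key in data-prepper-config.yaml and a hypothetical `enable_dlq` flag in the plugin:

```yaml
# data-prepper-config.yaml (hypothetical standalone global config)
dlq:
  s3:
    bucket: my-dlq-bucket
    key_path_prefix: dlq-files/
```

```yaml
# pipelines.yaml - the plugin only toggles the DLQ (hypothetical flag)
sink:
  - opensearch:
      hosts: ["https://opensearch:9200"]
      enable_dlq: true
```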
@chenqi0805 , The proposal in this issue creates a specific pipeline for handling failed events. This allows for processing the data before putting it in the DLQ. I think there could be some overlap between your idea of extensions and the pipeline DLQ. For one, this proposal has a default failure pipeline. Perhaps that could be configurable via an extension.
In terms of implementation, I propose that we create a new interface for writing failed events.
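A minimal sketch of what such an interface might look like (the names `FailedEventsWriter` and `FailedEvent` are illustrative, not the actual proposal's code):

```java
import java.util.Collection;

// Hypothetical sketch: a writer that any source, processor, or sink
// could use to hand off failed events to the pipeline-level DLQ.
public interface FailedEventsWriter {
    // Writes a batch of failed events to the failure pipeline / DLQ.
    void writeFailedEvents(Collection<FailedEvent> failedEvents);
}

// Hypothetical carrier for a failed event plus additional failure context.
record FailedEvent(Object event, String pluginName, String failureReason) { }
```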
We may also want some way to include additional information about the failures. Sinks or processors would then call this interface to write failed events. For example:
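A hypothetical call site, building on the sketch above (the sink name and `deliver` method are illustrative):

```java
import java.util.List;

// Hypothetical sink fragment: a delivery failure is handed to the
// pipeline-level DLQ instead of being dropped.
public class ExampleSink {
    private final FailedEventsWriter failedEventsWriter;

    public ExampleSink(final FailedEventsWriter failedEventsWriter) {
        this.failedEventsWriter = failedEventsWriter;
    }

    void output(final Object event) {
        try {
            deliver(event); // a real sink would write to its destination here
        } catch (final Exception e) {
            failedEventsWriter.writeFailedEvents(
                    List.of(new FailedEvent(event, "example-sink", e.getMessage())));
        }
    }

    private void deliver(final Object event) throws Exception {
        // placeholder for the actual delivery logic
    }
}
```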
Within Data Prepper core, failed events written through this interface can then be routed to the failure pipeline. When creating the pipeline, core can supply the writer implementation to each plugin.
Is your feature request related to a problem? Please describe.
Provide a way to send all failed events to a global/pipeline-level DLQ. Failed events anywhere in the pipeline (sources, processors, and sinks) are sent directly to this DLQ. This will eventually replace the sink-level DLQs we have today.
Describe the solution you'd like
Preferred solution (based on @dlvenable's initial thoughts and a discussion meeting)
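A hypothetical sketch of such a failure pipeline in pipelines.yaml (the pipeline name and plugin choices are illustrative; presumably it needs no regular source, since Data Prepper core would feed it failed events):

```yaml
failure-pipeline:
  processor:
    - add_entries:
        entries:
          - key: dlq_tag
            value: failed
  sink:
    - stdout:
```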
Each sub-pipeline in the YAML may then have an entry pointing to this failure pipeline, as follows:
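For example, with a hypothetical `failure_pipeline` key:

```yaml
my-log-pipeline:
  failure_pipeline: failure-pipeline   # hypothetical key
  source:
    http:
  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        index: logs
```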
In addition, there may be an option to have a default failure pipeline, which is used when no failure pipeline is specified in a sub-pipeline, for example:
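One way to express that default, as a hypothetical entry in data-prepper-config.yaml:

```yaml
# data-prepper-config.yaml (hypothetical key)
default_failure_pipeline: failure-pipeline
```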
And finally, an implicit failure pipeline which is created without any entries in the YAML file. The implicit failure pipeline will send all failed events to `stdout`.
This requires changes to code in many places, so it is better to introduce a new API (for example, an `executeWithFailures()` API in processors which returns both output records and failed records; see the sketch below). Data Prepper core code can then take the failed records and send them to the appropriate failure pipeline (the configured failure pipeline, the default failure pipeline, or the implicit failure pipeline). Similar new APIs may be added at the source and sink level. Once the API is added, code may be migrated gradually so that all sources/sinks/processors use the new API. Having a separate pipeline for failures allows the same failure pipeline to be used by multiple pipelines, and also makes it possible to write sub-pipelines under it, do conditional routing, etc.
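A minimal sketch of what that processor-level API could look like (simplified: the real Data Prepper processor contract operates on `Record` objects; the `ProcessorResult` holder and interface name here are illustrative, not a settled design):

```java
import java.util.Collection;
import java.util.List;

// Hypothetical result holder: output records plus records that failed
// processing, so core can route the failures to the failure pipeline.
record ProcessorResult<T>(Collection<T> outputRecords, Collection<T> failedRecords) { }

// Hypothetical extension of the processor contract.
interface ProcessorWithFailures<T> {
    Collection<T> execute(Collection<T> records);

    // Default implementation keeps existing processors working unchanged:
    // they report no failures until they adopt the new API.
    default ProcessorResult<T> executeWithFailures(final Collection<T> records) {
        return new ProcessorResult<>(execute(records), List.of());
    }
}
```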
Describe alternatives you've considered (Optional)
Instead of a new API in processors/sources/sinks, we could have a global singleton for DLQ events, managed by a DLQEventsManager which each source uses in its constructor; failed events are handed over to this DLQEventsManager, which routes them to the failure pipeline. I think this approach is also OK.
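A minimal sketch of this alternative (all names and method shapes are illustrative):

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical singleton: plugins enqueue failed events, and Data Prepper
// core drains the queue and routes the events to the failure pipeline.
public final class DLQEventsManager {
    private static final DLQEventsManager INSTANCE = new DLQEventsManager();
    private final ConcurrentLinkedQueue<Object> failedEvents = new ConcurrentLinkedQueue<>();

    private DLQEventsManager() { }

    public static DLQEventsManager getInstance() {
        return INSTANCE;
    }

    // Called by sources/processors/sinks to hand over failed events.
    public void addFailedEvents(final Collection<Object> events) {
        failedEvents.addAll(events);
    }

    // Called by core to collect failures for the failure pipeline.
    public Collection<Object> drainFailedEvents() {
        final Collection<Object> drained = new ArrayList<>();
        Object event;
        while ((event = failedEvents.poll()) != null) {
            drained.add(event);
        }
        return drained;
    }
}
```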
Additional context
Add any other context or screenshots about the feature request here.