Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] Incomplete route set leads to duplicates when E2E ack is enabled. #3866

Closed
engechas opened this issue Dec 28, 2023 · 5 comments · Fixed by #3959
Closed

[BUG] Incomplete route set leads to duplicates when E2E ack is enabled. #3866

engechas opened this issue Dec 28, 2023 · 5 comments · Fixed by #3959
Assignees
Labels
bug Something isn't working
Milestone

Comments

@engechas
Copy link
Collaborator

Describe the bug
This is not technically a bug, but is easy to miss and leads to unexpected behavior. If a user has acknowledgements enabled for their source and routes enabled, but the routes do not cover all possibilities of the incoming data, a portion of the data will never end up in a sink or DLQ. This prevents the ack from being acknowledged and the sink permanently loops on a set of incoming data.

Example:

  • S3-SQS source pipeline with acks enabled
  • A single route is defined where '/myfield == "my value"'

If one of the S3 objects contains /myfield with a value of not my value, then it is not routed to a sink so the acknowledgement set cannot complete successfully. The ack timeout expires and the object re-enters the queue to be processed again.

To Reproduce
Steps to reproduce the behavior: Documented above

Expected behavior
There are a few ways this could be handled, I am not sure which is correct.

  1. Any records not matching a route end up in a DLQ
  2. Acknowledgements consider records with no route being dropped as intentional and still send a positive ack in this case
  3. Validation prevents a pipeline from being created if acks are enabled and the routes form an incomplete set of all possibilities
@dlvenable
Copy link
Member

Overall, I think the solution should be proposal 2 - unrouted events are acknowledged. If the user creates a pipeline-level DLQ, then we could route those events there (per proposal 1). The pipeline-level DLQ (#3857) could allow for combining both proposals in this way.

@kkondaka
Copy link
Collaborator

kkondaka commented Jan 2, 2024

@dlvenable you mean proposal 2 with events sent to DLQ, right? we do not want to drop events and send positive acknowledgements due to routing mis-configurations. Since acknowledgements are introduced to provide durability, silent dropping is not the right thing. We should send them to DLQ. So, I think this would be proposal 1, not proposal 2.

@dlvenable
Copy link
Member

@kkondaka , I'd like to split two concepts - durability and pipeline correctness. Durability relates to not losing data. Correctness relates to whether what the pipeline is defined to do is correct or not.

If we acknowledge an un-routed event, we have not lost durability. The configuration is stating that these events should not go to any sink. Now, there may be a pipeline correctness concern in such a pipeline. Did the customer mean to drop this data by not routing it? I think that is part of what @engechas ' proposal 3 is providing.

By using acknowledgements to verify correctness, we have a fairly large behavioral difference between acknowledgements on or off. We treat un-routed events just fine with acknowledgments off. I tend to think that we should keep acknowledgments focused only on durability. And not try to use them to solve correctness issues. I'd rather explore other solutions for pipeline correctness.

@kkondaka
Copy link
Collaborator

kkondaka commented Jan 2, 2024

@dlvenable I am fine with proposal 3. When acks are enabled, we should not silently drop any events to provide durability. What that means is that the config must explicitly drop events, the data-prepper should not drop events just because there is no route.

@dlvenable
Copy link
Member

We solved this with proposal 2. Un-routed events send acknowledgements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Development

Successfully merging a pull request may close this issue.

3 participants