[PubSub] Avoid processing pubsub messages whose ack deadline has already expired #3734
This stems from the same investigation work as #3633, but I believe it is a better solution to the problem. It also relates to this guidance: https://cloud.google.com/pubsub/docs/pull#dealing-with-large-backlogs-of-small-messages
I don't feel the current client makes a best-effort attempt at at-most-once processing. While that is not a service guarantee, the move to StreamingPull has made it common in our setup to see backlogs of messages that simply cannot be consumed because their acks are rejected.
I want to emphasise this: I do not mean simply that we get duplicates, I mean we get so many duplicates that the backlog never finishes. We had a 400k-message backlog open for over 48 hours while trying to hack a solution to this; it was not until we switched away from the default StreamingPull subscriber and used the raw pull methods that we were able to chew through the backlog in under an hour.
This change checks whether a message's ack deadline has already passed before handing the message off to the MessageHandler. We know the ack would be rejected, so there is no reason to knowingly double-process the message.
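The check described above can be sketched as a small helper. This is an illustrative standalone version, not the actual client internals; the class and method names (`AckDeadlineCheck`, `isExpired`) are hypothetical:

```java
import java.time.Instant;

public class AckDeadlineCheck {
    // Returns true if a message received at `receivedAt`, under an ack
    // deadline of `ackDeadlineSeconds`, has already expired at `now`.
    // A subscriber would skip (and let the service redeliver) such a
    // message instead of dispatching it to the MessageHandler.
    public static boolean isExpired(Instant receivedAt, int ackDeadlineSeconds, Instant now) {
        return now.isAfter(receivedAt.plusSeconds(ackDeadlineSeconds));
    }
}
```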
The most controversial part is that I made the subscriber fetch the subscription metadata so it has the default ack deadline at hand to take into account. Without this, setting
`setMessageDeadlineSeconds(Duration.ZERO)`
results in nothing being processed, because the code thinks every message is already expired. If pulling that metadata is a big issue, I think assuming the minimum deadline allowed (10s) as the default would also work, at the cost of nacking and re-pulling additional messages for subscriptions with a longer default deadline.
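The fallback suggested above could look something like this sketch. The names (`DeadlineFallback`, `effectiveAckDeadlineSeconds`) are hypothetical, and the rule is only what this paragraph describes: prefer an explicitly configured deadline, then the subscription's default from metadata if it was fetched, and otherwise assume Pub/Sub's minimum allowed deadline of 10 seconds:

```java
public class DeadlineFallback {
    // Pub/Sub's minimum allowed ack deadline; used only as a last resort.
    static final int MIN_ACK_DEADLINE_SECONDS = 10;

    // `configuredSeconds` is the deadline set on the subscriber builder
    // (zero meaning "not set"); `subscriptionDefaultSeconds` is the default
    // from the subscription metadata, or null if metadata was not fetched.
    public static int effectiveAckDeadlineSeconds(int configuredSeconds,
                                                  Integer subscriptionDefaultSeconds) {
        if (configuredSeconds > 0) {
            return configuredSeconds;
        }
        return subscriptionDefaultSeconds != null
                ? subscriptionDefaultSeconds
                : MIN_ACK_DEADLINE_SECONDS;
    }
}
```

Using the 10s floor instead of the real metadata is cheaper (no extra RPC) but, as noted, will nack and re-pull some messages that were actually still within a longer default deadline.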
With all this in place, I can effectively revert #3633 and allow the message to be picked up and handled in the standard flow, which tidies the code up somewhat.
I have a sample program here - https://github.com/csainty/pubsub-fun/blob/master/src/main/java/App.java - that, assuming your core count is similar to mine, will receive up to 40% duplicate messages due to ack rejection.