Duplicate messages exactly every 15 minutes #68
Comments
I believe the lifespan of each stream created via @kir-titievsky does that sound like the expected behavior? Is there something on the client we should be doing differently to prevent these redeliveries? |
From my reading of the code, the Node subscription client assumes that it will not receive messages with ack ids that it is already leasing. This is evident based on how it tracks and clears ack ids in the internal inventory array, with no checking of an existing duplicate ack id. IMHO the current client is broken in this regard, as after receiving a duplicate message / ack id, calling I've monkey patched my own subscriptions

    const openConnectionOld = sub.openConnection_;
    sub.openConnection_ = function () {
      openConnectionOld.call(this, ...arguments);
      const pool = this.connectionPool;
      pool.removeAllListeners('message');
      pool.on('message', (message) => {
        // If we already have the ackId leased, ignore
        if (this.inventory_.lease.indexOf(message.ackId) >= 0) {
          console.log(`Already leasing ack ID ${message.ackId}. Ignoring...`);
          return;
        }
        if (!this.hasMaxMessages_()) {
          this.emit('message', this.leaseMessage_(message));
          return;
        }
        if (!pool.isPaused) {
          pool.pause();
        }
        message.nack();
      });
    };

I could submit a PR, but I'm not sure if the duplicate messages are indicative of some other, upstream problem. Perhaps the ack id inventory would be better implemented using a |
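As a rough illustration of the duplicate check discussed above, here is a minimal sketch of an ack id inventory that rejects ids it is already leasing. It assumes a `Set` as the backing structure, which is only one possible choice and not necessarily what the commenter had in mind. `LeaseInventory` and its methods are hypothetical names and are not part of the Pub/Sub client.

```javascript
// Hypothetical helper, NOT part of the Pub/Sub client: tracks leased ack ids
// and reports duplicates. The Set-based storage is an assumption.
class LeaseInventory {
  constructor() {
    this.leased = new Set();
  }

  // Returns true if the ack id was newly leased, false if it was a duplicate.
  add(ackId) {
    if (this.leased.has(ackId)) {
      return false;
    }
    this.leased.add(ackId);
    return true;
  }

  // Returns true if the ack id was leased and has now been released.
  remove(ackId) {
    return this.leased.delete(ackId);
  }

  // Number of ack ids currently leased.
  get size() {
    return this.leased.size;
  }
}
```

Compared with `indexOf` on an array, membership checks on a `Set` do not scan the whole inventory, which may matter at several hundred messages per second.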
@rossj I agree with you, however the client was built with specifications from the PubSub team and this particular issue has been brought up several times before. I think we want input from the PubSub team on what the correct solution to this issue actually is. |
@kir-titievsky just a friendly ping :) |
@ctavan @rossj This should get better this week. The behavior had to do with how the service treated messages that were sent to a client but not yet acknowledged when a streamingPull connection was broken down and rebuilt. A feature rolling out right now (and expected to be completely out tomorrow) should ensure that nothing special, like duplicate delivery, happens when connections are rebuilt. @ctavan Might you check in on your metrics in two days and tell me if you still see this? |
Thanks @kir-titievsky, we'll check again in a few days! Any chance that the changes might affect #73 as well? |
@ctavan have you been experiencing this issue lately? |
I'm currently out of office. I'll try to reproduce as soon as possible, but it might take me a few more days. |
No worries!
|
@callmehiphop @kir-titievsky I have finally found the time to check this again. First of all I re-ran the test using I have then upgraded to So even though I cannot confirm that there was a server-side fix for the issue (since it still persists with the old version of the node client) I will still go ahead and close this issue since it is gone since the latest refactoring of this client. Thanks for your support! |
@ctavan glad to hear it, thanks for all your help! |
Environment details
Steps to reproduce
This is most likely related to the discussion in #2 (comment) (and following comments in that thread). However since the discussion there somewhat faded out I want to report my findings in a new issue.
My subscriber is consuming messages at a rate of roughly 500/s and it is receiving small batches of duplicate messages exactly every 15 minutes. Those batches typically contain between 100 and 400 duplicate messages. Here's a plot of the number of duplicates over time:
Most of the duplicates are being delivered to my subscriber within less than a second. Here's a histogram of the durations between redeliveries in milliseconds:
As you can see, the batches of duplicates coincide with spikes in the Stackdriver graphs for StreamingPull Operations and StreamingPull Acknowledge Requests (please note that Stackdriver shows Berlin time while the above graph shows UTC, hence the 1h time difference):
From the comments in the other thread I did not really understand whether the behavior we see is actually expected. What's the reason for this to happen precisely every 15 minutes?
Even though the absolute number of duplicates is well below 1%, this still looks pretty odd, unexpected and unnecessary. I'd love to understand better what's causing this issue and how it could potentially be fixed.
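For anyone who wants to reproduce the measurements described above, here is a minimal sketch of how duplicates and redelivery intervals could be recorded on the subscriber side. `DuplicateTracker` is a hypothetical helper written for illustration and is not the code used to produce the plots. It keeps every message id in memory, so it is only suitable for short test runs.

```javascript
// Hypothetical helper for measuring redeliveries; all names are illustrative.
class DuplicateTracker {
  constructor() {
    this.firstSeen = new Map(); // message id -> timestamp (ms) of first delivery
    this.intervals = [];        // ms between first delivery and each redelivery
  }

  // Records one delivery. Returns true if the id had been delivered before.
  record(messageId, now = Date.now()) {
    const first = this.firstSeen.get(messageId);
    if (first === undefined) {
      this.firstSeen.set(messageId, now);
      return false;
    }
    this.intervals.push(now - first);
    return true;
  }

  // Groups the recorded intervals into buckets of `bucketMs` milliseconds.
  // Returns a Map from bucket start (ms) to the number of redeliveries in it.
  histogram(bucketMs) {
    const counts = new Map();
    for (const ms of this.intervals) {
      const bucket = Math.floor(ms / bucketMs) * bucketMs;
      counts.set(bucket, (counts.get(bucket) || 0) + 1);
    }
    return counts;
  }
}

// Possible usage inside a subscriber (assumes an existing `subscription`):
//   const tracker = new DuplicateTracker();
//   subscription.on('message', (message) => {
//     tracker.record(message.id);
//     message.ack();
//   });
```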
/cc @kir-titievsky @callmehiphop @rossj