We're seeing significant duplicate message delivery for queues with backlogs. When a queue is running in near-realtime, we see 0 duplicates. When there is a backlog, something appears to happen in the queue cache and 20-30% of message deliveries are duplicates. Consumers are processing messages quickly, so we are not exceeding the queue server's ack timeout (`vt_ack_wait=300`).
Here's a chart for the last 24 hours showing duplicate rates for a queue with significant backlog.
Over the same period, a second queue processing more messages had 0 duplicates until we stopped its consumer for 15 minutes to build up a backlog. Once it worked through that backlog, there were no more duplicates.
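For anyone trying to reproduce the numbers, here is a minimal sketch of one way a duplicate rate like the 20-30% figure could be computed. It assumes the consumer logs the id of every message it receives; `duplicate_rate` and the id list are hypothetical names for illustration, not part of the queue server.

```python
from collections import Counter


def duplicate_rate(delivered_ids):
    """Return the fraction of deliveries that repeat an already-delivered id.

    delivered_ids: iterable of message ids, one entry per delivery
    (hypothetical consumer-side log, not a queue server API).
    """
    ids = list(delivered_ids)
    total = len(ids)
    if total == 0:
        return 0.0
    unique = len(Counter(ids))
    # Every delivery beyond the first for a given id counts as a duplicate.
    return (total - unique) / total


# Example: 10 deliveries covering 7 distinct ids means 3 duplicates.
rate = duplicate_rate([1, 2, 2, 3, 3, 3, 4, 5, 6, 7])
print(f"{rate:.0%}")
```

Counting per delivery rather than per message keeps the rate comparable between a backlogged queue and a near-realtime one.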
Possibly related to this, we see weird behavior when new consumers connect to the message manager. For context on the chart:
- Ready to run: `time_next <= NOW()`
- Waiting to run: `time_next > NOW()`
- Failed: `time_next = MaxInt64` (this is our internal usage)
You can see the status of about 1M messages jump at the same moment we increased consumers. We see this every time the number of consumers changes, and these metrics are collected via an out-of-band query, so the table itself must be changing. I can't explain what is happening. We see similar behavior if we run a query to reschedule messages that have failed or were already acked.
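To make the three chart statuses concrete, here is a rough sketch of the bucketing. It assumes `time_next` and the current time are integers in the same unit; `classify` and `status_counts` are hypothetical helpers, not the actual out-of-band query.

```python
from collections import Counter

MAX_INT64 = 2**63 - 1  # sentinel we use internally to mark a message as failed


def classify(time_next, now):
    """Map a message's time_next value to one of the three chart statuses."""
    # Check the sentinel first: MaxInt64 is also greater than `now`,
    # so it would otherwise be counted as "waiting".
    if time_next == MAX_INT64:
        return "failed"
    if time_next <= now:
        return "ready"
    return "waiting"


def status_counts(time_next_values, now):
    """Count messages per status, as the chart does at each sample."""
    return Counter(classify(t, now) for t in time_next_values)


# Example sample taken at now=100.
print(status_counts([1, 50, 200, MAX_INT64], now=100))
```

With this bucketing, a jump between statuses means `time_next` itself was rewritten for those rows, which is why the table must be changing.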
My gut feeling is that something is happening in the message cache, but I don't have any data yet to support that.