Preventing PubSub.Receive from preloading too many messages #1097

Closed
cmoad opened this issue Aug 8, 2018 · 5 comments
Assignees
jba

Labels
api: pubsub: Issues related to the Pub/Sub API.
priority: p1: Important issue which blocks shipping the next release. Will be fixed prior to next release.
🚨 This issue needs some love.
type: feature request: ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

cmoad commented Aug 8, 2018

Client

PubSub (cloud.google.com/go, v0.25.0)

Describe Your Environment

GKE. Services are Docker images built from "scratch" with a Go binary.

Expected Behavior

The Pub/Sub streaming client should have a setting that limits the number of messages pulled in for processing. We set MaxOutstandingMessages to control how many messages are processed concurrently, but the client still floods the process with far more messages than can be completed within 30 minutes. Individual messages are typically consumed within 0.5-5 seconds.

I'm hoping there is an easy answer to this that we are missing. To be clear, this is only an issue when processing a backlog of items. We're not perfect, and sometimes new data introduces an error that we have to fix in order to continue processing.
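
For reference, here's a minimal sketch of roughly how we wire up the subscriber (project and subscription IDs are placeholders; exact values illustrative):

```go
package main

import (
	"context"
	"log"
	"time"

	"cloud.google.com/go/pubsub"
)

func main() {
	ctx := context.Background()

	client, err := pubsub.NewClient(ctx, "our-project") // placeholder project ID
	if err != nil {
		log.Fatal(err)
	}

	sub := client.Subscription("our-subscription")       // placeholder subscription ID
	sub.ReceiveSettings.MaxOutstandingMessages = 3       // cap on concurrently processed messages
	sub.ReceiveSettings.MaxExtension = 30 * time.Minute  // ack deadline extension window

	// Each handler normally finishes in 0.5-5 s (index into Elasticsearch, then ack).
	err = sub.Receive(ctx, func(ctx context.Context, m *pubsub.Message) {
		// ... index the message ...
		m.Ack()
	})
	if err != nil {
		log.Fatal(err)
	}
}
```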

Actual Behavior

Upon starting our service, the process is flooded with StreamingPull operations, so many in fact that we cannot process them within the 30-minute window of MaxExtension. Our processes then begin working on and acking messages that have already expired, causing a lot of duplicate work.

Below are screenshots of our Stackdriver metrics demonstrating the problem. I was told by GCP support that on Acknowledge Requests the orange line means messages acked within the deadline and the green line means messages acked outside of the deadline. Orange == good. Green == bad. (no comment)

At this time we were running 12 instances of our service, each with MaxOutstandingMessages set to 3. In particular, we are indexing into an Elasticsearch cluster, so we can't simply scale up beyond its capacity.
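
Back-of-the-envelope: 12 instances × 3 MaxOutstandingMessages means at most 36 messages being worked on across the fleet. At 0.5-5 seconds per message that's roughly 430-4,300 acks per minute, or on the order of 13k-130k messages in 30 minutes, so anything the client buffers beyond that simply cannot be acked before MaxExtension expires.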

At the beginning of the period we are only acking messages outside of the extension window. Then we restart the services and start acking messages within the window. After 30 minutes we are back to only acking messages outside the window. You can see the effect in our Undelivered Messages (queue size): it flatlines or climbs slowly during the periods when we are processing messages beyond the max extension window. This extra work adds a lot of load to our Elasticsearch cluster and makes it really hard to catch up from a backlog.

[Screenshots: Stackdriver charts for Acknowledge Requests and Undelivered Messages, 2018-08-08 around 4:05 PM]

JustinBeckwith added the triage me (I really want to be triaged.) label Aug 9, 2018
jba (Contributor) commented Aug 10, 2018

This is WIP. See #1088.

jba self-assigned this Aug 10, 2018
jba added the type: feature request, api: pubsub, and priority: p1 labels Aug 10, 2018
JustinBeckwith removed the triage me label Aug 10, 2018
cmoad (Author) commented Aug 10, 2018

Thank you. #1088 would address our issue perfectly.

JustinBeckwith added and removed the 🚨 This issue needs some love. label repeatedly between Aug 15 and Sep 19, 2018
jba (Contributor) commented Sep 28, 2018

#1088 is done. Please give it a try and report back here.
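
In case it helps, here's a minimal sketch of trying it out. This assumes the change landed as a ReceiveSettings.Synchronous flag that turns MaxOutstandingMessages into a hard bound on buffered messages; see #1088 for the actual API.

```go
package main

import (
	"context"
	"log"

	"cloud.google.com/go/pubsub"
)

func main() {
	ctx := context.Background()

	client, err := pubsub.NewClient(ctx, "our-project") // placeholder project ID
	if err != nil {
		log.Fatal(err)
	}

	sub := client.Subscription("our-subscription") // placeholder subscription ID
	sub.ReceiveSettings.Synchronous = true         // assumed flag from #1088: use synchronous pull
	sub.ReceiveSettings.MaxOutstandingMessages = 3 // hard cap on messages held by the client

	if err := sub.Receive(ctx, func(ctx context.Context, m *pubsub.Message) {
		m.Ack() // process, then ack
	}); err != nil {
		log.Fatal(err)
	}
}
```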

jeanbza (Contributor) commented Oct 12, 2018

Friendly ping - does anyone mind if I close this issue now?

jeanbza (Contributor) commented Oct 12, 2018

Closing this due to two weeks of inactivity. Give a shout if there's still something to discuss; happy to re-open.

jeanbza closed this as completed Oct 12, 2018