bigquery job.promise() is not documented? and always polling at a fixed interval of 500ms? #2710
@stephenplusplus are there any reasons why we hardcode the polling interval and/or don't use a backoff in operations?
I think that makes sense, although I can't find any official recommendations from the Google API docs. Do you know of any, @c0b, or best practices in general for what should be our default? Related: this was brought up in #1886 and discarded because GAPIC APIs already do backoff and allow configuration of retry parameters. Maybe we should look at what they do and copy it here.
I did a search for "best practices backing off strategy site:cloud.google.com" and got this doc on Cloud Storage, and indeed saw more places (like the error troubleshooting guides) mention the exponential backoff strategy: https://cloud.google.com/storage/docs/exponential-backoff
Another reason we need this: the BigQuery API request limit of 100/s isn't that big. My client-side ETL script is constantly running some hundreds of BATCH-mode queries, which can fill up the 100/s very quickly and get a rateLimitError. See https://cloud.google.com/bigquery/quotas
The only thing I'm stuck on is that we are talking about polling as opposed to error handling. We aren't retrying in the traditional sense, where the first request failed, so we must try again. We are only making a follow-up request because the job didn't complete yet. That's why I'm drawn to the constant interval, although I'm completely fine going with another value than 500ms, and/or allowing it to be configurable. I'm also fine with introducing the backoff strategy that GAPIC uses, since it's probably safe to assume that if they're doing it, it's the right thing for us to do. These are the default settings for long-running operations (source):

```js
{
  initialRetryDelayMillis: 100,
  retryDelayMultiplier: 1.3,
  maxRetryDelayMillis: 60000,
  initialRpcTimeoutMillis: null,
  rpcTimeoutMultiplier: null,
  maxRpcTimeoutMillis: null,
  totalTimeoutMillis: null
}
```

@callmehiphop WDYT?
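To make these defaults concrete, here is a small sketch of the polling-delay schedule they would produce (the helper `backoffDelays` is hypothetical, not part of any library): each delay is the previous one times `retryDelayMultiplier`, capped at `maxRetryDelayMillis`.

```javascript
// Sketch of how GAPIC-style retry settings turn into a delay schedule.
// `backoffDelays` is a hypothetical helper, not an actual library function.
function backoffDelays(settings, attempts) {
  const delays = [];
  let delay = settings.initialRetryDelayMillis;
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.round(delay));
    delay = Math.min(delay * settings.retryDelayMultiplier,
                     settings.maxRetryDelayMillis);
  }
  return delays;
}

// The defaults quoted above (timeout fields omitted; they don't affect delays).
const defaults = {
  initialRetryDelayMillis: 100,
  retryDelayMultiplier: 1.3,
  maxRetryDelayMillis: 60000
};

console.log(backoffDelays(defaults, 5)); // → [ 100, 130, 169, 220, 286 ]
```

With these settings, polls start at 100ms and back off gently toward the 60s cap, instead of firing at a fixed 500ms forever.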
I think that's fine. Should we make a setter of some kind to allow users to configure these options?

```js
operation
  .setPollingOptions({ retryDelayMultiplier: 2 })
  .on('error', console.error)
  .on('complete', () => console.log('yay'));
```

?
That sounds nice. How would we handle the methods that handle job polling internally, e.g. https://github.com/GoogleCloudPlatform/google-cloud-node/blob/7160d17a3b3c0f3af4bff900cc9ac64831564498/packages/bigquery/src/table.js#L377? Accept a new option, like
This issue was moved to googleapis/nodejs-bigquery#13
(Original issue description:)

From the BigQuery job API I was only aware of the `complete` event for listening to a job with a callback when the job completes. Only recently, from some shared gist code, did I learn that `job.promise()` is available. Since our application uses Node v6 and recently upgraded to v8, the promise API fits the code better and works with the async/await model; shouldn't you at least document it? https://googlecloudplatform.github.io/google-cloud-node/#/docs/bigquery/0.9.6/bigquery/job

On the other hand, I spent some time figuring out how this default `job.promise()` works. I traced the call down to the `Operation`'s `setTimeout self.startPolling` every 500ms, so it's polling at a hard-coded interval of 500ms, while in many gcloud products' best practices an exponential backoff retry strategy is preferred: https://github.com/GoogleCloudPlatform/google-cloud-node/blob/master/packages/common/src/operation.js#L184

This behavior of polling every 500ms may be acceptable (or even wanted) in some cases, but for our ETL scripts, which run hundreds of query jobs concurrently in BATCH mode, it's just not efficient. For this ETL purpose I have a piece of code that has been in production use for a long while and implements the backoff strategy; it supports an optional options object with `waitbegin` (default 500ms) and `waitmax` (default 10s) parameters.

With this API, similar to `job.promise()`, we can write code like this (also with async/await), but internally it retries the metadata retrieval with a backoff strategy. The console.log lines give us transparency into how healthily each job runs, with state transitions from 'PENDING' to 'RUNNING' to 'DONE'.

I'm not sure this strategy can go into the `Operation` for all the `@google-cloud/...` packages, but it at least works for BigQuery jobs; let me know if you like the code.