BigQuery: Add QueryResults.reload() function which calls getQueryResults #3506

Closed
tswast opened this issue Jun 15, 2017 · 6 comments
Labels
api: bigquery Issues related to the BigQuery API. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@tswast
Contributor

tswast commented Jun 15, 2017

https://cloud.google.com/bigquery/docs/reference/rest/v2/jobs/getQueryResults

One of the things I learned when talking to the BigQuery API folks is that the recommended way to run a query is to create a query job and then call getQueryResults() until the jobComplete property is true.

The reason is that a call to get the job resource returns immediately, whereas getQueryResults returns once the query completes or a timeout elapses, whichever comes first. With getQueryResults, users get their data faster.

This means the code for waiting for a query job to complete would change from

import time

while True:
    query_job.reload()  # fetches the job resource; returns immediately
    if query_job.state == 'DONE':
        break
    time.sleep(1)

to something like

query_results = query_job.results()
while not query_results.complete:
    query_results.reload()
    # No need for sleep, since reload should wait for timeout
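
To make the difference between the two loops concrete, here is a minimal, self-contained sketch that simulates both calls locally. `FakeQueryJob`, `get_job_state`, `get_query_results`, `wait_by_polling`, and `wait_by_long_poll` are hypothetical names invented for illustration. Nothing here calls the real BigQuery client or API. It only assumes the behavior described above: the job lookup returns immediately, while the results call blocks until completion or timeout.

```python
import threading
import time


class FakeQueryJob:
    """Hypothetical stand-in for a server-side query job (not the real client)."""

    def __init__(self, run_seconds):
        self._done = threading.Event()
        # Mark the job complete after run_seconds, as a backend would.
        timer = threading.Timer(run_seconds, self._done.set)
        timer.daemon = True
        timer.start()

    def get_job_state(self):
        # Assumed to mimic a job lookup: returns immediately.
        return 'DONE' if self._done.is_set() else 'RUNNING'

    def get_query_results(self, timeout):
        # Assumed to mimic getQueryResults: blocks until the job completes
        # or the timeout elapses, whichever comes first.
        job_complete = self._done.wait(timeout)
        return {'jobComplete': job_complete}


def wait_by_polling(job, interval):
    """Poll the job state, sleeping between calls. Returns the call count."""
    calls = 0
    while True:
        calls += 1
        if job.get_job_state() == 'DONE':
            return calls
        time.sleep(interval)


def wait_by_long_poll(job, timeout):
    """Call the blocking results method until jobComplete. Returns the call count."""
    calls = 0
    while True:
        calls += 1
        if job.get_query_results(timeout)['jobComplete']:
            return calls
```

With a job that finishes well inside the timeout, `wait_by_long_poll` needs a single call and returns as soon as the job completes, whereas `wait_by_polling` makes repeated calls and can notice completion up to one sleep interval late.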
@tswast tswast added the api: bigquery Issues related to the BigQuery API. label Jun 15, 2017
@tswast
Contributor Author

tswast commented Jun 15, 2017

I figured out that I can force a reload with

    query_iterator = query_results.fetch_data()
    try:
        six.next(iter(query_iterator))
    except StopIteration:
        pass

but obviously, that's not ideal for the purpose of waiting for a job to finish.

@tseaver
Contributor

tseaver commented Jun 19, 2017

If you are going to iterate over the rows anyway, then that second example becomes:

for row in query_results.fetch_data():
    do_something_with(row)

That certainly seems like "idiomatic" usage. Does that accomplish what you need?

@tswast
Contributor Author

tswast commented Jun 19, 2017

No. The results might not all be there. That's why I need a reload function to check whether the query is complete. (This is a bit hard to test, since it requires a query that takes longer than the timeout value, which defaults to 10 seconds.)

@tswast tswast changed the title BigQuery: Add QueryResults.reload() function which calls getQueryResults BigQuery: make QueryResults implement the futures interface Aug 1, 2017
@tswast
Contributor Author

tswast commented Aug 1, 2017

A reload function probably isn't the best solution. After thinking about this, I think QueryResults should also implement the futures interface.

@theacodes
Contributor

I don't think QueryResults should implement the futures interface. Rather, I think that QueryJob (which already has the futures interface) should use the getQueryResults API method instead of getJob.
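
As a rough illustration of that idea, the sketch below shows a future-style object whose `result()` waits by repeatedly calling a blocking results function instead of polling job state. `QueryJobFuture` and its `get_query_results` callable are hypothetical and are not the library's `QueryJob`. The callable is assumed to take a timeout in seconds and return a dict with `jobComplete` and, once complete, `rows`.

```python
import time


class QueryJobFuture:
    """Hypothetical future-style wrapper; not the library's QueryJob."""

    def __init__(self, get_query_results, poll_timeout=10.0):
        # get_query_results: assumed callable(timeout_seconds) -> dict with
        # 'jobComplete' and, once complete, 'rows'.
        self._get_query_results = get_query_results
        self._poll_timeout = poll_timeout
        self._response = None

    def done(self):
        # True once a completed response has been observed.
        return self._response is not None

    def result(self, timeout=None):
        """Block until the query completes, then return its rows."""
        deadline = None if timeout is None else time.monotonic() + timeout
        while self._response is None:
            wait = self._poll_timeout
            if deadline is not None:
                remaining = deadline - time.monotonic()
                if remaining <= 0:
                    raise TimeoutError('query did not complete in time')
                # Never block past the caller's overall deadline.
                wait = min(wait, remaining)
            response = self._get_query_results(wait)
            if response['jobComplete']:
                self._response = response
        return self._response['rows']
```

Because each call blocks on the server side (under the assumption above), the loop needs no explicit sleep, and the caller sees a single `result()` call regardless of how many round trips happen underneath.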

@lukesneeringer lukesneeringer added priority: p2 Moderately-important priority. Fix may not be included in next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. and removed priority: p2 Moderately-important priority. Fix may not be included in next release. labels Aug 7, 2017
@tswast tswast changed the title BigQuery: make QueryResults implement the futures interface BigQuery: Add QueryResults.reload() function which calls getQueryResults Aug 11, 2017
@tswast
Contributor Author

tswast commented Aug 11, 2017

After playing around with some proposed designs for GA, I agree with you, Jon. QueryJob's implementation of the futures interface will be sufficient.

I've changed this request back to asking for a .reload() method, since there does need to be a more straightforward way to actually call getQueryResults.

Closing this issue since I'm tracking this internally in the GA redesign doc.
