Add page iterator to ReadRowsStream #7680

Merged
merged 3 commits into from
Apr 16, 2019

@tswast tswast commented Apr 8, 2019

This allows readers to read blocks (called pages for compatibility with
the BigQuery client library) one at a time from a stream. This enables
use cases such as progress bar support and streaming workers that expect
pandas DataFrames.

Towards #7654
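
A minimal sketch of the idea, to make the progress-bar use case concrete. It uses no BigQuery code: `Page`, `iter_pages`, and `read_with_progress` are hypothetical names for illustration, not the API added by this pull request, and blocks are assumed to be plain lists of row dictionaries.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


class Page:
    """Hypothetical stand-in for one block of rows read from a stream."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = rows

    @property
    def num_items(self) -> int:
        """Number of rows held by this page."""
        return len(self._rows)

    def __iter__(self) -> Iterator[Row]:
        return iter(self._rows)


def iter_pages(blocks: Iterable[List[Row]]) -> Iterator[Page]:
    """Lazily yield one Page per block so callers can act between blocks."""
    for block in blocks:
        yield Page(block)


def read_with_progress(blocks: Iterable[List[Row]], total_rows: int) -> List[float]:
    """Consume pages one at a time, recording the fraction read after each page."""
    rows_read = 0
    progress = []
    for page in iter_pages(blocks):
        rows_read += page.num_items
        progress.append(rows_read / total_rows)
    return progress
```

Because pages are yielded lazily, a caller could update a progress bar or hand each page to a worker before the next block is fetched, rather than waiting for the whole stream.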

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Apr 8, 2019
@tswast tswast force-pushed the bqstorage-rowiterator branch 2 times, most recently from f7bce5b to df5500e Compare April 9, 2019 18:18
@tswast tswast changed the title WIP: allow parsing a bqstorage stream to dataframes, one page at a time Add page iterator to ReadRowsStream Apr 9, 2019
@tswast tswast marked this pull request as ready for review April 9, 2019 18:19
@tswast tswast requested a review from crwilcox as a code owner April 9, 2019 18:19
@tswast tswast added the api: bigquerystorage Issues related to the BigQuery Storage API. label Apr 9, 2019
@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Apr 15, 2019
]
avro_schema = _bq_to_avro_schema(bq_columns)
read_session = _generate_read_session(avro_schema)
bq_blocks_1 = [
Contributor

Minor nit: the naming of the "blocks" variables reads oddly to me. You're testing a single block with multiple rows, not multiple blocks. It's consistent with existing tests; it just seems odd.

Contributor Author

Actually, this does contain 2 blocks. black just formatted it in such a way that it's not clear.

I've separated each block into its own variable and added a comment for why there are two groups of blocks. Hopefully this makes it clearer.
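
As a rough illustration of the layout described in this reply, assuming each block is a list of row dictionaries: only `bq_blocks_1` appears in the reviewed snippet, so `block_1` through `block_3`, `bq_blocks_2`, the column name, and the values are hypothetical, as is the `count_rows` helper.

```python
# Hypothetical rows; the column name and values are illustrative only.
block_1 = [{"int_col": 123}, {"int_col": 234}]
block_2 = [{"int_col": 345}, {"int_col": 456}]
block_3 = [{"int_col": 567}]

# Each group is a list of blocks, and each block is a list of rows.
# Naming the blocks keeps the block count visible however a formatter
# wraps the literals.
bq_blocks_1 = [block_1, block_2]
bq_blocks_2 = [block_3]


def count_rows(blocks):
    """Return the total number of rows across a group of blocks."""
    return sum(len(block) for block in blocks)
```

With the blocks named, `len(bq_blocks_1)` gives the number of blocks in a group and `count_rows(bq_blocks_1)` gives the number of rows, which is the distinction the review comment was asking about.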

@tswast tswast merged commit d3bcf77 into googleapis:master Apr 16, 2019
@tswast tswast deleted the bqstorage-rowiterator branch April 16, 2019 20:35