Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support passing struct data to the DB API #718

Merged
merged 26 commits into from
Jul 1, 2021

Conversation

jimfulton
Copy link
Contributor

Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • [ x] Appropriate docs were updated (if necessary)

Fixes #717 🦕

@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label Jun 23, 2021
@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label Jun 23, 2021
@jimfulton jimfulton marked this pull request as ready for review June 23, 2021 21:19
@jimfulton jimfulton requested a review from a team June 23, 2021 21:19
@jimfulton jimfulton requested a review from a team as a code owner June 23, 2021 21:19
@jimfulton jimfulton requested review from shollyman and removed request for a team June 23, 2021 21:19
@jimfulton
Copy link
Contributor Author

This also, as a side effect makes the pandas system tests run faster by changing the scope of the dataset_id fixture to session.

@jimfulton
Copy link
Contributor Author

A non-trivial part of this is allowing type parameters, as in string(10), which it then strips off.

This allows users to specify type parameters matching the table definitions. But we have to strip them off because they aren't allowed past table creation. I'm not sure the benefit of this is worth the extra complexity.

@plamut
Copy link
Contributor

plamut commented Jun 24, 2021

Quick comment - it would be helpful to reason about the code if helpers had an example or two in their docstrings, explaining the format of a typical (sub)string they process. You know, just to give readers some context. :)

Copy link
Contributor

@plamut plamut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two quick comments for a start, will have another look next time with a fresh head. :)

(some examples in helpers' docstrings would also be appreciated)

google/cloud/bigquery/dbapi/_helpers.py Outdated Show resolved Hide resolved
google/cloud/bigquery/dbapi/_helpers.py Outdated Show resolved Hide resolved
@jimfulton
Copy link
Contributor Author

Quick comment - it would be helpful to reason about the code if helpers had an example or two in their docstrings, explaining the format of a typical (sub)string they process. You know, just to give readers some context. :)

Good idea! Done.

Copy link
Contributor

@plamut plamut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spotted a few minor things, but otherwise this looks good!

See if we could reduce code duplication with a reasonable effort, but we can do it separately if it proves to be too complex for now.

google/cloud/bigquery/dbapi/_helpers.py Outdated Show resolved Hide resolved
google/cloud/bigquery/dbapi/_helpers.py Show resolved Hide resolved
google/cloud/bigquery/dbapi/_helpers.py Outdated Show resolved Hide resolved
docs/dbapi.rst Show resolved Hide resolved
google/cloud/bigquery/dbapi/cursor.py Show resolved Hide resolved
Comment on lines 309 to 314
raise NotImplementedError(
f"STRUCT-like parameter values are not supported"
f"{' (parameter ' + name + ')' if name else ''},"
f" unless an explicit type is give in the parameter placeholder"
f" (e.g. '%({name if name else ''}:struct<...>)s')."
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much more informative, great!

google/cloud/bigquery/dbapi/_helpers.py Outdated Show resolved Hide resolved
Copy link
Contributor

@plamut plamut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the refactoring.

I'll wait with merging a bit, just in case @shollyman has something to add.

@tswast tswast self-requested a review June 29, 2021 16:33
google/cloud/bigquery/dbapi/_helpers.py Outdated Show resolved Hide resolved
raise exceptions.ProgrammingError(f"Invalid parameter type, {type_}")
tname, sub = m.group(1, 2)
if tname.upper() == "ARRAY":
sub_type = complex_query_parameter_type(None, sub, base)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows for arrays of scalars, right? Looks like it from looking at the code, but then I'm a bit confused why we need the scalar logic in lines 183-197.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the code on line 183-197 because we support arrays and structs of scalars. Basically, we allow arrays of anything (other than arrays). To handle that, we recurse to handle whatever's inside the <>s, which might be a simple declaration like INT64.

@jimfulton jimfulton added kokoro:force-run Add this label to force Kokoro to re-run the tests. kokoro:run Add this label to force Kokoro to re-run the tests. labels Jun 29, 2021
@yoshi-kokoro yoshi-kokoro removed kokoro:run Add this label to force Kokoro to re-run the tests. kokoro:force-run Add this label to force Kokoro to re-run the tests. labels Jun 29, 2021
@plamut
Copy link
Contributor

plamut commented Jul 1, 2021

Last call - anybody else wants to add anything here? If not, I'll merge it soon. 🙂

@jimfulton jimfulton merged commit 38b3ef9 into master Jul 1, 2021
@jimfulton jimfulton deleted the riversnake-dbi-struct-types branch July 1, 2021 14:49
gcf-merge-on-green bot pushed a commit that referenced this pull request Jul 13, 2021
Supersedes #711.


## [2.21.0](https://www.github.com/googleapis/python-bigquery/compare/v2.20.0...v2.21.0) (2021-07-13)


### Features

* Add max_results parameter to some of the `QueryJob` methods. ([#698](https://www.github.com/googleapis/python-bigquery/issues/698)) ([2a9618f](https://www.github.com/googleapis/python-bigquery/commit/2a9618f4daaa4a014161e1a2f7376844eec9e8da))
* Add support for decimal target types. ([#735](https://www.github.com/googleapis/python-bigquery/issues/735)) ([7d2d3e9](https://www.github.com/googleapis/python-bigquery/commit/7d2d3e906a9eb161911a198fb925ad79de5df934))
* Add support for table snapshots. ([#740](https://www.github.com/googleapis/python-bigquery/issues/740)) ([ba86b2a](https://www.github.com/googleapis/python-bigquery/commit/ba86b2a6300ae5a9f3c803beeb42bda4c522e34c))
* Enable unsetting policy tags on schema fields. ([#703](https://www.github.com/googleapis/python-bigquery/issues/703)) ([18bb443](https://www.github.com/googleapis/python-bigquery/commit/18bb443c7acd0a75dcb57d9aebe38b2d734ff8c7))
* Make it easier to disable best-effort deduplication with streaming inserts. ([#734](https://www.github.com/googleapis/python-bigquery/issues/734)) ([1246da8](https://www.github.com/googleapis/python-bigquery/commit/1246da86b78b03ca1aa2c45ec71649e294cfb2f1))
* Support passing struct data to the DB API. ([#718](https://www.github.com/googleapis/python-bigquery/issues/718)) ([38b3ef9](https://www.github.com/googleapis/python-bigquery/commit/38b3ef96c3dedc139b84f0ff06885141ae7ce78c))


### Bug Fixes

* Inserting non-finite floats with `insert_rows()`. ([#728](https://www.github.com/googleapis/python-bigquery/issues/728)) ([d047419](https://www.github.com/googleapis/python-bigquery/commit/d047419879e807e123296da2eee89a5253050166))
* Use `pandas` function to check for `NaN`. ([#750](https://www.github.com/googleapis/python-bigquery/issues/750)) ([67bc5fb](https://www.github.com/googleapis/python-bigquery/commit/67bc5fbd306be7cdffd216f3791d4024acfa95b3))


### Documentation

* Add docs for all enums in module. ([#745](https://www.github.com/googleapis/python-bigquery/issues/745)) ([145944f](https://www.github.com/googleapis/python-bigquery/commit/145944f24fedc4d739687399a8309f9d51d43dfd))
* Omit mention of Python 2.7 in `CONTRIBUTING.rst`. ([#706](https://www.github.com/googleapis/python-bigquery/issues/706)) ([27d6839](https://www.github.com/googleapis/python-bigquery/commit/27d6839ee8a40909e4199cfa0da8b6b64705b2e9))
gcf-merge-on-green bot pushed a commit that referenced this pull request Jul 14, 2021
🤖 I have created a release \*beep\* \*boop\*
---
## [2.21.0](https://www.github.com/googleapis/python-bigquery/compare/v2.20.0...v2.21.0) (2021-07-14)


### Features

* add always_use_jwt_access ([#714](https://www.github.com/googleapis/python-bigquery/issues/714)) ([92fbd4a](https://www.github.com/googleapis/python-bigquery/commit/92fbd4ade37e0be49dc278080ef73c83eafeea18))
* add max_results parameter to some of the QueryJob methods ([#698](https://www.github.com/googleapis/python-bigquery/issues/698)) ([2a9618f](https://www.github.com/googleapis/python-bigquery/commit/2a9618f4daaa4a014161e1a2f7376844eec9e8da))
* add support for decimal target types ([#735](https://www.github.com/googleapis/python-bigquery/issues/735)) ([7d2d3e9](https://www.github.com/googleapis/python-bigquery/commit/7d2d3e906a9eb161911a198fb925ad79de5df934))
* add support for table snapshots ([#740](https://www.github.com/googleapis/python-bigquery/issues/740)) ([ba86b2a](https://www.github.com/googleapis/python-bigquery/commit/ba86b2a6300ae5a9f3c803beeb42bda4c522e34c))
* enable unsetting policy tags on schema fields ([#703](https://www.github.com/googleapis/python-bigquery/issues/703)) ([18bb443](https://www.github.com/googleapis/python-bigquery/commit/18bb443c7acd0a75dcb57d9aebe38b2d734ff8c7))
* make it easier to disable best-effort deduplication with streaming inserts ([#734](https://www.github.com/googleapis/python-bigquery/issues/734)) ([1246da8](https://www.github.com/googleapis/python-bigquery/commit/1246da86b78b03ca1aa2c45ec71649e294cfb2f1))
* Support passing struct data to the DB API ([#718](https://www.github.com/googleapis/python-bigquery/issues/718)) ([38b3ef9](https://www.github.com/googleapis/python-bigquery/commit/38b3ef96c3dedc139b84f0ff06885141ae7ce78c))


### Bug Fixes

* inserting non-finite floats with insert_rows() ([#728](https://www.github.com/googleapis/python-bigquery/issues/728)) ([d047419](https://www.github.com/googleapis/python-bigquery/commit/d047419879e807e123296da2eee89a5253050166))
* use pandas function to check for NaN ([#750](https://www.github.com/googleapis/python-bigquery/issues/750)) ([67bc5fb](https://www.github.com/googleapis/python-bigquery/commit/67bc5fbd306be7cdffd216f3791d4024acfa95b3))


### Documentation

* add docs for all enums in module ([#745](https://www.github.com/googleapis/python-bigquery/issues/745)) ([145944f](https://www.github.com/googleapis/python-bigquery/commit/145944f24fedc4d739687399a8309f9d51d43dfd))
* omit mention of Python 2.7 in `CONTRIBUTING.rst` ([#706](https://www.github.com/googleapis/python-bigquery/issues/706)) ([27d6839](https://www.github.com/googleapis/python-bigquery/commit/27d6839ee8a40909e4199cfa0da8b6b64705b2e9))
---


This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support bound struct parameters with explicit types in db api interface
4 participants