Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: detect obsolete BQ Storage extra at runtime #666

Merged
merged 8 commits into from
May 20, 2021

Conversation

plamut
Copy link
Contributor

@plamut plamut commented May 17, 2021

Closes #629.

This PR adds logic to detect obsolete versions of BQ Storage extra and emit a warning or raise an error where applicable.

Traced all BQ Storage-related code and call paths and I think the changes here cover them all. As a bonus, we get a graceful fallback to tabledata.list in most cases (a warning is emitted instead of raising an error).

  • Make sure to open an issue as a bug/issue before writing your code! That way we can discuss the change, evaluate designs, and agree on the general idea
  • Ensure the tests and linter pass
  • Code coverage does not decrease (if any source code was changed)
  • Appropriate docs were updated (if necessary)

@google-cla google-cla bot added the cla: yes This human has signed the Contributor License Agreement. label May 17, 2021
@product-auto-label product-auto-label bot added the api: bigquery Issues related to the googleapis/python-bigquery API. label May 17, 2021
@plamut plamut marked this pull request as ready for review May 18, 2021 11:05
@plamut plamut requested a review from a team May 18, 2021 11:05
@plamut plamut requested a review from a team as a code owner May 18, 2021 11:05
@plamut plamut requested review from stephaniewang526 and tswast and removed request for a team May 18, 2021 11:05
@plamut
Copy link
Contributor Author

plamut commented May 18, 2021

Will fix the coverage error son, one line not hit by the tests.

@jimfulton jimfulton self-requested a review May 18, 2021 14:06
Copy link
Contributor

@jimfulton jimfulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can be a little DRYer if we reuse _create_bqstorage_client more after adding some arguments to it.

google/cloud/bigquery/client.py Show resolved Hide resolved
google/cloud/bigquery/dbapi/cursor.py Outdated Show resolved Hide resolved
google/cloud/bigquery/_pandas_helpers.py Outdated Show resolved Hide resolved
google/cloud/bigquery/magics/magics.py Outdated Show resolved Hide resolved
@plamut
Copy link
Contributor Author

plamut commented May 18, 2021

I think we can be a little DRYer if we reuse _create_bqstorage_client more after adding some arguments to it.

Sounds reasonable, I'll probably give it a shot. 👍

@jimfulton
Copy link
Contributor

I'm proposing that _create_bqstorage_client becomes something like:

    def _create_bqstorage_client(self, bqstorage_client=None, client_info=None, client_options=None):
        """Create a BigQuery Storage API client using this client's credentials.

        If a client cannot be created due to missing dependencies, raise a
        warning and return ``None``.

        Returns:
            Optional[google.cloud.bigquery_storage.BigQueryReadClient]:
                A BigQuery Storage API client.
        """
        try:
            from google.cloud import bigquery_storage
        except ImportError:
            warnings.warn(
                "Cannot create BigQuery Storage client, the dependency "
                "google-cloud-bigquery-storage is not installed."
            )
            return None

        try:
            _verify_bq_storage_version()
        except LegacyBigQueryStorageError as exc:
            warnings.warn(str(exc))
            return None

        if bqstorage_client is None:
            bqstorage_client = bigquery_storage.BigQueryReadClient(
                credentials=self._credentials,
                client_info=client_info,
                client_options=client_options,
                )

        return bqstorage_client

and maybe gets renamed :) and that it's called even when the caller is given a client.

If the library is out of date, the caller will end up with None even if given a value.

The method is renamed to _ensure_bqstorage_client() and now performs
a check if BQ Storage dependency is recent enough.
The check is now performed in dbapi.Connection, which is sufficient.
The methods in higher layers already do the same check before
a BQ Storage client instance is passed to
_pandas_helpers._download_table_bqstorage() helper.
Lean more heavily on client._ensure_bqstorage_client() to de-duplicate
logic.
@plamut
Copy link
Contributor Author

plamut commented May 19, 2021

I checked the call chains and if I didn't miss anything, the only place left where a bad BQ Storage client could leak down is dbapi.Connection, which I sealed. Some of the lower-level checks turned out to be redundant, thus removed them:

  • In dbapi.Cursor._bqstorage_fetch()
  • In _pandas_helpers._download_table_bqstorage()

Refactored Client._make_bqstorage_client as suggested and de-duplicated logic in magics and dbapi.

Apart from the client's factory, the other central place where we check if BQ Storage client is compatible is table._validate_bqstorage() (and issues a warning if it isn't). The top level table methods use this check and set bqstorage_client to None, if necessary, before sending it down the call chain to lower level utility functions.

Edit: Will fix the two coverage misses tomorrow.

Copy link
Contributor

@jimfulton jimfulton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@plamut plamut merged commit bd7dbda into googleapis:master May 20, 2021
@plamut plamut deleted the iss-629 branch May 20, 2021 07:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

support google-cloud-bigquery-storage v1
2 participants