PyPi release 1.6.4

@mikedarcy mikedarcy released this 03 Feb 00:32
· 47 commits to master since this release

Release Notes

Added Google Cloud Storage fetch handler for handling gs:// URLs in fetch.txt.

Note that this is a soft dependency: the gcloud CLI must be installed on the system where you will be running
bdbag in order for this handler to function.

Enabling "requester pays":

This handler supports the requester pays usage pattern by allowing the billable project_id to be specified in the auth_params object for
a corresponding keychain.json entry for a matching gs:// URI pattern.

For example, to configure (and allow) requester pays for a GS bucket, you would add a keychain.json entry similar
to the following:

{
    "uri": "gs://gcs-bdbag-integration-testing/",
    "auth_type": "gcs-credentials",
    "auth_params": {
        "project_id": "bdbag-204999",
        "allow_requester_pays": true
    }
}

You can also explicitly disallow requester pays on the client side in any of the following ways:

  • Set allow_requester_pays to false.
  • Omit the allow_requester_pays field.
  • Omit the project_id field.
  • Omit the auth_params object entirely.

Note that if you do any of the above, data retrieval requests to buckets that have requester pays enabled will fail.
This configuration option exists to ensure that you are not billed for requests to buckets that have requester pays
disabled. Per the following GCS documentation:

Important: Buckets that have Requester Pays disabled still accept requests that include a billing project, 
and charges are applied to the billing project supplied in the request. 
Consider any billing implications prior to including a billing project in all of your requests.
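
The client-side rules listed above can be sketched in a few lines of Python. This is only an illustration of the
decision logic as described in these notes, not the handler's actual code; the function name
requester_pays_project is hypothetical.

```python
def requester_pays_project(entry):
    """Return the billable project_id if a keychain entry allows requester pays, else None.

    Hypothetical helper: it mirrors the rules described in these release notes,
    not the bdbag implementation itself.
    """
    # A missing auth_params object disallows requester pays.
    auth_params = entry.get("auth_params") or {}
    project_id = auth_params.get("project_id")
    # A missing allow_requester_pays field is treated the same as false.
    allowed = auth_params.get("allow_requester_pays", False)
    if allowed is True and project_id:
        return project_id
    return None


allowed_entry = {
    "uri": "gs://gcs-bdbag-integration-testing/",
    "auth_type": "gcs-credentials",
    "auth_params": {"project_id": "bdbag-204999", "allow_requester_pays": True},
}
disallowed_entry = {
    "uri": "gs://gcs-bdbag-integration-testing/",
    "auth_type": "gcs-credentials",
    "auth_params": {"project_id": "bdbag-204999"},
}

billing_project = requester_pays_project(allowed_entry)    # "bdbag-204999"
no_billing_project = requester_pays_project(disallowed_entry)  # None
```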

IMPORTANT NOTE:

At the time of this writing, when using the gcloud CLI from Google Cloud SDK 416.0.0 or earlier, it is
still possible to be billed for bucket usage even if you have disallowed requester pays for a given bucket in
keychain.json. This is because the gcloud init process requires that you specify a default project_id, and this
project id is subsequently stored as quota_project_id in the application_default_credentials.json file used by
the GCS APIs (which the bdbag fetch handler uses). If this value is present, it is passed on all GCS API
calls as a fallback, even when no billing project is explicitly passed to the API call.
This can be worked around by removing the quota_project_id from application_default_credentials.json.
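
One way to apply this workaround is a small script along the following lines. It is a minimal sketch: the helper
name strip_quota_project is hypothetical, and the location of application_default_credentials.json varies by
platform (commonly ~/.config/gcloud/ on Linux and macOS), so the path is left as a parameter.

```python
import json
import shutil
from pathlib import Path


def strip_quota_project(adc_path):
    """Remove quota_project_id from an application default credentials file.

    Hypothetical helper. Returns True if the key was removed, False if it
    was not present. A backup copy with a .bak suffix is written first.
    """
    path = Path(adc_path)
    creds = json.loads(path.read_text(encoding="utf-8"))
    if "quota_project_id" not in creds:
        return False
    # Keep a copy of the original file so the change can be undone.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    del creds["quota_project_id"]
    path.write_text(json.dumps(creds, indent=2) + "\n", encoding="utf-8")
    return True
```

For example, strip_quota_project(Path.home() / ".config/gcloud/application_default_credentials.json") would edit
the file at that assumed default location.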

Using service account credentials:

It is also possible to specify a service_account_credentials_file, which is a file path referencing a service account
credentials JSON file provided by Google Cloud Storage. For example:

{
    "uri": "gs://bdbag-dev/",
    "auth_type": "gcs-credentials",
    "auth_params": {
        "project_id": "bdbag-204400",
        "service_account_credentials_file": "/home/bdbag/bdbag-204400-41babdd46e24.json"
    }
}
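
To illustrate how an entry like the ones above might be selected for a given gs:// URL, here is a minimal sketch.
It assumes that keychain.json holds a JSON array of entries and that matching is a case-insensitive prefix
comparison on the uri field, with the longest prefix winning. Those matching rules and the function name
find_keychain_entry are assumptions made for illustration, not a description of the bdbag implementation.

```python
def find_keychain_entry(keychain, url):
    """Return the entry whose uri is the longest prefix of url, or None.

    Hypothetical helper; prefix matching is an assumption of this sketch.
    """
    best = None
    for entry in keychain:
        uri = entry.get("uri", "")
        if uri and url.lower().startswith(uri.lower()):
            if best is None or len(uri) > len(best["uri"]):
                best = entry
    return best


# Entries as they might appear after loading keychain.json with json.load().
keychain = [
    {
        "uri": "gs://gcs-bdbag-integration-testing/",
        "auth_type": "gcs-credentials",
        "auth_params": {"project_id": "bdbag-204999", "allow_requester_pays": True},
    },
    {
        "uri": "gs://bdbag-dev/",
        "auth_type": "gcs-credentials",
        "auth_params": {
            "project_id": "bdbag-204400",
            "service_account_credentials_file": "/home/bdbag/bdbag-204400-41babdd46e24.json",
        },
    },
]

# The object path below is made up for the example.
entry = find_keychain_entry(keychain, "gs://bdbag-dev/data/sample.txt")
```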