Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add missing generation arguments for bucket and blob methods. #7023

Closed
wants to merge 4 commits into from

Conversation

jmcarp
Copy link

@jmcarp jmcarp commented Dec 25, 2018

Allow setting generation for getting and deleting blobs, and respect generation for rewriting.

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Dec 25, 2018
@tseaver tseaver added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. api: storage Issues related to the Cloud Storage API. labels Jan 3, 2019
@tseaver tseaver requested a review from frankyn January 3, 2019 22:49
@tseaver
Copy link
Contributor

tseaver commented Jan 3, 2019

What is implemented here looks OK, but seems like it might not be comprehensive enough (e.g., should Blob.reload / Blob.exists / Blob.delete support generations?).

@frankyn PTAL.

@jmcarp
Copy link
Author

jmcarp commented Jan 4, 2019

Thanks @tseaver. I took a stab at adding generations to mixed-in methods like reload and update. WDYT?

@jmcarp
Copy link
Author

jmcarp commented Jan 12, 2019

Ping @tseaver @frankyn when you have time.

@frankyn
Copy link
Member

frankyn commented Jan 12, 2019

Thanks for the ping. Looking over it now.

Copy link
Member

@frankyn frankyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did a first pass. Let's follow-up next week. Thank you for your patience.

@@ -257,6 +257,16 @@ def user_project(self):
"""
return self.bucket.user_project

@property
def query_params(self):

This comment was marked as spam.

if self.user_project is not None:
params["userProject"] = self.user_project
return params

def blob(self, blob_name, chunk_size=None, encryption_key=None, kms_key_name=None):
"""Factory constructor for blob object.

This comment was marked as spam.

@jmcarp
Copy link
Author

jmcarp commented Jan 14, 2019

Thanks @frankyn. Updated the PR.

Copy link
Member

@frankyn frankyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add system tests for this feature as well in (https://github.com/googleapis/google-cloud-python/blob/master/storage/tests/system.py) with respect to a generation:

  • blob.update()
  • blob.patch()
  • blob.delete()
  • bucket.get_object()

connection = _Connection({"name": BLOB_NAME})
client = _Client(connection)
bucket = self._make_one(name=NAME)
blob = bucket.get_blob(BLOB_NAME, client=client, generation=GENERATION)

This comment was marked as spam.

@jmcarp
Copy link
Author

jmcarp commented Jan 18, 2019

Updated. Looks like we already exercise the Blob constructor with generation in https://github.com/jmcarp/google-cloud-python/blob/blob-generations/storage/tests/unit/test_blob.py#L104-L109.

@jmcarp
Copy link
Author

jmcarp commented Jan 26, 2019

Ping @frankyn when you have time; ready for another look.

Copy link
Member

@frankyn frankyn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, lgtm. Just had two last comments then I'll approve. @tseaver do you see anything else?

blob0.content_language = "en"
blob0.patch()
self.assertEqual(blob0.content_language, "en")
self.assertIsNone(blob1.content_language)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this required? Seems unnecessary based on the previous line.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're patching the language of the 0th generation of the blob, so we expect the metadata of the 1st generation to still be None.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahhh I missed that it's blob1 and not blob0 thank you.

blob0.metadata = metadata
blob0.update()
self.assertEqual(blob0.metadata, metadata)
self.assertIsNone(blob1.metadata)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, doesn't look necessary if metadata for sure is not None.

@JustinBeckwith JustinBeckwith added the 🚨 This issue needs some love. label Feb 7, 2019
@yoshi-automation yoshi-automation added 🚨 This issue needs some love. and removed 🚨 This issue needs some love. labels Feb 7, 2019
@jmcarp
Copy link
Author

jmcarp commented Feb 12, 2019

Anything else I should do on this @frankyn @JustinBeckwith?

@frankyn
Copy link
Member

frankyn commented Feb 12, 2019

@crwilcox I'm good with the merge wdyt?

Copy link
Contributor

@tseaver tseaver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jmcarp Thanks very much for the patch, and apologies for taking so long to review it.

@@ -387,6 +406,9 @@ def exists(self, client=None):
# minimize the returned payload.
query_params = {"fields": "name"}

if self.generation:
query_params["generation"] = self.generation

if self.user_project is not None:
query_params["userProject"] = self.user_project
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this block should use the _query_params property, e.g.:

        # minimize the returned payload.
        query_params = self._query_params
        query_params["fields"] = "name"

@@ -1485,6 +1509,9 @@ def rewrite(self, source, token=None, client=None):
if token:
query_params["rewriteToken"] = token

if source.generation:
query_params["sourceGeneration"] = source.generation

if self.user_project is not None:
query_params["userProject"] = self.user_project

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likewise here:

        query_params = self._query_params

        if token:
            query_params["rewriteToken"] = token

        if source.generation:
            query_params["sourceGeneration"] = source.generation

        if self.kms_key_name is not None:
            query_params["destinationKmsKeyName"] = self.kms_key_name

params = {}
if self.user_project is not None:
params["userProject"] = self.user_project
return params
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is shadowed by both Blob and Bucket: let's make it virtual, e.g.:

    @property
    def _query_params(self):
        """Default query parameters."""
        raise NotImplementedError

That way we don't have to add test coverage for it, either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at how it is used in reload, I've changed my mind: lets just get rid of the duplicate version in Bucket.


if generation is not None:
query_params["generation"] = generation

blob = Blob(
bucket=self, name=blob_name, encryption_key=encryption_key, **kwargs
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather create the blob with the expected generation here, and then use its _query_params property, e.g.:

        client = self._require_client(client)
        blob = Blob(
            bucket=self, name=blob_name, encryption_key=encryption_key, generation=generation, **kwargs
        )
        query_params = blob._query_params

        try:
            headers = _get_encryption_headers(encryption_key)
            response = client._connection.api_request(
                method="GET",
                path=blob.path,
                query_params=query_params,
                headers=headers,
                _target_object=blob,
            )

Now that I look at it more closely, I'm not sure why this method doesn't just call blob.reload().

@@ -825,6 +859,9 @@ def delete_blob(self, blob_name, client=None):
client = self._require_client(client)
query_params = {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use self._query_params here as well, e.g.:

        query_params = self._query_params

        if generation is not None:
            query_params["generation"] = generation

        blob_path = Blob.path_helper(self.path, blob_name)

params = {}
if self.user_project is not None:
params["userProject"] = self.user_project
return params
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add explicit test coverage for the branches in this property.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per above, let's just delete this version of the property, which duplicates the one in the base class.

@@ -12,6 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

import io
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import will be unused if you switch to blob.upload_from_string below.

@@ -576,7 +583,7 @@ def test_delete(self):
bucket._blobs[BLOB_NAME] = 1
blob.delete()
self.assertFalse(blob.exists())
self.assertEqual(bucket._deleted, [(BLOB_NAME, None)])
self.assertEqual(bucket._deleted, [(BLOB_NAME, None, blob.generation)])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please duplicate this test, once with an explicit generation, and make this assertion be clear for both cases (None for the case where the blob is created without a generation).

client = _Client(connection)
source_bucket = _Bucket(client=client)
source_blob = self._make_one(SOURCE_BLOB, bucket=source_bucket)
source_blob._set_properties({"generation": SOURCE_GENERATION})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please pass the generation to _make_one, e.g.:

        source_blob = self._make_one(SOURCE_BLOB, bucket=source_bucket, generation=GENERATION)

NAME = "name"
BLOB_NAME = "blob-name"
GENERATION = 1512565576797178
connection = _Connection({"name": BLOB_NAME})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For clarity, add the generation to the properties here, and then assert that it is read in correctly below:

        connection = _Connection({"name": BLOB_NAME, generation=GENERATION})

@tseaver tseaver added needs work This is a pull request that needs a little love. kokoro:force-run Add this label to force Kokoro to re-run the tests. and removed 🚨 This issue needs some love. labels Feb 20, 2019
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Feb 20, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: storage Issues related to the Cloud Storage API. cla: yes This human has signed the Contributor License Agreement. needs work This is a pull request that needs a little love. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants