Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support legacy compressed properties back and forth #183

Merged
merged 4 commits into from
Sep 6, 2019
Merged

support legacy compressed properties back and forth #183

merged 4 commits into from
Sep 6, 2019

Conversation

cguardia
Copy link
Contributor

@cguardia cguardia commented Sep 3, 2019

Though this should work according to what we've been told, I have not tested it yet with true legacy data, so not merging this immediately, but I think it's ready for review.

@cguardia cguardia requested a review from chrisrossi September 3, 2019 10:29
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Sep 3, 2019
@cguardia
Copy link
Contributor Author

cguardia commented Sep 3, 2019

As far as I can see, the backend does in fact honor the meaning field. I believe we can merge this once it's reviewed and wait for more user feedback.

Copy link
Contributor

@chrisrossi chrisrossi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is ready yet. See comments in code.

@@ -324,7 +324,7 @@ class Person(Model):


_MEANING_PREDEFINED_ENTITY_USER = 20
_MEANING_URI_COMPRESSED = "ZLIB"
_MEANING_URI_COMPRESSED = 22
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can just be _MEANING_COMPRESSED. The URI was a string, which is no longer supported by Datastore.

@@ -604,6 +604,12 @@ def new_entity(key):

continue

if value is not None and isinstance(prop, BlobProperty):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if instead of special casing this here in _entity_from_ds_entity, there should be some generalized way for properties to take into account (or not) the meaning field when deserializing themselves. I just added a _to_datastore method to Property in order to deal with #184, and I wonder if an overridable _from_datastore might be helpful here.

if name in ds_entity._meanings:
meaning, value = ds_entity._meanings[name]
if meaning == _MEANING_URI_COMPRESSED:
prop._compressed = True
Copy link
Contributor

@chrisrossi chrisrossi Sep 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks suspicious to me. The problem we're trying to solve here is that the software has defined _compressed for this Property, but it might be stored differently in Datastore. In fact, some entities might have stored this property compressed and some might have stored it uncompressed. We want to read it back out correctly, regardless. But changing the _compressed flag changes is for all entities, not just this entity. It also impacts how the property gets written back out, which should probably match what's defined in Python, regardless of how it's been written in the past.

I think the move here is to ignore prop._compressed and only use meaning when reading the property in. Likewise, when writing data back out, we should make the meaning (and the data) match the user's definition for _compressed. At no point should we modify _compressed from what the user has provided.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right. Will take a look.

_MEANING_URI_COMPRESSED,
self._to_base_type(self._get_user_value(entity)),
)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else:
entity._meanings[self._name] = (???, value)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not so sure about this, since we'd have to know the meaning values for everything and I'm not sure _meanings are set up for other types in the backend.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My point here is just that we need to tell the database when it's not compressed, as well. If _compressed is False, but it was previously written compressed, then we don't want to pass a compressed meaning back to the Datastore. Maybe the move is actually:

entity._meanings.pop(self._name, None)

Or, maybe the move is we don't have an entity._meanings attribute, and generate meanings from scratch when serializing, maybe by overriding _to_datastore.

I'm leaning towards that last option, myself, since we don't really need the meanings in NDB except when serializing and deserializing and we don't want to use the meanings from deserializing when reserializing.

@cguardia
Copy link
Contributor Author

cguardia commented Sep 6, 2019

@chrisrossi it's taken me a bit to wrap this stuff around my head, but I think it's going in the right direction. Could you please take a look when possible?

@cguardia cguardia added the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Sep 6, 2019
@yoshi-kokoro yoshi-kokoro removed the kokoro:force-run Add this label to force Kokoro to re-run the tests. label Sep 6, 2019
Copy link
Contributor

@chrisrossi chrisrossi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks excellent. Thank you!

@@ -2414,11 +2442,48 @@ def _from_base_type(self, value):
decompressed.
"""
if self._compressed and not isinstance(value, _CompressedValue):
if not value.startswith(b"x\x9c"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment with a reference URL to document for this file signature would be good. Maybe declare it as a constant and have the documentation at the constant declaration, so we only need to document it once.

if isinstance(value, _CompressedValue):
value = value.z_val
data[self._name] = value
if not value.startswith(b"x\x9c"):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use documented constant.

indicate we are getting a compressed value.
"""
if self._name in ds_entity._meanings:
meaning, dummy = ds_entity._meanings[self._name]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better?

meaning = ds_entity._meanings[self._name][0]

@cguardia cguardia merged commit 7bdd01f into googleapis:master Sep 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants