-
Notifications
You must be signed in to change notification settings - Fork 67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support legacy compressed properties back and forth #183
support legacy compressed properties back and forth #183
Conversation
As far as I can see, the backend does in fact honor the meaning field. I believe we can merge this once it's reviewed and wait for more user feedback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this is ready yet. See comments in code.
google/cloud/ndb/model.py
Outdated
@@ -324,7 +324,7 @@ class Person(Model): | |||
|
|||
|
|||
_MEANING_PREDEFINED_ENTITY_USER = 20 | |||
_MEANING_URI_COMPRESSED = "ZLIB" | |||
_MEANING_URI_COMPRESSED = 22 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can just be _MEANING_COMPRESSED. The URI was a string, which is no longer supported by Datastore.
google/cloud/ndb/model.py
Outdated
@@ -604,6 +604,12 @@ def new_entity(key): | |||
|
|||
continue | |||
|
|||
if value is not None and isinstance(prop, BlobProperty): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if instead of special casing this here in _entity_from_ds_entity
, there should be some generalized way for properties to take into account (or not) the meaning field when deserializing themselves. I just added a _to_datastore
method to Property
in order to deal with #184, and I wonder if an overridable _from_datastore
might be helpful here.
google/cloud/ndb/model.py
Outdated
if name in ds_entity._meanings: | ||
meaning, value = ds_entity._meanings[name] | ||
if meaning == _MEANING_URI_COMPRESSED: | ||
prop._compressed = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks suspicious to me. The problem we're trying to solve here is that the software has defined _compressed
for this Property, but it might be stored differently in Datastore. In fact, some entities might have stored this property compressed and some might have stored it uncompressed. We want to read it back out correctly, regardless. But changing the _compressed
flag changes is for all entities, not just this entity. It also impacts how the property gets written back out, which should probably match what's defined in Python, regardless of how it's been written in the past.
I think the move here is to ignore prop._compressed
and only use meaning
when reading the property in. Likewise, when writing data back out, we should make the meaning (and the data) match the user's definition for _compressed
. At no point should we modify _compressed
from what the user has provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right. Will take a look.
google/cloud/ndb/model.py
Outdated
_MEANING_URI_COMPRESSED, | ||
self._to_base_type(self._get_user_value(entity)), | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
else:
entity._meanings[self._name] = (???, value)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not so sure about this, since we'd have to know the meaning values for everything and I'm not sure _meanings
are set up for other types in the backend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My point here is just that we need to tell the database when it's not compressed, as well. If _compressed
is False
, but it was previously written compressed, then we don't want to pass a compressed meaning back to the Datastore. Maybe the move is actually:
entity._meanings.pop(self._name, None)
Or, maybe the move is we don't have an entity._meanings
attribute, and generate meanings from scratch when serializing, maybe by overriding _to_datastore
.
I'm leaning towards that last option, myself, since we don't really need the meanings in NDB except when serializing and deserializing and we don't want to use the meanings from deserializing when reserializing.
@chrisrossi it's taken me a bit to wrap this stuff around my head, but I think it's going in the right direction. Could you please take a look when possible? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks excellent. Thank you!
google/cloud/ndb/model.py
Outdated
@@ -2414,11 +2442,48 @@ def _from_base_type(self, value): | |||
decompressed. | |||
""" | |||
if self._compressed and not isinstance(value, _CompressedValue): | |||
if not value.startswith(b"x\x9c"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment with a reference URL to document for this file signature would be good. Maybe declare it as a constant and have the documentation at the constant declaration, so we only need to document it once.
google/cloud/ndb/model.py
Outdated
if isinstance(value, _CompressedValue): | ||
value = value.z_val | ||
data[self._name] = value | ||
if not value.startswith(b"x\x9c"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use documented constant.
google/cloud/ndb/model.py
Outdated
indicate we are getting a compressed value. | ||
""" | ||
if self._name in ds_entity._meanings: | ||
meaning, dummy = ds_entity._meanings[self._name] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better?
meaning = ds_entity._meanings[self._name][0]
Though this should work according to what we've been told, I have not tested it yet with true legacy data, so not merging this immediately, but I think it's ready for review.