Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix datastore: unprefixed ids in client #1226

Merged
merged 3 commits into from
Nov 18, 2015

Conversation

mpeg
Copy link
Contributor

@mpeg mpeg commented Nov 18, 2015

If datastore id comes back prefixed from the API, and Client.datastore_id is not prefixed, the 'Keys do not match dataset ID' exception is raised.

Fixed to use _dataset_ids_equal for the comparison instead, which compares ignoring valid prefixes and seems to be the preferred comparison method used elsewhere in the code.

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Nov 18, 2015
@dhermes
Copy link
Contributor

dhermes commented Nov 18, 2015

@mpeg Thanks for bringing this up. I'm trying to understand what this fixes.

Can you provide a stacktrace when that ValueError was raised previously?

@mpeg
Copy link
Contributor Author

mpeg commented Nov 18, 2015

@dhermes Sure ! It's my first patch so wasn't sure how much detail to give

For the dataset Client class, I've got the dataset_id set from an env variable to mpeg-personal (my real dataset name)

Here's a REPL and stack trace highlighting the way I encountered this issue:

>>> from gcloud import datastore
>>> gcd = datastore.Client(namespace='namespace')
>>> gcd.dataset_id
'mpeg-personal'
>>> existing_key = gcd.get(gcd.key('Test', 5629499534213120)).key
>>> existing_key
<Key[{'id': 5629499534213120, 'kind': 'Test'}], dataset=e~mpeg-personal>
>>> gcd.get(existing_key)
Traceback (most recent call last):
 File "<stdin>", line 1, in <module>
 File "/usr/local/lib/python3.4/dist-packages/gcloud/datastore/client.py", line 263, in get
 deferred=deferred)
 File "/usr/local/lib/python3.4/dist-packages/gcloud/datastore/client.py", line 293, in get_multi
 raise ValueError('Keys do not match dataset ID')
ValueError: Keys do not match dataset ID

It all comes down to that string comparison done in the lines my patch changes, which does a comparison ignoring dataset prefixes (as the API will accept an unprefixed dataset_id).

The unexpected behaviour for me was that I couldn't store the key that came back and reuse it.

@dhermes
Copy link
Contributor

dhermes commented Nov 18, 2015

Really great stacktrace, reproduced on my end! Thanks. Hopefully this will all be mitigated with v1beta3 but for now this fix is needed. Will comment on PR.

if ids != [self.dataset_id]:
raise ValueError('Keys do not match dataset ID')
ids = set(key.dataset_id for key in keys)
for ds_id in ids:

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@dhermes
Copy link
Contributor

dhermes commented Nov 18, 2015

So the only real issue with this is that we don't have a unit test to make sure the failure doesn't occur again. I can help with that if you like or you can give it a spin yourself?

I really appreciate the contribution.

@mpeg
Copy link
Contributor Author

mpeg commented Nov 18, 2015

Sure, I can add the unit test, I'm thinking to the gcloud.datastore.test_client tests, add a test that gets a mock entity with a prefixed dataset_id ?

creds = object()
client = self._makeOne(credentials=creds)
client.connection._add_lookup_result([entity_pb1,
entity_pb2,

This comment was marked as spam.

@mpeg
Copy link
Contributor Author

mpeg commented Nov 18, 2015

Added a test, styled after the test_get_multi_hit_multiple_keys_same_dataset and test_get_multi_hit_multiple_keys_different_dataset tests, as it's an extension of those, makes sure:

  • dataset ids that are prefixed with "e~" or "s~" don't raise the ValueError exception
  • prefixed dataset ids actually get the correct entity values (mostly in case of the mock classes changing in the future)

@dhermes
Copy link
Contributor

dhermes commented Nov 18, 2015

Really great test thanks! I made some comments but overall it is about perfect (which is not something you expect from a first contrib).

@dhermes
Copy link
Contributor

dhermes commented Nov 18, 2015

It just occurred to me that it'd be smart to have another similar test that does fail when the IDs disagree, e.g.

dataset_id1 = 'e~foo'
dataset_id2 = 's~bar'

@mpeg
Copy link
Contributor Author

mpeg commented Nov 18, 2015

Yeah I agree with your style comments (sorry about the pep8 failure-to-indent), didn't add a failing test like that since it's largely covered by test_get_multi_hit_multiple_keys_different_dataset but perhaps would be good in case it all changes in the future and the issue comes back.

@mpeg
Copy link
Contributor Author

mpeg commented Nov 18, 2015

Should all be good now, added the extra test case for e~foo and s~bar too.

Thanks for the help, hopefully first of many

Btw, I'm no git wizard but I can rebase my branch to squash the extra commits if you'd prefer, let me know

@dhermes
Copy link
Contributor

dhermes commented Nov 18, 2015

LGTM

Thanks again!

dhermes added a commit that referenced this pull request Nov 18, 2015
Fix datastore: unprefixed ids in client
@dhermes dhermes merged commit 6fdb5b1 into googleapis:master Nov 18, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants