Remote caching #130
Conversation
Hi Takashi,

Thanks again for your extensive work and interest. We're also proceeding with an implementation of memcache (Redis) caching at this time, and unfortunately we won't be able to cleanly merge this PR as-is, as it will conflict with our ongoing work. The engineer working on caching will try to incorporate your code into ours opportunistically without merging.

It might be a good idea to coordinate further work; if you could reach out via a GitHub issue or any other way to let us know when you want to begin work on a major contribution like this, we may be able to avoid crossing wires in the future.

I'll close this PR for now, though we may still be able to pull some code into our branch when we're ready. Thanks again for your interest.
Fine. I'll stop work on this and look forward to seeing your work completed! And sure, I'll let you know beforehand next time ;) Thanks,
@takashi8 This is a very high quality PR. You had gotten further along on this than I have; I may end up stealing a good bit, or most, of this. Thank you! I was going with "global cache" to refer to the cache, but hadn't decided yet whether we should change the API. We're trying to minimize migration pain, but it may be the right thing to do.
@chrisrossi Thanks! I'm glad to hear that. I agree that whether or not to change the interface is quite a difficult matter in this case, because there must be huge apps using the current interface... IMHO, keeping the old name for the old behavior (with a deprecation warning?) and introducing a new named argument for the new behavior looks like a good compromise. The word "memcache" is too concrete and (for now) misleading...
    return result
if result is None:
    yield remote_cache.cache_set_locked(self)
    yield remote_cache.cache_start_cas(self)
I'm noticing here that when `remote_cache_locked` is `True`, we call `cache_start_cas` but then, below, we don't call `cache_cas`. If I'm reading this right, there's not really any reason to call `cache_start_cas` if it's locked, because we're not then going to try and call `cache_cas`. Let me know if I'm reading this incorrectly.
`remote_cache_locked` actually means the result value is 0, so in this case `remote_cache_locked` should be `False`.
- if result == Entity, cache is set, so return it.
- if result == 0 (or b'0' or '0'), cache is set and locked, so do nothing for cache.
- if result == None, cache is not set (so not locked), so start check and set session.
And this lock-with-0 mechanism is quite fragile, because put() or delete() also sets the 0 value.
If any updating action starts between cache_set_locked and cache_set, setting the cache from get() should fail. That is what cache_start_cas and cache_cas are for.
This came from the original py27 implementation, and sounds reasonable to me.
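The lock-with-0 / check-and-set flow described above can be sketched with a tiny in-memory cache. All names here (`TinyCasCache`, `start_cas`, `cas`, the `LOCKED` sentinel) are illustrative stand-ins for the idea, not python-ndb's actual API:

```python
# Minimal in-memory sketch of the lock-with-0 / check-and-set pattern.
LOCKED = b"0"  # sentinel written by put()/delete() to invalidate readers


class TinyCasCache:
    def __init__(self):
        self._data = {}
        self._watch = {}  # key -> value snapshot taken by start_cas

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def start_cas(self, key):
        # Remember the value we saw; cas() only succeeds if it is unchanged.
        self._watch[key] = self._data.get(key)

    def cas(self, key, value):
        if self._data.get(key) != self._watch.get(key):
            return False  # someone (e.g. put() writing LOCKED) got in between
        self._data[key] = value
        return True


cache = TinyCasCache()

# get() path: cache miss, start a CAS session before fetching from datastore.
cache.start_cas("k")
# Meanwhile a concurrent put() invalidates the key with the 0 sentinel.
cache.set("k", LOCKED)
# The get() path now tries to store the (stale) entity it fetched; CAS fails.
assert cache.cas("k", b"stale entity") is False
assert cache.get("k") == LOCKED
```

This is why a plain `set` after the fetch would be unsafe: only the CAS session can detect the intervening write.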
entity_pb = _entity_to_protobuf(self)
key_pb = yield _datastore_api.put(entity_pb, _options)
if key_pb:
    ds_key = helpers.key_from_protobuf(key_pb)
    self._key = key_module.Key._from_ds_key(ds_key)

context = context_module.get_context()
if use_remote_cache and not context.in_transaction():
    yield remote_cache.cache_delete(self.key)
Any reason not to go ahead and store the entity here?
This is just a copy from the py27 implementation:
https://github.com/GoogleCloudPlatform/datastore-ndb-python/blob/master/ndb/context.py#L684
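The trade-off being discussed here, invalidating the cache entry on write (as the py27-derived code above does) versus writing the new entity through to the cache, can be sketched as follows. Both helpers are hypothetical and only illustrate the two strategies:

```python
# Two write-path strategies for keeping a remote cache coherent with the
# datastore. Names and structures are illustrative, not python-ndb code.
cache = {}


def put_with_invalidation(key, entity, datastore):
    datastore[key] = entity
    cache.pop(key, None)  # next get() repopulates via the CAS protocol


def put_with_write_through(key, entity, datastore):
    datastore[key] = entity
    cache[key] = entity  # risks a concurrent writer racing this cache write


ds = {}
cache["k"] = "old"
put_with_invalidation("k", "v1", ds)
assert "k" not in cache  # cache emptied, repopulated lazily by get()

put_with_write_through("k", "v2", ds)
assert cache["k"] == "v2"  # cache filled eagerly, but unprotected by CAS
```

Invalidation is the more conservative choice because, combined with the check-and-set session, it cannot leave a stale entity in the cache.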
PR for remote caching
Overview
This PR is based on the premise that we now cannot rely on any specific remote cache backend (unlike memcache for GAE standard 1st generation).
So this PR defines an adapter class (`ndb.RemoteCacheAdapter`) that allows developers to implement their own logic for the actual operations against their backend (memcache, Redis, and so on). Below is a sample adapter implementation for Cloud Memorystore.
https://gist.github.com/takashi8/06f2ab86451c6c96862e8f1d2e0b4966
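To make the adapter idea concrete, here is a minimal in-process sketch of what a backend adapter might look like. The method names (`cache_get`, `cache_set`, `cache_delete`) are assumptions loosely based on the calls visible in this PR, not the actual `ndb.RemoteCacheAdapter` interface; see the linked gist for a real Cloud Memorystore implementation.

```python
# Hypothetical backend adapter using a plain dict in place of memcache/Redis.
class DictCacheAdapter:
    """A trivial in-process stand-in for a remote cache backend."""

    def __init__(self):
        self._store = {}

    def cache_get(self, keys):
        # Return the cached value (or None) for each key, in order.
        return [self._store.get(k) for k in keys]

    def cache_set(self, items):
        # items: mapping of cache key -> serialized entity bytes.
        self._store.update(items)

    def cache_delete(self, keys):
        for k in keys:
            self._store.pop(k, None)


adapter = DictCacheAdapter()
adapter.cache_set({"Key('A', 1)": b"entity-a"})
print(adapter.cache_get(["Key('A', 1)", "Key('B', 2)"]))
# → [b'entity-a', None]
```

A real adapter would replace the dict operations with network calls (and handle serialization, timeouts, and the locked/CAS protocol discussed above).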
Topics
This is a fairly large change, and I felt that many things need discussion.
Threading
For simplicity I didn't implement any threading for remote cache operations.
Developers should implement background fetching; otherwise the cache call blocks the current thread in eventloop.idle or eventloop.current. (The sample just blocks.)
If the cache fetch is much faster than the datastore fetch, this can perhaps be omitted.
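The "background fetching" idea can be sketched by running the blocking cache call on a worker thread, so the single-threaded event loop is not stalled while waiting on the network. Everything here (the pool, `cache_get_async`, the fake backend) is illustrative and not part of this PR:

```python
# Sketch: offload a blocking remote-cache call to a worker thread so the
# event loop can keep running and merely poll the future when idle.
from concurrent.futures import ThreadPoolExecutor
import time

_pool = ThreadPoolExecutor(max_workers=4)


def blocking_cache_get(key):
    time.sleep(0.01)  # stands in for a network round trip to memcache/Redis
    return {"k1": b"cached"}.get(key)


def cache_get_async(key):
    # Returns immediately; an event loop could check future.done() in its
    # idle callback instead of blocking on the network here.
    return _pool.submit(blocking_cache_get, key)


future = cache_get_async("k1")
# ... the event loop keeps servicing other callbacks here ...
print(future.result())  # → b'cached'
```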
Context.batch
Currently all batches are added to context.batches.
Maybe a dedicated property would be better.
Ignored Features
I ignored several features that seem to be redundant or difficult to support.
Renamed "memcache"
Now that the remote cache is not always memcache, I replaced many "memcache" occurrences with "remote_cache" for consistency.
There is still room to
Thanks in advance