Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remote caching #130

Closed
wants to merge 10 commits into from
Closed

Remote caching #130

wants to merge 10 commits into from

Conversation

takashi8
Copy link
Contributor

@takashi8 takashi8 commented Jul 1, 2019

PR for remote caching

Overview

This PR is based on the premise that we now cannot rely on any specific remote cache backend (unlike memcache for GAE standard 1st generation).
So this PR defines the adapter class (ndb.RemoteCacheAdapter) to allow developers to implement their own logic to handle actual operation with their backend. (memcache, redis, and so on)

Below is a sample adapter implementation for Cloud Memorystore.
https://gist.github.com/takashi8/06f2ab86451c6c96862e8f1d2e0b4966

Topics

This is kind of large change, and I felt like many things need discussion.

Threading

For simplicity I didn't implement any threading for remote cache operation.
Developers should implement background fetching, or it blocks current thread in eventloop.idle or eventloop.current. (Sample is just blocking)
If cache fetch is much faster than datastore fetch, it can be omitted, maybe.

Context.batch

Currently all batches are added to context.batches.
Maybe making dedicated property is better.

Ignored Features

I ignored several features that seems to be redundant or difficult to support.

  • Context.memcache_* functions
  • Options.use_datastore
  • Options.max_memcache_items
  • Options.memcache_deadline
  • memcache.MAX_VALUE_SIZE
  • Autobatcher.add_once

Renamed "memcache"

Now that remote cache is not always memcache, I replaced many "memcache" words with "remote_cache" for integrity.
Still there are room to

  • cascade them (e.g. apply use_memcache value to use_remote_cache automatically).
  • just define and raise NoLongerImplementedError.
  • do not replace and keep compatible interface as much as we can.

Thanks in advance

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jul 1, 2019
@andrewsg
Copy link
Contributor

andrewsg commented Jul 2, 2019

Hi Takashi,

Thanks again for your extensive work and interest. We're also proceeding with an implementation of memcache (Redis) caching at this time and unfortunately we won't be able to cleanly merge this PR as-is as it will conflict with our ongoing work. The engineer working on caching will try and incorporate your code into ours opportunistically without merging.

It might be a good idea to coordinate further work; if you could reach out via a github issue or any other way to let us know if you want to begin work on a major contribution like this, we may be able to avoid crossing wires in the future.

I'll close this PR for now, though we may still be able to pull some code into our branch when we're ready. Thanks again for your interest.

@andrewsg andrewsg closed this Jul 2, 2019
@takashi8
Copy link
Contributor Author

takashi8 commented Jul 3, 2019

Fine. I'll just stop work on this and look forward that your work will be done!
It was a good exercise.

And sure, I'll let you know beforehand at next time ;)

Thanks,

@chrisrossi
Copy link
Contributor

@takashi8 This is a very high quality PR. You had gotten further along on this than I have. I may end up stealing a good bit/most of this. Thank you!

I was going with "global cache" to refer to the cache, but hadn't decided if we should change the API yet, or not. We're trying to minimize migration pain, but it may be the right thing to do.

@takashi8
Copy link
Contributor Author

takashi8 commented Jul 9, 2019

@chrisrossi Thanks! I'm glad to hear that.

I agree that changing interface or not is quite difficult matter for this case, cause there should be huge apps using current interface... IMHO Cascade old name for old (with warning?) and prepare another named argument for new looks a good compromisation. The word "memcache" is too concrete and (for now) misleading...
"global cache" sounds better, but maybe like somewhat "per process" thing, like "global variable"...?

return result
if result is None:
yield remote_cache.cache_set_locked(self)
yield remote_cache.cache_start_cas(self)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm noticing here that when remote_cache_locked is True, we call cache_start_cas but then, below, we don't call cache_cas. If I'm reading this right, there's not really any reason to call cache_start_cas if it's locked, because we're not then going to try and call cache_cas. Let me know if I'm reading this incorrectly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remote_cache_locked actually means the result value is 0, so in this case remote_cache_locked should be False.

  • if result == Entity, cache is set, so return it.
  • if result == 0 (or b'0' or '0'), cache is set and locked, so do nothing for cache.
  • if result == None, cache is not set (so not locked), so start check and set session.

And this lock with 0 mechanism is quite fragile, because put() or delete() also set 0 value.
If any updating action started between cache_set_locked and cache_set, setting cache from get() should fail. So cache_start_cas and cache_cas do the work.

This came from original implementation in py27, and sounds reasonable for me.

entity_pb = _entity_to_protobuf(self)
key_pb = yield _datastore_api.put(entity_pb, _options)
if key_pb:
ds_key = helpers.key_from_protobuf(key_pb)
self._key = key_module.Key._from_ds_key(ds_key)

context = context_module.get_context()
if use_remote_cache and not context.in_transaction():
yield remote_cache.cache_delete(self.key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason not to go ahead and store the entity here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chrisrossi pushed a commit that referenced this pull request Aug 2, 2019
Makes use of @takashi8 's surprise #130 contribution.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants