Request for advice: how to set modifiedIndex of wait #2039
Comments
@erictune @lavalamp I think the bottom half of this section should help explain the problem: https://github.com/coreos/etcd/blob/master/Documentation/2.0/api.md#waiting-for-a-change. Also, we plan to make this a little easier to handle. Basically, etcd keeps all the versions of the keys until the application asks etcd to compact the versions. |
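As a rough illustration of what that section of the v2 API docs describes: each response carries an `X-Etcd-Index` header with the store's current index, and a watch that follows a GET can start at that value plus one. The sketch below is not taken from etcd or any client library. The function names and the base URL in the test are made up for illustration.

```python
def next_wait_index(headers):
    """Return the waitIndex to use after a GET, given its response headers.

    Assumes the v2 behaviour described in the linked docs: the response
    carries an X-Etcd-Index header holding the store's current index.
    """
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    etcd_index = int(lowered["x-etcd-index"])
    # Everything up to etcd_index is already reflected in the GET result,
    # so the watch starts one past it.
    return etcd_index + 1


def watch_url(base_url, key, wait_index):
    """Build a v2 long-poll URL for `key`, starting at `wait_index`."""
    return f"{base_url}/v2/keys/{key.lstrip('/')}?wait=true&waitIndex={wait_index}"
```

The point of using the header rather than a node's `modifiedIndex` is that the header reflects the whole store, not just the keys that were fetched.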
Thanks, that explains it! |
Even when applications can ask etcd to compact versions, the applications would still need to coordinate to ensure all watchers are reopened prior to that. It's possible that when the version compaction happens, existing watchers should be closed with a final event indicating what the latest etcdIndex is, and that's the value that those watchers should begin watching at. I think it's extremely difficult for clients to solve this correctly without some sort of window notification coming from etcd. Even today, a watcher that sees 1000 events go by without a matching event should really receive an update that tells it the window has moved and that it should resynchronize to the latest etcd-index. That has to come in-band with the watch (since an out-of-band request can't ensure that the watcher's clock and the out-of-band requester's clock are aligned). |
As I said, we plan to make it easier to reason about. There is no way to completely solve the problem unless you have unlimited memory/disk, so you always need to be prepared for a fresh restart.
Compaction will not affect the current watcher. You would never compact into the future, or even close to the current time.
Well, a simpler solution is to have an application-global timer that records the known progress of etcd. That can be shared among all your watchers. Per-watcher notification is not necessary, I think. |
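One way to read the "application global" progress record suggested here is a small thread-safe tracker that every watcher in the process updates and consults. This is only a sketch under that reading. The class and method names are hypothetical and not part of etcd or any client library.

```python
import threading


class SharedProgress:
    """Process-wide record of the highest etcd index any watcher has seen."""

    def __init__(self):
        self._lock = threading.Lock()
        self._latest = 0

    def observe(self, index):
        """Record an index taken from a response or event; never moves backwards."""
        with self._lock:
            if index > self._latest:
                self._latest = index

    def resume_index(self):
        """Index a watcher that is (re)opening its watch should wait from."""
        with self._lock:
            return self._latest + 1
```

A watcher on a quiet key can then reopen its watch from `resume_index()` instead of its own, possibly very old, last-seen index.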
We've talked about a single global watch anyway on the master, so maybe we can move to that and then provide the update mechanism in our own channel for our clients.
|
I am talking about compacting history that was generated hours ago. A watcher's lag should be on the order of seconds at most, so you probably do not need to worry about it. |
Ah, so as long as no watch in the app goes without waiting on an event longer than the compaction window (so the client detects hung connections prior to a compaction removing history) we can guarantee the window. I think that answers my other question about behavior in changed store - thanks.
|
Would it be possible for the watch to heartbeat (potentially optionally) with an entry every X raft indices? Chosen well for that window, that would remove the need for more complex logic.
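To make the proposal concrete, here is a sketch of how a client might consume such heartbeats if they existed. The heartbeat entry shape (`{"action": "heartbeat", "index": N}`) is entirely hypothetical. Nothing in this thread says etcd emits such entries, and the flattened event dictionaries are a simplification for illustration.

```python
def consume(events, start_index):
    """Split a watch stream into real changes and a resume index.

    `events` is an iterable of dicts with an "index" key. Entries whose
    "action" is "heartbeat" (a hypothetical shape) carry no change and
    only advance the resume index.
    """
    resume = start_index
    changes = []
    for event in events:
        # Both heartbeats and real changes prove progress up to event["index"].
        resume = max(resume, event["index"] + 1)
        if event.get("action") != "heartbeat":
            changes.append(event)
    return changes, resume
```

With this shape, a watcher on an idle key keeps its resume index close to the store's current index without issuing any extra requests.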
|
This is doable. Can you please open an issue for this as a feature request? I am going to close this issue. |
Opened #2048.
|
Over at Kubernetes, we just discovered coreos/fleet#408.
I see that fleet used coreos/fleet#411 as a fix.
If I understand that fix correctly, when the client tries to wait with a `waitIndex` which is too old and gets an `EcodeEventIndexCleared` event, it then tries `idx = idx + 1` until it finds a suitable index.
I see that was subsequently replaced with a new implementation that doesn't use indexes (coreos/fleet@bdf5f72), though I don't quite follow how that works.
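The increment-and-retry behaviour described above can be sketched as follows. This is an illustration of the description in this comment, not fleet's actual code. `try_watch` is a hypothetical injected callable, and `EventIndexCleared` is a stand-in for etcd's `EcodeEventIndexCleared` error.

```python
class EventIndexCleared(Exception):
    """Stand-in for etcd's EcodeEventIndexCleared error."""


def probe_forward(try_watch, idx, max_attempts=10_000):
    """Retry the watch with idx, idx + 1, ... until one is accepted.

    `try_watch(idx)` is assumed to return an event, or raise
    EventIndexCleared when idx has fallen out of etcd's history.
    """
    for _ in range(max_attempts):
        try:
            return idx, try_watch(idx)
        except EventIndexCleared:
            idx += 1  # index too old: step forward by one and try again
    raise RuntimeError("no watchable index found")
```

Stepping one index at a time costs one request per probe, which gets expensive when the client is far behind.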
What is the current best practice for a client that wants to maintain an eventually consistent copy of a subset of data stored in etcd, without excessive polling?
That is, I have a loop like this
The tricky part is, what `waitIndex` to use? It has to be large enough so that it does not cause an `EcodeEventIndexCleared`, but small enough so that it does not miss changes that happened after the initial GET. As coreos/fleet#408 points out, the minimum correct `waitIndex` might be larger than any `modifiedIndex` in the GET response. Is there a way to get the maximum `modifiedIndex`? Obviously, I could GET every key, but if I am only interested in a few keys, that seems inefficient. Is there a way to just get the global maximum `modifiedIndex`? Is there a way to get it at the same time as I GET a subset of keys, so that I can be sure I am talking to the same replica?
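The GET-then-watch loop described in the question can be sketched as below. It assumes the GET also returns the store's current index (the `X-Etcd-Index` response header in the v2 API). All three callables are hypothetical placeholders for real HTTP calls, events are flattened to `key`/`value`/`modifiedIndex` for brevity, and `EventIndexCleared` stands in for `EcodeEventIndexCleared`.

```python
class EventIndexCleared(Exception):
    """Stand-in for etcd's EcodeEventIndexCleared error."""


def sync(get_subset, wait_for_change, apply_event, max_events):
    """Keep a local copy of a subset of keys up to date.

    get_subset()         -> (snapshot dict, etcd index at the time of the GET)
    wait_for_change(idx) -> next event at or after idx, or raises EventIndexCleared
    apply_event(snapshot, event) mutates the snapshot in place.
    Stops after max_events applied events so the sketch terminates.
    """
    snapshot, etcd_index = get_subset()
    wait_index = etcd_index + 1
    applied = 0
    while applied < max_events:
        try:
            event = wait_for_change(wait_index)
        except EventIndexCleared:
            # History was cleared under us: start over from a fresh GET.
            snapshot, etcd_index = get_subset()
            wait_index = etcd_index + 1
            continue
        apply_event(snapshot, event)
        wait_index = event["modifiedIndex"] + 1
        applied += 1
    return snapshot
```

Because the snapshot and its index come from the same response, the starting `waitIndex` does not depend on the `modifiedIndex` of any individual key. A real loop would also want backoff on repeated failures, which is omitted here.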