Request for advice: how to set modifiedIndex of wait #2039

Closed
erictune opened this issue Jan 5, 2015 · 11 comments

erictune commented Jan 5, 2015

Over at Kubernetes, we just discovered coreos/fleet#408.

I see that fleet used coreos/fleet#411 as a fix.
If I understand that fix correctly, when the client tries to wait with a waitIndex that is too old and gets an EcodeEventIndexCleared error, it then tries idx = idx + 1 until it finds a suitable index.
I see that fix was subsequently replaced with a new implementation that doesn't use indexes (coreos/fleet@bdf5f72), though I don't quite follow how that works.
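
For what it's worth, here is how I picture that retry (purely illustrative; waitForChange and errIndexCleared are made-up stand-ins, not fleet's actual code):

    // Illustrative only: retry a wait whose index has fallen out of etcd's
    // event history by bumping the index until the server accepts it.
    idx := lastSeenIndex
    for {
        event, err := waitForChange(keyPrefix, idx)
        if err == errIndexCleared {
            idx++ // index rejected as too old; advance and retry
            continue
        }
        if err != nil {
            return err
        }
        handle(event)
        idx = event.ModifiedIndex + 1
    }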

What is the current best practice for a client that wants to maintain an eventually consistent copy of a subset of data stored in etcd, without excessive polling?

That is, I have a loop like this:

    while true {
        GET a bunch of keys I am interested in
        wait at some waitIndex
        handle event or error
    }

The tricky part is what waitIndex to use. It has to be large enough that it does not cause an EcodeEventIndexCleared error, but small enough that it does not miss changes that happened after the initial GET. As coreos/fleet#408 points out, the minimum correct waitIndex might be larger than any modifiedIndex in the GET response. Is there a way to get the maximum modifiedIndex? Obviously, I could GET every key, but if I am only interested in a few keys, that seems inefficient. Is there a way to just get the global maximum modifiedIndex? Is there a way to get it at the same time as I GET a subset of keys, so that I can be sure I am talking to the same replica?
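
For concreteness, the loop written against the v2 HTTP API looks roughly like this (just a sketch; the endpoint, port, and key are placeholders, and it uses the largest modifiedIndex from the GET response, which is exactly the part I am unsure about):

    // Sketch of the loop above against the etcd v2 HTTP API. Placeholder
    // endpoint/key; error handling is minimal.
    package main

    import (
        "encoding/json"
        "fmt"
        "net/http"
    )

    type node struct {
        Key           string `json:"key"`
        ModifiedIndex uint64 `json:"modifiedIndex"`
        Nodes         []node `json:"nodes"`
    }

    func maxIndex(n node) uint64 {
        max := n.ModifiedIndex
        for _, c := range n.Nodes {
            if m := maxIndex(c); m > max {
                max = m
            }
        }
        return max
    }

    func main() {
        base := "http://127.0.0.1:4001/v2/keys/registry"
        for {
            // GET the keys I am interested in.
            resp, err := http.Get(base + "?recursive=true")
            if err != nil {
                continue // real code would back off
            }
            var body struct {
                Node node `json:"node"`
            }
            json.NewDecoder(resp.Body).Decode(&body)
            resp.Body.Close()

            // Naive choice: one past the largest modifiedIndex we just saw.
            // Per coreos/fleet#408 this can lag etcd's current index by more
            // than the event history window and trigger EcodeEventIndexCleared.
            waitIndex := maxIndex(body.Node) + 1

            wresp, err := http.Get(fmt.Sprintf(
                "%s?wait=true&recursive=true&waitIndex=%d", base, waitIndex))
            if err != nil {
                continue
            }
            wresp.Body.Close()
            // handle the event or error, then loop around to GET again
        }
    }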

erictune changed the title from "Request for advice: how to set modifiedIndex of watch" to "Request for advice: how to set modifiedIndex of wait" on Jan 5, 2015
erictune (Author) commented Jan 5, 2015

@lavalamp

xiang90 (Contributor) commented Jan 5, 2015

@erictune @lavalamp I think the bottom half of this section should help explain the problem: https://github.com/coreos/etcd/blob/master/Documentation/2.0/api.md#waiting-for-a-change.

Also, we plan to make this a little easier to handle. Basically, etcd will keep all versions of the keys until the application asks etcd to compact them.
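
The pattern that section describes boils down to something like this (rough sketch; the endpoint and key are placeholders and error handling is simplified): take the X-Etcd-Index from the GET response, wait at X-Etcd-Index + 1, and when the wait fails with errorCode 401 (EcodeEventIndexCleared), re-GET and resume from the fresh index.

    import (
        "encoding/json"
        "fmt"
        "net/http"
        "strconv"
    )

    // Rough sketch: list, remember X-Etcd-Index, watch from there, and
    // re-list whenever the history window has been cleared under us.
    func watchLoop(base string, handle func(key string, index uint64)) error {
        for {
            // (Re)list to rebuild local state and learn the current etcd index.
            resp, err := http.Get(base + "?recursive=true")
            if err != nil {
                return err
            }
            etcdIndex, _ := strconv.ParseUint(resp.Header.Get("X-Etcd-Index"), 10, 64)
            resp.Body.Close() // real code would decode the body into local state

            waitIndex := etcdIndex + 1
            for {
                wresp, err := http.Get(fmt.Sprintf(
                    "%s?wait=true&recursive=true&waitIndex=%d", base, waitIndex))
                if err != nil {
                    return err
                }
                var ev struct {
                    ErrorCode int `json:"errorCode"`
                    Node      struct {
                        Key           string `json:"key"`
                        ModifiedIndex uint64 `json:"modifiedIndex"`
                    } `json:"node"`
                }
                err = json.NewDecoder(wresp.Body).Decode(&ev)
                wresp.Body.Close()
                if err != nil {
                    return err
                }
                if ev.ErrorCode == 401 { // EcodeEventIndexCleared
                    break // history was compacted past us; go re-list
                }
                handle(ev.Node.Key, ev.Node.ModifiedIndex)
                waitIndex = ev.Node.ModifiedIndex + 1
            }
        }
    }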

erictune (Author) commented Jan 5, 2015

Thanks, that explains it!

smarterclayton (Contributor) commented:

Even when applications can ask etcd to compact versions, the applications would still need to coordinate to ensure all watchers are reopened prior to that. It's possible that when the version compaction happens, existing watchers should be closed with a final event indicating what the latest etcdIndex is, and that's the value that those watchers should begin watching at.

I think it's extremely difficult for clients to solve this correctly without some sort of window notification coming from etcd. Even today, a watcher that has gone 1000 etcd events without receiving one of its own should really receive an update telling it that the window has moved and that it should resynchronize to the latest etcd index. That has to come in-band with the watch, since an out-of-band request can't ensure that the watcher's clock and the out-of-band requester's clock are aligned.

xiang90 (Contributor) commented Jan 5, 2015

Even when applications can ask etcd to compact versions, the applications would still need to coordinate to ensure all watchers are reopened prior to that.

As I said, we plan to make it easier to reason about. There is no way to completely solve the problem unless you have unlimited memory/disk. So you always need to prepare for a fresh restart.

It's possible that when the version compaction happens, existing watchers should be closed with a final event indicating what their latest update is, and that's the value that those watchers would begin watching at.

Compaction will not affect the current watcher. You would never compact into the future, or even near the current time.

Even today, a watcher that has gone 1000 etcd events without receiving one of its own should really receive an update telling it that the window has moved and that it should resynchronize to the latest etcd index.

Well, a simpler solution is to have an application-global timer to record the known progress of etcd. That can be shared among all your watchers. Per-watcher notification is not necessary, I think.
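
Something like this, for illustration (the names are mine, not an etcd API): every GET and every watcher records the highest X-Etcd-Index it has observed, and a watcher that needs to re-establish its wait can start from that shared value instead of its own stale one.

    import "sync"

    // Shared "known progress" of etcd for the whole application.
    type progress struct {
        mu    sync.Mutex
        index uint64 // highest etcd index observed anywhere in the app
    }

    // Observe records an index seen in any response header or event.
    func (p *progress) Observe(etcdIndex uint64) {
        p.mu.Lock()
        if etcdIndex > p.index {
            p.index = etcdIndex
        }
        p.mu.Unlock()
    }

    // Latest returns the best known progress, e.g. to pick a waitIndex
    // when re-establishing a watch.
    func (p *progress) Latest() uint64 {
        p.mu.Lock()
        defer p.mu.Unlock()
        return p.index
    }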

smarterclayton (Contributor) commented:

Good point, although what if the watcher lags on the server due to latency and GC? Without doing a single watch-all from the application, I don't know that individual calls to watch can reason about which events have been delivered.

We've talked about a single global watch on the master anyway, so maybe we can move to that and then provide the update mechanism over our own channel for our clients.
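
Roughly this shape, as a sketch (Event here is a made-up stand-in for a decoded etcd event, not an existing type):

    import "sync"

    // One goroutine owns the etcd watch and fans events out to in-process
    // subscribers over channels.
    type Event struct {
        Key           string
        ModifiedIndex uint64
    }

    type broadcaster struct {
        mu   sync.Mutex
        subs []chan Event
    }

    func (b *broadcaster) Subscribe() <-chan Event {
        ch := make(chan Event, 16)
        b.mu.Lock()
        b.subs = append(b.subs, ch)
        b.mu.Unlock()
        return ch
    }

    // publish is called by the single global watch loop for every event.
    func (b *broadcaster) publish(ev Event) {
        b.mu.Lock()
        defer b.mu.Unlock()
        for _, ch := range b.subs {
            select {
            case ch <- ev:
            default: // slow subscriber; real code would resync or drop it
            }
        }
    }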


xiang90 (Contributor) commented Jan 5, 2015

Good point, although what if the watcher lags on the server due to latency and GC? Without doing a single watch-all from the application, I don't know that individual calls to watch can reason about which events have been delivered.

I am talking about compacting history that was generated hours ago. The lag should be at most on the order of seconds, so you probably do not need to worry about it.

smarterclayton (Contributor) commented:

Ah, so as long as no watch in the app goes longer than the compaction window without waiting on an event (so the client detects hung connections before a compaction removes history), we can guarantee the window. I think that answers my other question about behavior in a changed store - thanks.

smarterclayton (Contributor) commented:

Would it be possible for the watch to heartbeat (potentially optionally) with an entry every X raft indices? Chosen well for that window, that would remove the need for more complex logic.

xiang90 (Contributor) commented Jan 6, 2015

@smarterclayton

Would it be possible for the watch to heartbeat (potentially optionally) with an entry every X raft indices?

This is doable. Can you please open an issue for this as a feature request? I am going to close this issue.

smarterclayton (Contributor) commented:

Opened #2048.

xiang90 closed this issue on Jan 6, 2015