
Node reconnection causes modified index to jump by 1 #7253

Closed
dansimone opened this issue Jan 30, 2017 · 5 comments

Comments

@dansimone

I've noticed a situation where the modified index of a key increases by 1 more than expected:

  1. Cluster consists of Leader, Follower1, Follower2.
  2. Set Key1 on Leader, the current modified index is X.
  3. Kill Follower1.
  4. Set Key1 on Leader, the current modified index is X+1.
  5. Restart Follower1.
  6. Set Key1 on Leader. The expected modified index is X+2. However, the actual modified index is X+3.

I'm pretty sure what's happening is that the internals of the Raft algorithm trigger an implicit 'set' when resyncing the state of Key1 on Follower1 in step 6. This could be interpreted as expected behavior, but here is why I think it is wrong: regardless of the state of the cluster behind the scenes, the modified index is part of the etcd public API. There could easily be a client out there running a waitForIndex=X+3 after step 4, expecting that call not to return until two more updates to Key1 have been made. The fact that a cluster member went offline and reconnected should be completely transparent to that client.
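The suspected mechanism can be sketched as a toy model (an illustration only, not etcd's actual code): every committed Raft entry applied to the store advances the single cluster-wide etcd index, whether it came from a client PUT or from an internal write such as a rejoining member re-registering itself, and a key's modifiedIndex is simply the index value at the moment it was written.

```python
# Toy model of the v2 etcd index (illustration only, not etcd source).
class ToyStore:
    def __init__(self, index=0):
        self.index = index   # single cluster-wide etcd index
        self.keys = {}       # key -> modifiedIndex

    def apply(self, key=None):
        """Apply one committed entry; internal entries pass key=None."""
        self.index += 1
        if key is not None:
            self.keys[key] = self.index
        return self.index

store = ToyStore(index=139846)
store.apply("Key1")                  # step 2: user PUT -> X
x = store.keys["Key1"]
store.apply("Key1")                  # step 4: user PUT -> X+1
store.apply()                        # step 5: follower rejoins; an internal
                                     # write consumes one index on its own
store.apply("Key1")                  # step 6: user PUT
assert store.keys["Key1"] == x + 3   # observed X+3 instead of X+2
```

With the starting index set to 139846, this reproduces the exact sequence seen below: 139847, 139848, then a jump to 139850.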

Reproducer steps using the Etcd Docker image:

  1. Start the cluster:
export IP=<docker_host_ip>
  
# Create data volume dirs
mkdir /tmp/data0
mkdir /tmp/data1
mkdir /tmp/data2
  
# Start Leader1 (run this in its own terminal)
docker run --rm -p 4001:4001 -p 2380:2380 -p 2379:2379 -v /tmp/data0:/data --name etcd0 quay.io/coreos/etcd:v3.0.6 /usr/local/bin/etcd -name etcd0 -advertise-client-urls http://$IP:2379,http://$IP:4001 -listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 -initial-advertise-peer-urls http://$IP:2380 -listen-peer-urls http://0.0.0.0:2380 -initial-cluster-token etcd-cluster-1 -initial-cluster etcd0=http://$IP:2380,etcd1=http://$IP:3380,etcd2=http://$IP:4380 --data-dir=/data
  
# Start Follower1 (run this in its own terminal)
docker run --rm -p 5001:4001 -p 3380:2380 -p 3379:2379 -v /tmp/data1:/data --name etcd1 quay.io/coreos/etcd:v3.0.6 /usr/local/bin/etcd -name etcd1 -advertise-client-urls http://$IP:3379,http://$IP:5001 -listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 -initial-advertise-peer-urls http://$IP:3380 -listen-peer-urls http://0.0.0.0:2380 -initial-cluster-token etcd-cluster-1 -initial-cluster etcd0=http://$IP:2380,etcd1=http://$IP:3380,etcd2=http://$IP:4380 --data-dir=/data
  
# Start Follower2 (run this in its own terminal)
docker run --rm -p 7001:4001 -p 4380:2380 -p 4379:2379 -v /tmp/data2:/data --name etcd2 quay.io/coreos/etcd:v3.0.6 /usr/local/bin/etcd -name etcd2 -advertise-client-urls http://$IP:4379,http://$IP:7001 -listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 -initial-advertise-peer-urls http://$IP:4380 -listen-peer-urls http://0.0.0.0:2380 -initial-cluster-token etcd-cluster-1 -initial-cluster etcd0=http://$IP:2380,etcd1=http://$IP:3380,etcd2=http://$IP:4380 --data-dir=/data
  2. Initialize Key1:
curl -X PUT http://$IP:2379/v2/keys/message -d value="test"
{"action":"set","node":{"key":"/message","value":"test","modifiedIndex":139847,"createdIndex":139847},"prevNode":{"key":"/message","value":"test","modifiedIndex":139846,"createdIndex":139846}}
  3. Kill Follower1 (Ctrl-C in the Follower1 terminal).
  4. Do another set on Key1:
curl -X PUT http://$IP:2379/v2/keys/message -d value="test"
{"action":"set","node":{"key":"/message","value":"test","modifiedIndex":139848,"createdIndex":139848},"prevNode":{"key":"/message","value":"test","modifiedIndex":139847,"createdIndex":139847}}
  5. Bring back Follower1:
docker run --rm -p 5001:4001 -p 3380:2380 -p 3379:2379 -v /tmp/data1:/data --name etcd1 quay.io/coreos/etcd:v3.0.6 /usr/local/bin/etcd -name etcd1 -advertise-client-urls http://$IP:3379,http://$IP:5001 -listen-client-urls http://0.0.0.0:2379,http://0.0.0.0:4001 -initial-advertise-peer-urls http://$IP:3380 -listen-peer-urls http://0.0.0.0:2380 -initial-cluster-token etcd-cluster-1 -initial-cluster etcd0=http://$IP:2380,etcd1=http://$IP:3380,etcd2=http://$IP:4380 --data-dir=/data
  6. Do another set on Key1:
curl -X PUT http://$IP:2379/v2/keys/message -d value="test"
{"action":"set","node":{"key":"/message","value":"test","modifiedIndex":139850,"createdIndex":139850},"prevNode":{"key":"/message","value":"test","modifiedIndex":139848,"createdIndex":139848}}
@gyuho
Contributor

gyuho commented Jan 30, 2017

I think this is expected; when the follower comes back, it registers itself to the v2 store with member attributes like:

/0/members/1640829d9eea5cfb/attributes false {"name":"my-etcd-2","clientURLs":["http://localhost:22379"]}

v3 has a separate kv space, so v3 indexes won't be affected by this.

Defer to @xiang90 for double-checking?

@xiang90
Contributor

xiang90 commented Jan 30, 2017

@dansimone There are no guarantees on the v2 index other than that it is increasing. It is actually an etcd index, not a store index. For v3, we have a store revision, which users can rely on.
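The distinction above can be sketched as a toy model (a conceptual illustration, not etcd source): the v2 etcd index and the v3 store revision are independent counters, so an internal v2-store write such as member-attribute registration moves the v2 index but leaves v3 revisions dense.

```python
# Toy sketch: v2 etcd index vs. v3 store revision as independent counters.
class ToyCluster:
    def __init__(self):
        self.v2_index = 0     # bumped by every applied v2-store entry
        self.v3_revision = 0  # bumped only by v3 KV writes

    def internal_member_write(self):
        """E.g. a rejoining follower re-registering its attributes."""
        self.v2_index += 1

    def v3_put(self):
        self.v3_revision += 1
        return self.v3_revision

c = ToyCluster()
c.v3_put()                   # revision 1
c.internal_member_write()    # the v2 index advances...
assert c.v3_put() == 2       # ...but v3 revisions stay consecutive
```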

@xiang90
Contributor

xiang90 commented Jan 31, 2017

expecting this to not return until 2 updates to Key1 have been made.

And this is only true if your system has only one client. Once your etcd cluster has multiple concurrent clients, it won't hold. I do not think this is a good way to think about the index. Do not rely on the index to count the number of updates to one key.

I am closing this issue since the v2 API is pretty much frozen. We only do bug fixes.

@xiang90 xiang90 closed this as completed Jan 31, 2017
@jlamillan
Contributor

If the system can arbitrarily increase modifiedIndex on its own, how can we reliably check / watch whether a key has been updated without reading the actual value?

@xiang90
Contributor

xiang90 commented Jan 31, 2017

whether a key has been updated without reading the actual value?

The key has a modified index. You can check that.
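In other words, rather than predicting a future index arithmetically, a client can remember the modifiedIndex it last observed for the key and compare it against the current one. A minimal sketch, using a hypothetical helper over the v2 JSON node shape rather than any real etcd client API:

```python
# Hypothetical change-detection helper: compare the key's current
# modifiedIndex against the last value this client observed.
def has_changed(last_seen_index, current_node):
    """current_node mimics the 'node' object in a v2 JSON response."""
    return current_node["modifiedIndex"] > last_seen_index

node = {"key": "/message", "value": "test", "modifiedIndex": 139850}
assert has_changed(139848, node)      # modified since we last looked
assert not has_changed(139850, node)  # no newer modification
```

This stays correct even when internal cluster writes make the index jump, because it only asks "has this key moved past what I saw?", never "how many times was it written?".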
