etcdctl elect results in deadlock: won election never reported to winner, if key is a prefix of existing key #6278

Closed
glycerine opened this issue Aug 27, 2016 · 6 comments · Fixed by #6284

@glycerine
Contributor

It appears that a lone candidate cannot win some elections.

go1.7
osx 10.11.6 El Capitan
export ETCDCTL_API=3 in the env
etcd at tip, as of this writing, tip was: c388b2f22f1285e0a686d0594c85a053016df010

# terminal 1: start etcd
$ etcd
...

# terminal 2: create a key 'xyz'
$ etcdctl put xyz hello

# terminal 2: and run an election on key xy (note that xy is a prefix of xyz)
$ etcdctl elect xy xy-should-win
observed result: <hangs, no output, as if the election was lost>

expected result: output that the value xy-should-win is now the value of xy, so we know that we've won the election.

# terminal 3: query all values; note that xy-should-win has won the election, but the candidate was never notified of this fact
jaten@jatens-MacBook-Pro ~ $ etcdctl get a z
xy/694d56cc2663a25c
xy-should-win
xyz
hello
jaten@jatens-MacBook-Pro ~ $ 


additional note: when reproducing this with 3 etcd servers in a cluster, the following 2-line complaint is also typically issued in the etcd logs. Several examples follow:

------------------
14:59:16 etcd2 | 2016-08-27 14:59:16.233151 W | etcdserver: apply entries took too long [17.532448ms for 1 entries]
14:59:16 etcd2 | 2016-08-27 14:59:16.233171 W | etcdserver: avoid queries with large range/delete range!
-===============
15:04:24 etcd3 | 2016-08-27 15:04:24.750401 W | etcdserver: apply entries took too long [11.377838ms for 1 entries]
15:04:24 etcd3 | 2016-08-27 15:04:24.750421 W | etcdserver: avoid queries with large range/delete range!
-------------
15:04:58 etcd2 | 2016-08-27 15:04:58.998189 W | etcdserver: apply entries took too long [17.416764ms for 1 entries]
15:04:58 etcd2 | 2016-08-27 15:04:58.998202 W | etcdserver: avoid queries with large range/delete range!
==================
15:05:53 etcd3 | 2016-08-27 15:05:53.874651 W | etcdserver: apply entries took too long [11.660702ms for 1 entries]
15:05:53 etcd3 | 2016-08-27 15:05:53.874676 W | etcdserver: avoid queries with large range/delete range!
@xiang90
Contributor

xiang90 commented Aug 27, 2016

@glycerine The prefix conflict bit should be improved.

For the query warning issue, are you running etcd on HDD? It seems that the fsync takes more than 10ms. We probably will make the warning time longer to 100ms or so. At least, etcd should not spam warning on normal HDD.

@xiang90
Contributor

xiang90 commented Aug 27, 2016

/cc @heyitsanthony

@heyitsanthony
Contributor

The client code is watching on xy* instead of xy/* so it's picking up 'xyz', which is a bug. Mutexes are also affected. Also missing docs about what part of the keyspace it'll eat.
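The prefix mix-up described above can be modeled in a few lines of Go. This is a simplified in-memory sketch, not the clientv3 code: the `kv` type, the `blockers` function, and the revision numbers are all made up for illustration, and it assumes the candidate waits for every key under its prefix that was created earlier than its own key.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// kv is a simplified stand-in for an etcd key and its create revision.
type kv struct {
	key       string
	createRev int64
}

// blockers returns the keys that share the given prefix and were created
// before myRev. In this model, a candidate waits until the list is empty.
func blockers(store []kv, prefix string, myRev int64) []string {
	var out []string
	for _, e := range store {
		if strings.HasPrefix(e.key, prefix) && e.createRev < myRev {
			out = append(out, e.key)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Revision numbers are illustrative only.
	store := []kv{
		{key: "xyz", createRev: 2},                 // from `etcdctl put xyz hello`
		{key: "xy/694d56cc2663a25c", createRev: 3}, // the candidate's own key
	}

	// Matching on "xy": the unrelated key xyz looks like an earlier candidate.
	fmt.Println(blockers(store, "xy", 3))
	// Matching on "xy/": nothing is ahead of the candidate.
	fmt.Println(blockers(store, "xy/", 3))
}
```

Under these assumptions, the candidate hangs because `xyz` is never deleted, even though its own key is the only real entry in the election.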

@xiang90 xiang90 added this to the v3.1.0 milestone Aug 28, 2016
@glycerine
Contributor Author

> For the query warning issue, are you running etcd on HDD?

I'm running on SSD; stock Apple SSD 512G.

@xiang90
Contributor

xiang90 commented Aug 28, 2016

@glycerine Just noticed that you were running 3 members on the same osx machine. Probably there are IO or CPU contentions. It is unrelated to this bug.

@glycerine
Contributor Author

> Just noticed that you were running 3 members on the same osx machine. Probably there are IO or CPU contentions. It is unrelated to this bug.

Agreed. Let's ignore that part.

glycerine added a commit to glycerine/etcd that referenced this issue Aug 28, 2016
slash after the key prefix. The result was deadlock due to
waiting on wrong keys.

Fixes etcd-io#6278
glycerine added a commit to glycerine/etcd that referenced this issue Aug 29, 2016
After winning an election or obtaining a lock, we
auto-append a slash after the provided key prefix.
This avoids the previous deadlock due to waiting
on the wrong key.

Fixes etcd-io#6278
glycerine added a commit to glycerine/etcd that referenced this issue Aug 30, 2016
After winning an election or obtaining a lock, we
auto-append a slash after the provided key prefix.
This avoids the previous deadlock due to waiting
on the wrong key.

Fixes etcd-io#6278
gyuho pushed a commit that referenced this issue Aug 31, 2016
After winning an election or obtaining a lock, we
auto-append a slash after the provided key prefix.
This avoids the previous deadlock due to waiting
on the wrong key.

Fixes #6278

Conflicts:
	clientv3/concurrency/election.go
	clientv3/concurrency/mutex.go
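
The approach described in the commit message, appending a slash to the provided key prefix, might look roughly like the sketch below. The helper names `electionPrefix` and `candidateKey` are hypothetical and do not come from the etcd source, and the key layout (prefix, slash, lease ID in hex) is inferred from the `etcdctl get` output in the report.

```go
package main

import "fmt"

// electionPrefix (hypothetical) appends the separator so that only keys
// under "<pfx>/" are treated as part of the election or lock.
func electionPrefix(pfx string) string {
	return pfx + "/"
}

// candidateKey (hypothetical) builds a candidate's key from the normalized
// prefix and a lease ID rendered in hexadecimal.
func candidateKey(pfx string, leaseID int64) string {
	return fmt.Sprintf("%s%x", electionPrefix(pfx), leaseID)
}

func main() {
	// The lease ID is taken from the key shown in the report.
	fmt.Println(candidateKey("xy", 0x694d56cc2663a25c))
}
```

Because `xyz` does not start with `xy/`, a watch or range over the normalized prefix no longer sees it. Note that this sketch does not collapse a trailing slash the caller may already have supplied.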
kragniz added a commit to kragniz/etcd that referenced this issue Oct 13, 2016
When running test suites for a client locally I'm getting spammed by log
lines such as:

    etcdserver: apply entries took too long [14.226771ms for 1 entries]

The comments in etcd-io#6278 mention there were future plans of increasing the
threshold for logging these warnings, but it hadn't been done yet.