fix unintended deadlock on key prefixes #6284
@xiang90, @heyitsanthony The CI tests seem to fail for "package not found" reasons that I don't understand and that seem unrelated to the patch.
@glycerine Sorry for the inconvenience. The current master is broken due to a testing issue. We need to get that fixed first, tomorrow.
@xiang90 -- no worries. As for the patch, I was wondering -- would it be better to push the change into the function waitDeletes(), so that any other future caller wouldn't make the same mistake? I didn't see any other uses of waitDeletes(), but I also can't tell if there are other expectations/legacy users I'm unaware of.
I think we can actually add I even think we probably should add prefixes like It would be helpful if you can also write a test case for this in https://github.com/coreos/etcd/blob/master/integration/v3_election_test.go. With all that being said, @heyitsanthony knows this part better than me and might provide better opinions.
@glycerine thanks for the patch. I agree with @xiang90 that it'd be cleaner to add the "/" to the prefix in Also please fix the commit title or CI won't accept it (the error is @xiang90 I feel like prefixing with |
Done:
TravisCI passes. Semaphoreci points out that TestTxnReadRetry and TestWatchReconnRequest are flaky tests, independent of this change. @xiang90, @heyitsanthony: ready to merge?
// Also a good basic test of single candidate election
// and simultaneous correct observation of an election.
//
func TestElectionOnPrefixOfExistingKey(t *testing.T) {
This test is more complicated than it needs to be. I think this should trigger the old, broken behavior with a lot less code:
clus := NewClusterV3(t, &ClusterConfig{Size: 1})
defer clus.Terminate(t)
cli := clus.RandClient()
// Create a key that has the election name "test" as a prefix.
if _, err := cli.Put(context.TODO(), "testa", "value"); err != nil {
	t.Fatal(err)
}
s, serr := concurrency.NewSession(cli)
if serr != nil {
	t.Fatal(serr)
}
e := concurrency.NewElection(s, "test")
ctx, cancel := context.WithTimeout(context.TODO(), 5*time.Second)
err := e.Campaign(ctx, "abc")
cancel()
if err != nil {
	// With the old behavior, Campaign blocks until the 5s timeout expires.
	t.Fatal(err)
}
@glycerine Can you rebase onto master? We just fixed the CI issue on master. Sorry about the inconvenience.
@heyitsanthony Cool. This is really a pretty small change and obviously you know the code much better than me. Why don't you go ahead and merge your test version and fix it per your preference?
@glycerine are you abandoning this PR or what? Just update the test and I can merge this...? |
After winning an election or obtaining a lock, we auto-append a slash after the provided key prefix. This avoids the previous deadlock due to waiting on the wrong key. Fixes etcd-io#6278
@heyitsanthony: Okay, rebased and updated the test.
lgtm |
@glycerine Thanks for the contribution! |
cool -- @heyitsanthony one quick question for future reference so I understand how these tests and context interact -- why is cancel() called after the Campaign() call? The Campaign() has already succeeded or timed out by that point, so why is it needed?
@glycerine That is the intended behavior for releasing the resource associated with the context. See the doc in context pkg: https://godoc.org/golang.org/x/net/context#WithCancel |
After winning a campaign, the call to waitDeletes() was missing a
slash after the key prefix. The result was a deadlock from waiting
on the wrong keys whenever a prior key exists that has the new key
as a prefix.
Fixes #6278
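
The prefix problem described above can be sketched without etcd. The example below assumes a flat keyspace and a hypothetical election key `test/1a2b`; `keysWithPrefix` is an illustrative helper, not an etcd API. It shows why a prefix query on `test` also picks up the unrelated key `testa`, while a query on `test/` does not.

```go
package main

import (
	"fmt"
	"strings"
)

// keysWithPrefix mimics a prefix range query over a flat keyspace:
// it returns every key that begins with prefix.
func keysWithPrefix(keys []string, prefix string) []string {
	var out []string
	for _, k := range keys {
		if strings.HasPrefix(k, prefix) {
			out = append(out, k)
		}
	}
	return out
}

func main() {
	// "testa" is an unrelated user key; "test/1a2b" is a hypothetical
	// key created by a candidate in the election named "test".
	keys := []string{"testa", "test/1a2b"}

	// Without the trailing slash, the unrelated key is matched too, so a
	// waiter would block on a key that no candidate will ever delete.
	fmt.Println(keysWithPrefix(keys, "test"))

	// With the trailing slash, only keys under the election are matched.
	fmt.Println(keysWithPrefix(keys, "test/"))
}
```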