
clientv3/integration: add TestBalancerUnderNetworkPartitionWatch #8762

Merged · 1 commit · Oct 27, 2017

Conversation

@gyuho (Contributor) commented on Oct 25, 2017

@xiang90 Should we also test the case where the leader becomes isolated?

This only tests:

  1. a follower gets isolated
  2. WithRequireLeader detects the loss of the leader
  3. the watch stream is closed with an error (see the sketch below)

[EDIT] added a test case where the leader is isolated as well

#8711
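
For reference, a rough sketch of the assertion shape described above (not the PR's exact code; the partition and endpoint-pinning setup is omitted, the helper name expectNoLeader, the key "foo", and the 3-second timeout are illustrative assumptions, and the import paths are the 2017-era github.com/coreos/etcd ones):

import (
	"context"
	"testing"
	"time"

	"github.com/coreos/etcd/clientv3"
	"github.com/coreos/etcd/etcdserver/api/v3rpc/rpctypes"
)

// expectNoLeader asserts that a WithRequireLeader watch against a member that
// has been partitioned away from the cluster reports ErrNoLeader instead of
// pending forever.
func expectNoLeader(t *testing.T, watchCli *clientv3.Client) {
	wch := watchCli.Watch(clientv3.WithRequireLeader(context.Background()), "foo")
	select {
	case resp, ok := <-wch:
		if !ok {
			t.Fatal("expect open watch channel")
		}
		if err := resp.Err(); err != rpctypes.ErrNoLeader {
			t.Fatalf("expected %v, got %v", rpctypes.ErrNoLeader, err)
		}
	case <-time.After(3 * time.Second): // illustrative bound for detecting leader loss
		t.Fatal("took too long to detect leader lost")
	}
}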

@gyuho added the WIP label on Oct 25, 2017
@xiang90 (Contributor) commented on Oct 26, 2017

Should we also test the case where the leader becomes isolated?

We could.

defer watchCli.Close()

// wait for eps[follower] to be pinned
watchCli.Get(context.Background(), "foo")
Contributor: check the error returned by the Get.

Contributor: put a timeout on the context instead of using context.Background().
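
Taken together, the two comments above suggest something like the following for that Get (a drop-in sketch for the quoted snippet, assuming "context" and "time" are imported; the 5-second value is an assumption, not taken from the PR):

// wait for eps[follower] to be pinned, with a bounded wait and an error check
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
if _, err := watchCli.Get(ctx, "foo"); err != nil {
	t.Fatal(err)
}
cancel()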

t.Fatal("expect open watch channel")
}
case <-time.After(3 * time.Second): // enough time to detect leader lose
t.Fatal("took too long to receive events")
Contributor: the failure message should read "took too long to detect leader lost".

case <-time.After(3 * time.Second): // enough time to detect leader lose
t.Fatal("took too long to receive events")
}
select {
Contributor: we do not need this second select; it is testing the watch interface behavior.

@xiang90 (Contributor) commented on Oct 26, 2017

/cc @jpbetz PTAL.

This test checks whether the etcd client is smart enough to detect when the connected node has no leader, so the watch does not pend forever just to receive nothing.
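
For illustration, a rough sketch of the client-side pattern this enables (an application-style loop, not the PR's test code; the function name, key handling, and logging are illustrative, and the import paths are the 2017-era github.com/coreos/etcd ones):

import (
	"context"
	"log"

	"github.com/coreos/etcd/clientv3"
	"github.com/coreos/etcd/etcdserver/api/v3rpc/rpctypes"
)

// watchUntilNoLeader consumes events until the connected member reports that
// it has no leader. With WithRequireLeader the server cancels the watch in
// that case, so this loop terminates instead of blocking forever.
func watchUntilNoLeader(cli *clientv3.Client, key string) error {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	for resp := range cli.Watch(clientv3.WithRequireLeader(ctx), key) {
		if err := resp.Err(); err == rpctypes.ErrNoLeader {
			return err // no leader behind this endpoint; switch endpoints and retry
		} else if err != nil {
			return err
		}
		for _, ev := range resp.Events {
			log.Printf("%s %q -> %q", ev.Type, ev.Kv.Key, ev.Kv.Value)
		}
	}
	return nil
}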

@gyuho force-pushed the partition-test branch 2 times, most recently from 3f8aa94 to 4a9bb7d on October 26, 2017 00:40
@gyuho removed the WIP label on Oct 26, 2017
@gyuho added a commit to gyuho/etcd that referenced this pull request on Oct 26, 2017
@xiang90 (Contributor) commented on Oct 26, 2017

lgtm. defer to @jpbetz

@codecov-io

Codecov Report

Merging #8762 into master will decrease coverage by 0.04%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #8762      +/-   ##
==========================================
- Coverage   76.08%   76.04%   -0.05%     
==========================================
  Files         360      360              
  Lines       29629    29629              
==========================================
- Hits        22543    22530      -13     
- Misses       5516     5522       +6     
- Partials     1570     1577       +7
Impacted Files Coverage Δ
auth/simple_token.go 87.03% <0%> (-6.49%) ⬇️
proxy/grpcproxy/watch.go 91.92% <0%> (-3.11%) ⬇️
etcdserver/api/v3election/election.go 64.7% <0%> (-2.95%) ⬇️
etcdserver/api/v3rpc/watch.go 92.55% <0%> (-1.87%) ⬇️
rafthttp/msgappv2_codec.go 69.56% <0%> (-1.74%) ⬇️
rafthttp/peer.go 90.22% <0%> (-1.51%) ⬇️
pkg/adt/interval_tree.go 87.38% <0%> (-1.51%) ⬇️
clientv3/op.go 72.56% <0%> (-1.33%) ⬇️
clientv3/balancer.go 90.86% <0%> (-1.31%) ⬇️
rafthttp/stream.go 88.25% <0%> (-0.68%) ⬇️
... and 11 more

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0160cd7...e980bde.

@gyuho requested a review from jpbetz on October 27, 2017 18:12
if err = ev.Err(); err != rpctypes.ErrNoLeader {
t.Fatalf("expected %v, got %v", rpctypes.ErrNoLeader, err)
}
case <-time.After(3 * time.Second): // enough time to detect leader lost
Contributor: Our integration tests are starting to get filled with hard-coded wait times that we believe are appropriate for various cases.

To help maintain the tests and keep flakiness to a minimum, should we establish a set of constants that we reference from the integration tests instead? E.g. instead of 3 here, maybe have a constant like maxLeaderChangeWait. Once we have a reasonable set of constants, developers don't need to figure out what a good number of seconds is each time they write a test, and if we determine we need to increase the number to minimize test flakes, we can change it in one place...

Just a thought, we don't need to do this now. If this seems reasonable, maybe create an issue?

Contributor: We should. These random, in-place, non-const timeouts definitely aren't helping anything at all, and we are simply waiting for flakes.
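
A rough sketch of what such shared constants might look like (maxLeaderChangeWait is the name suggested above; the file name, package, and value are assumptions):

// integration/cluster_timeouts.go (hypothetical)
package integration

import "time"

const (
	// maxLeaderChangeWait bounds how long a test waits to observe leader
	// loss or a new election, e.g. after a network partition.
	maxLeaderChangeWait = 3 * time.Second
)

A test would then write case <-time.After(maxLeaderChangeWait): instead of an inline 3 * time.Second, and the value can be tuned in one place if it turns out to be flaky.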

@jpbetz (Contributor) left a comment

lgtm, added one comment that we might want to break out into its own issue

@gyuho (Contributor, Author) commented on Oct 27, 2017

Merging. Let's address the random sleeps in a separate issue. Thanks!

@gyuho merged commit 732c405 into etcd-io:master on Oct 27, 2017
@gyuho deleted the partition-test branch on October 27, 2017 19:22