Fix flaky tests #2808
ecd71e3 to 30e7367
Codecov Report
@@            Coverage Diff             @@
##           master    #2808      +/-   ##
==========================================
- Coverage   62.24%   62.16%   -0.08%
==========================================
  Files         139      139
  Lines       22339    22342       +3
==========================================
- Hits        13905    13889      -16
- Misses       6955     6977      +22
+ Partials     1479     1476       -3
CI came up green on the first run after I fixed the obvious issues. Running it a few more times, then we can merge this.
30e7367 to 7c619ff
Update tests are still failing. I believe this is due to the same underlying issue of goroutines not running, but it's difficult to say.
I increased some timeouts, and now I'm just hitting "rebuild" until this shows no sign of failing.
I'm on 3 consecutive test runs with no failures. I think increasing the timeouts has completed the fix.
Nope, there it goes. Failed again.
@@ -22,7 +38,7 @@ func WatchTaskCreate(t *testing.T, watch chan events.Event) *api.Task {
 			if _, ok := event.(api.EventUpdateTask); ok {
 				assert.FailNow(t, "got EventUpdateTask when expecting EventCreateTask", fmt.Sprint(event))
 			}
-		case <-time.After(2 * time.Second):
+		case <-time.After(3 * time.Second):
What's the rationale behind increasing this by 50% here and 100% below?
	}()

	<-started
	return stopped
What's the use of the returned channel? It doesn't seem to be used in any of the calls.
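The hunk above only shows the tail of the helper, so the following is a rough sketch of what a started/stopped pattern of this kind could look like. The name `startComponent` and the `fn` parameter are assumptions made for illustration; the actual helper in the PR may differ.

```go
package main

import "fmt"

// startComponent is a hypothetical helper. It runs fn in a new goroutine and
// blocks until that goroutine has actually been scheduled. The returned
// channel is closed once fn has returned.
func startComponent(fn func()) <-chan struct{} {
	started := make(chan struct{})
	stopped := make(chan struct{})

	go func() {
		close(started)       // signal that the goroutine is running
		defer close(stopped) // signal that fn has finished
		fn()
	}()

	<-started
	return stopped
}

func main() {
	stopped := startComponent(func() { fmt.Println("component running") })
	<-stopped // a caller may wait here for the component to finish
	fmt.Println("component stopped")
}
```

Under these assumptions, one possible use of the returned channel is to let a test block until the component has shut down before asserting on its final state. Callers that do not need that guarantee can ignore it.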
It is likely that a large portion of test flakiness, especially in CI, comes from the fact that swarmkit components under test are started in goroutines, but those goroutines never have an opportunity to run. This adds code ensuring those goroutines are scheduled and run, which should hopefully solve many inexplicably flaky tests.

Additionally, increased test timeouts, to hopefully cover a few more flaky cases.

Finally, removed direct use of the atomic package, in favor of less efficient but higher-level mutexes.

Signed-off-by: Drew Erny <[email protected]>
Tests now passing several runs in a row. Gonna merge this.
should we cherry pick this into the release branches?
@thaJeztah maybe, but it's no rush to do so.
…3 branch)

full diff: moby/swarmkit@4fb9e96...bbe3418

changes included:

- moby/swarmkit#2889 [19.03 backport] Fix update out of sequence and increase max recv gRPC message size for nodes and secrets

Which relates to

- moby#39531 integration-cli: fix swarm tests flakiness
- docker-archive#345 [19.03 backport] integration-cli: fix swarm tests flakiness

And includes backports of

- moby/swarmkit#2808 Fix flaky tests
- moby/swarmkit#2866 Swap gometalinter for golangci-lint
- moby/swarmkit#2869 Increase max recv gRPC message size to initialize connection broker
- related / similar to moby#38103 / docker-archive#102 cluster: set bigger grpc limit for array requests
- related / similar to moby#39306 Increase max recv gRPC message size for nodes and secrets
- fixes moby/swarmkit#2733 Error generated when messages size is too big
- moby/swarmkit#2870 Fix update out of sequence

Signed-off-by: Sebastiaan van Stijn <[email protected]>
…3 branch)

full diff: moby/swarmkit@4fb9e96...bbe3418

changes included:

- moby/swarmkit#2889 [19.03 backport] Fix update out of sequence and increase max recv gRPC message size for nodes and secrets

Which relates to

- moby/moby#39531 integration-cli: fix swarm tests flakiness
- docker-archive/engine#345 [19.03 backport] integration-cli: fix swarm tests flakiness

And includes backports of

- moby/swarmkit#2808 Fix flaky tests
- moby/swarmkit#2866 Swap gometalinter for golangci-lint
- moby/swarmkit#2869 Increase max recv gRPC message size to initialize connection broker
- related / similar to moby/moby#38103 / docker-archive/engine#102 cluster: set bigger grpc limit for array requests
- related / similar to moby/moby#39306 Increase max recv gRPC message size for nodes and secrets
- fixes moby/swarmkit#2733 Error generated when messages size is too big
- moby/swarmkit#2870 Fix update out of sequence

Signed-off-by: Sebastiaan van Stijn <[email protected]>
Upstream-commit: f7dbee3eeaa1dd218116f85b8f60361acbd5b214
Component: engine
…v18.09)

full diff: moby/swarmkit@142a737...5c86095

- moby/swarmkit#2892 [18.09 backport] Remove hardcoded IPAM config subnet value for ingress network
- backport of moby/swarmkit#2890 Remove hardcoded IPAM config subnet value for ingress network
- fixes [ENGORC-2651](https://docker.atlassian.net/browse/ENGORC-2651)
- moby/swarmkit#2836 [18.09 backport] Switch to go 1.11
- backport of moby/swarmkit#2752 Switch to go 1.11
- moby/swarmkit#2901 [18.09 backport] Bump to golang 1.12.9
- backport of moby/swarmkit#2880 Bump to golang 1.12.9
- moby/swarmkit#2900 [18.09 backport] Fix update out of sequence and increase max recv gRPC message size for nodes and secrets
- backport of moby/swarmkit#2762 Increased wait time on test utils WaitForCluster and WatchTaskCreate
- backport of moby/swarmkit#2771 Allow using Configs as CredentialSpecs
- **second commit only** (attempt to fix weirdly broken tests)
- backport of moby/swarmkit#2808 Fix flaky tests
- backport of moby/swarmkit#2866 Swap gometalinter for golangci-lint
- backport of moby/swarmkit#2869 Increase max recv gRPC message size to initialize connection broker
- related / similar to moby#38103 / docker-archive#102 cluster: set bigger grpc limit for array requests
- related / similar to moby#39306 Increase max recv gRPC message size for nodes and secrets
- fixes moby/swarmkit#2733 Error generated when messages size is too big
- backport of moby/swarmkit#2870 Fix update out of sequence

Signed-off-by: Sebastiaan van Stijn <[email protected]>
…v18.09)

full diff: moby/swarmkit@142a737...5c86095

- moby/swarmkit#2892 [18.09 backport] Remove hardcoded IPAM config subnet value for ingress network
- backport of moby/swarmkit#2890 Remove hardcoded IPAM config subnet value for ingress network
- fixes [ENGORC-2651](https://docker.atlassian.net/browse/ENGORC-2651)
- moby/swarmkit#2836 [18.09 backport] Switch to go 1.11
- backport of moby/swarmkit#2752 Switch to go 1.11
- moby/swarmkit#2901 [18.09 backport] Bump to golang 1.12.9
- backport of moby/swarmkit#2880 Bump to golang 1.12.9
- moby/swarmkit#2900 [18.09 backport] Fix update out of sequence and increase max recv gRPC message size for nodes and secrets
- backport of moby/swarmkit#2762 Increased wait time on test utils WaitForCluster and WatchTaskCreate
- backport of moby/swarmkit#2771 Allow using Configs as CredentialSpecs
- **second commit only** (attempt to fix weirdly broken tests)
- backport of moby/swarmkit#2808 Fix flaky tests
- backport of moby/swarmkit#2866 Swap gometalinter for golangci-lint
- backport of moby/swarmkit#2869 Increase max recv gRPC message size to initialize connection broker
- related / similar to moby/moby#38103 / docker-archive/engine#102 cluster: set bigger grpc limit for array requests
- related / similar to moby/moby#39306 Increase max recv gRPC message size for nodes and secrets
- fixes moby/swarmkit#2733 Error generated when messages size is too big
- backport of moby/swarmkit#2870 Fix update out of sequence

Signed-off-by: Sebastiaan van Stijn <[email protected]>
Upstream-commit: e06f07ef337ab890f211397d6b408b75a2512dc5
Component: engine