acceptance: TestRapidRestarts CI failures #29227
The failures are all timeouts. Usually the test succeeds in 5-10s. I've been able to reproduce failures locally by setting the timeout to 1m and running in a loop.
I fooled myself by setting the 1m timeout and running with But I did find the real error in the failures above:
Seems like we're doing something bad early in startup.
I've probably been staring at the sun too long, but it looks like we're definitely doing something that is not copacetic. Oh, this is apparently a bug in go1.10. Here is the problematic code path in go1.10:
And here is that same code in go1.11:
See golang/go@2fd1b52. I'm not sure what to do in the short term. I suppose we could add our own protection around
I believe this test has been flaky since its introduction almost a year ago. Cc @tschottdorf for tickling bugs in the stdlib.
In go1.10 and earlier, it was not safe to call `http.ServeMux.ServeHTTP` concurrently with `http.ServeMux.Handle`. (This is fixed in go1.11). In the interim, provide our own safeServeMux wrapper that provides proper locking.
Fixes cockroachdb#29227
Release note: None
29230: server: deflake TestRapidRestarts r=benesch a=petermattis
In go1.10 and earlier, it was not safe to call `http.ServeMux.ServeHTTP` concurrently with `http.ServeMux.Handle`. (This is fixed in go1.11). In the interim, provide our own safeServeMux wrapper that provides proper locking.
Fixes #29227
Release note: None
29236: Revert "storage: enable the merge queue by default" r=tschottdorf a=benesch
This reverts commit 98ca1d0. The merge queue will be reenabled once flaky tests are fixed.
To reviewers: I'd much prefer to merge #29235. But if that gets stuck in code review or the flakiness reaches a breaking point, feel free to merge this instead.
Co-authored-by: Peter Mattis <[email protected]>
Co-authored-by: Nikhil Benesch <[email protected]>
29259: release-2.1: server: deflake TestRapidRestarts r=benesch a=petermattis
Backport 1/1 commits from #29230.
/cc @cockroachdb/release
---
In go1.10 and earlier, it was not safe to call `http.ServeMux.ServeHTTP` concurrently with `http.ServeMux.Handle`. (This is fixed in go1.11). In the interim, provide our own safeServeMux wrapper that provides proper locking.
Fixes #29227
Release note: None
Co-authored-by: Peter Mattis <[email protected]>
`TestRapidRestarts` has failed on CI several times recently. For example:
https://teamcity.cockroachdb.com/viewLog.html?buildId=867693&buildTypeId=Cockroach_UnitTests
https://teamcity.cockroachdb.com/viewLog.html?buildId=867685&buildTypeId=Cockroach_UnitTests_Acceptance
https://teamcity.cockroachdb.com/viewLog.html?buildId=864345&buildTypeId=Cockroach_UnitTests