🏗 Retry tests before failing them on Travis #22210
Tested this in both local and Travis modes by running an always-failing test and confirming that it runs once in local mode and 3 times in Travis mode.
Mixed feelings about this... this doesn't reduce flakiness, it just hides it. Ideally, we should have some system in place to notify the build-cop when specific tests fail-retry-pass. By merging this PR, we're hiding the bad tests and risking shipping unstable code in the long run.
I agree, mostly. Right now, there is a huge cost to development because PR authors have to rerun jobs due to flaky tests that are unrelated to their code changes. This PR will reduce that cost. Meanwhile, it's true that our current build-cop workflow does little to address the root cause of flaky tests other than to disable them. We have numerous tests that rely on external services. A more concerted effort to make all unit tests hermetic will go a long way in weeding out flakiness. Let's discuss this approach offline.
Re: @danielrozenberg's comment, I think this is fine. It automates the retry mechanism that developers end up doing themselves - I can see this saving a lot of time.
Still, I'm worried we're doing a set-it-and-forget-it on something that should not be forgotten.
See this line from my previous reply.
Discussed offline, merging this for now while we search for ways to make our tests more hermetic and more isolated from one another.
During CI builds, we run ~15000 tests on Travis.
Often, a single-digit number of tests will fail due to timing issues, and the Travis job needs to be retried to get a green build. This is frustrating for developers and expensive because of the wasted resources.
In this PR, we change the default number of test retries from 0 to 2 for Travis CI runs.
This should reduce random flakiness on Travis and increase our trust in CI builds.
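For readers who want a concrete picture of the behavior, here is a minimal sketch of the retry logic, not the actual code changed in this PR. It assumes the runner detects Travis through the `TRAVIS=true` environment variable that Travis CI sets. The names `retriesFor`, `runWithRetries`, and `Outcome` are hypothetical, and the `flaky` flag is an illustrative addition that marks tests which fail, retry, and then pass, the case raised in the review discussion.

```typescript
// Hypothetical sketch; none of these names come from the project's test runner.

type Env = Record<string, string | undefined>;

// Travis CI sets TRAVIS=true in its build environment.
// Retry twice on Travis, never locally.
function retriesFor(env: Env): number {
  return env.TRAVIS === 'true' ? 2 : 0;
}

interface Outcome {
  passed: boolean;
  attempts: number;
  // True when the test failed at least once and then passed on a retry.
  flaky: boolean;
}

// Runs a synchronous test up to (1 + retries) times, stopping at the first pass.
function runWithRetries(test: () => void, retries: number): Outcome {
  let attempts = 0;
  while (attempts <= retries) {
    attempts++;
    try {
      test();
      return {passed: true, attempts, flaky: attempts > 1};
    } catch {
      // Ignore this failure and move on to the next attempt, if any remain.
    }
  }
  return {passed: false, attempts, flaky: false};
}
```

Under these assumptions, an always-failing test runs once when `TRAVIS` is unset and 3 times when it is `'true'`, which matches the manual check described in the conversation. Surfacing the `flaky` outcomes somewhere visible would be one way to keep retried tests from being silently forgotten.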
Partial fix for #14360