[chore] Re-run failed unit tests automatically #31253
cc @open-telemetry/collector-contrib-approvers
LGTM. Looking at the documentation, it looks like it will attempt two more times by default, which is reasonable.
I think it's good to rerun automatically given the state of CI. Tolerating failures is always a tradeoff but we currently have so many failures that it's difficult to separate the worst offenders from the 1/million flukes. Retrying is a great way to separate these so we can get the worst offenders under control. The question in my mind is whether we should retry twice or only once.
We should keep in mind that retrying twice means a test which fails 1% of the time has only a 1/million chance of failing a given test run. We run CI ~100 times per day, so a 1% failure rate test would show up maybe once per quarter. On the other hand, retrying only once means that we would see the failure a couple times per month, which (in a less noisy CI environment) seems often enough to notice and fix/skip/remove.
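As a rough check on the arithmetic above, here is a minimal sketch that assumes a single flaky test, independent attempts, and exactly 100 CI runs per day. The function names are hypothetical and exist only for illustration. Under those assumptions a 1% flaky test surfaces about once per day with no retries, about once per 100 days with one retry, and about once per 10,000 days with two retries, so the estimates above are best read as order-of-magnitude guidance.

```python
def failure_rate_with_retries(p_fail: float, retries: int) -> float:
    """Probability that a test fails its first attempt and every retry.

    Assumes each attempt fails independently with probability p_fail.
    """
    return p_fail ** (retries + 1)


def days_between_failures(p_fail: float, retries: int, runs_per_day: float) -> float:
    """Expected number of days between CI runs that still fail after retries."""
    failures_per_day = failure_rate_with_retries(p_fail, retries) * runs_per_day
    return 1.0 / failures_per_day


if __name__ == "__main__":
    # Assumed inputs taken from the comment: 1% flake rate, ~100 CI runs per day.
    for retries in (0, 1, 2):
        days = days_between_failures(0.01, retries, 100)
        print(f"retries={retries}: one visible failure every {days:,.0f} days")
```

The gap between one and two retries is the point of the argument: each extra retry multiplies the interval between visible failures by 1/p, which for a 1% flake is a factor of 100.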
@djaglowski Your argument makes sense to me and it's possible that re-running once could be enough. I changed the option to |
Description:
Re-runs failed unit tests automatically. Follow up to #31163
This re-runs the failed tests once if there are fewer than 10 total test failures.
This should speed up development, but it comes with the risk of missing real issues.
I think given the current situation our CI is in this is acceptable, but I assume this PR is going to be controversial :)
One improvement would be to keep this but auto-generate GitHub issues whenever a test fails and then passes on retry in main's CI.
Link to tracking Issue: Relates to #30880 (does not speed up individual tests but reduces the number of attempts to be made)
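The rerun policy described above can be sketched roughly as follows. This is an illustration of the idea only, not the implementation used by the test runner in this PR: `run_with_rerun`, `run_test`, `max_reruns`, and `max_failures` are hypothetical names. The sketch reruns failed tests only when the first pass produced fewer than `max_failures` failures, and it records tests that failed and then passed, which is the list a follow-up could use to open issues for flaky tests.

```python
from typing import Callable, Iterable, List, Tuple


def run_with_rerun(
    tests: Iterable[str],
    run_test: Callable[[str], bool],
    max_reruns: int = 1,
    max_failures: int = 10,
) -> Tuple[List[str], List[str]]:
    """Run every test once, then rerun the failures under a failure cap.

    run_test is a hypothetical callback that returns True when a test passes.
    Returns (still_failing, flaky), where flaky holds tests that failed
    at first and passed on a rerun.
    """
    failed = [name for name in tests if not run_test(name)]
    flaky: List[str] = []

    # Too many failures suggests a real breakage, so skip the reruns.
    if len(failed) >= max_failures:
        return failed, flaky

    for _ in range(max_reruns):
        if not failed:
            break
        still_failing = []
        for name in failed:
            if run_test(name):
                flaky.append(name)  # failed first, passed on retry
            else:
                still_failing.append(name)
        failed = still_failing

    return failed, flaky
```

Returning the flaky list separately keeps the build green for one-off flukes while still leaving a record of which tests needed a retry.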