*: skip some tests under race
#116080
Conversation
Force-pushed from 60af77a to 72ca206.
All of these tests have OOM or timeout issues when running under `race`.

Epic: CRDB-8308
Release note: None
Force-pushed from 72ca206 to 381c96b.
Do we know why all of these tests are OOMing under `race`? It's a bit worrisome that it's not just a single test; quite a few seem to be OOMing all of a sudden.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DarrylWong, @dt, @herkolategan, and @mgartner)
We just started reporting test failures from EngFlow for `race` builds.
Whereas before, OOMs weren't being reported as test failures?
Whereas before, the OOMs were less likely to occur, since unit test machines have ~192GB of memory and nightly stress machines have ~32GB. It's just a more constrained environment. Unlike old-style nightly stress, we manage memory on a per-test-binary basis, rather than a "normal" test run that lets memory "float" between all concurrently running tests on the same machine.
Got it, thanks.
TFTRs! bors r=rail,srosenberg
Build failed (retrying...)
Build succeeded
@rickystewart is there an issue tracking all of these skipped tests under race due to memory limits? Are we planning to eventually raise the memory limits in order to run these tests?
No, there is no tracking issue.
No. It's unclear what we would raise the limit to, given the memory overhead of running stuff under `race`.
Can we size up the hardware to match the hardware that was successfully running these tests in the previous TeamCity configuration? It will take time to reduce the overhead of these tests, and that's not work we've planned for in the near future. We'd rather not skip tests and lose coverage.
For the old iteration of nightly stress, the machines had 32GB of memory. For running a single unit test, 32GB is extremely excessive IMO. One worry I have is that we'll "hide" memory leaks, especially for tests that are not running under `race`.

I'll make the request with the vendor today, with the expectation that it will probably not be implemented until the new year.
One data point is that some of the skips may be explained by this; I'll audit all skips-under-race that we merged in the last two weeks to see which ones are, and will unskip them.
I sent #116986 to unskip a subset of recent skips. |
These tests (among others) are un-skipped in #117833. #117894 skipped more tests, but these tests were generally already skipped under