Track which tests are run in CI #95995
Comments
@rustbot label +A-testsuite +A-rustbuild
I think there have been several offhand comments but nothing super well documented. Likely #93717 or something like it would be done to prepare files, though for this we'd want to integrate into compiletest and/or libtest to get per-test results. Generally my thinking is that each test would get an identifier of some kind and then we'd just upload files to the CI bucket saying "ran these tests". After that, something like toolstate - though maybe running externally rather than on master CI in this repo - would do a diff and file issues/alert/something if there's a negative delta. Shouldn't be too hard, but a good bit of plumbing to do for sure.

A second aspect would be for the negative-diff tool to have some asserts - e.g., I would like to know that tier 2 targets are actually getting built and distributed, tier 1 tested, etc.

A third aspect would be a dry run of some kind to give us a complete list of possible tests - ideally pretty "simplistic" (like …).
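A minimal sketch of what the negative-delta check could look like, assuming each CI run uploads a plain text file with one test identifier per line (the file names and identifier format here are assumptions for illustration, not a decided format):

```rust
use std::collections::HashSet;
use std::fs;

// Hypothetical format: one identifier per line, e.g. "x86_64-unknown-linux-gnu ui/lint/unused.rs".
fn load_ids(path: &str) -> HashSet<String> {
    fs::read_to_string(path)
        .expect("could not read test list")
        .lines()
        .map(|l| l.trim().to_string())
        .filter(|l| !l.is_empty())
        .collect()
}

fn main() {
    // Compare the previous CI run's test list against the current one.
    let previous = load_ids("previous-run.txt");
    let current = load_ids("current-run.txt");

    // A "negative delta" is any test that used to run but no longer does.
    let missing: Vec<_> = previous.difference(&current).collect();
    if missing.is_empty() {
        println!("no tests dropped");
    } else {
        eprintln!("{} test(s) no longer run:", missing.len());
        for id in missing {
            eprintln!("  {id}");
        }
        std::process::exit(1);
    }
}
```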
Ooh, #93717 looks really cool :) I'll keep an eye on that. Oh wow, per-test granularity! That would be awesome if it worked :D What would we compare it to? Is the idea to make sure that all CI runners are running the same tests? More tests are being added constantly, so we couldn't use a golden reference or anything like that.
(updated comment). |
Got it, makes sense :) I am worried though about what happens if tests are renamed - git rename detection is pretty imprecise, and it's not terribly rare to rename tests (e.g. #91037). Doctests are also likely to cause trouble, although we probably don't need to know each individual code block being run, just whether the test runner itself is built / run.
Just to clarify - is this meant to be "all tests that we could possibly support"? Or "all tests that x.py currently supports"? We can get a pretty coarse idea of the latter currently with the help paths (…). An example of "possible tests not currently supported" is the codegen_xxx crates: #95518
My hope is that libtest integration (probably an environment variable or so, plus dumping JSON into a file) is enough to work for all of compiletest, rustdoc, and direct libtest usage - which would work out great for us in terms of ease of deploying it. (Those files would probably be consumed and repackaged by rustbuild to associate them with target and other metadata.)

I'm not particularly worried about renames. If we get a false positive, it just means closing the issue - no big deal for an obvious PR. If the rate is high enough we can explore how to reduce it, but I suspect this is less common than you might think.

For the third aspect, I would say best effort. The more tests we pick up, the better, of course. Note that this is intended to be very granular - e.g., each ui test, not ui tests as a concept. (And maybe with added manual matrices of "compare-mode"...?) My hope is that once the underlying infra is in place - even with the simplest of tests - it will be easy for contributors to help plug gaps, and going forward we can try to mandate that PRs plug gaps by linting against unknown tests that were run or so.
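To make the repackaging step concrete, here's a rough sketch of what rustbuild could write out per CI job, assuming the test runners dump one record per test and rustbuild wraps them with host/target/job metadata. Every field name and the overall schema below are assumptions for illustration only (requires the serde and serde_json crates):

```rust
use serde::{Deserialize, Serialize};

// Hypothetical per-test record; the real layout would be whatever libtest/compiletest emit.
#[derive(Serialize, Deserialize)]
struct TestRecord {
    /// Stable identifier for the test, e.g. a path relative to the suite root.
    name: String,
    /// Suite the test belongs to ("ui", "rustdoc", a crate's unit tests, ...).
    suite: String,
    /// Outcome as reported by the test runner.
    outcome: String,
}

// Hypothetical wrapper rustbuild would add when repackaging the raw runner output.
#[derive(Serialize, Deserialize)]
struct RunManifest {
    host: String,
    target: String,
    ci_job: String,
    tests: Vec<TestRecord>,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let manifest = RunManifest {
        host: "x86_64-unknown-linux-gnu".into(),
        target: "x86_64-unknown-linux-gnu".into(),
        ci_job: "example-job".into(),
        tests: vec![TestRecord {
            name: "ui/lint/unused.rs".into(),
            suite: "ui".into(),
            outcome: "passed".into(),
        }],
    };
    // This file is what would be uploaded to the CI bucket for later diffing.
    std::fs::write("test-run.json", serde_json::to_string_pretty(&manifest)?)?;
    Ok(())
}
```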
Maybe we could have a FUSE filesystem which checks that all files are accessed at least once and, if not, prints out a list of those that haven't? That wouldn't help with doctests, but it may help with other kinds of tests and dead files in general.
@bjorn3 FUSE sounds fun, but it seems like strictly more work since we already have pretty intimate access to compiletest and libtest. Also, I think FUSE only works on Linux, and one of the goals here is to compare tests between different OSes.
@pietroalbini mentions that since #111936 we track individual test names in build-metrics, which should make this a lot easier :) |
Currently, it's pretty easy to accidentally miss tests in CI - e.g. I recently found that compiler doctests haven't been enforced in a while: #95994. It would be great to track which tests are being run.
@Mark-Simulacrum did you have some ideas for this already in mind? How were you expecting the generated output to look / what sort of granularity did you want? I think just tracking `Step` runs should be pretty easy, but maybe not as useful as tracking cargo invocations? Having our own build system gives us some flexibility here.

Originally posted by @Mark-Simulacrum in #95994 (comment)
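As a rough illustration of the `Step`-tracking idea, here's a sketch that records the concrete type of every step executed and dumps the list at the end of a build. The `Step` trait, `run_step` hook, and recorder below are stand-ins for the sketch, not bootstrap's actual API:

```rust
use std::any::type_name;
use std::collections::BTreeSet;
use std::sync::Mutex;

// Hypothetical recorder; bootstrap's real Builder/Step machinery is more involved.
static EXECUTED_STEPS: Mutex<BTreeSet<&'static str>> = Mutex::new(BTreeSet::new());

/// Stand-in for bootstrap's `Step` trait.
trait Step {
    fn run(&self);
}

/// Record the concrete type of every step before running it.
fn run_step<S: Step>(step: S) {
    EXECUTED_STEPS.lock().unwrap().insert(type_name::<S>());
    step.run();
}

struct CompiletestUi;
impl Step for CompiletestUi {
    fn run(&self) {
        // ... invoke compiletest for the `ui` suite here ...
    }
}

fn main() {
    run_step(CompiletestUi);

    // At the end of the build, dump the list so CI could upload and diff it.
    let steps = EXECUTED_STEPS.lock().unwrap();
    for name in steps.iter() {
        println!("ran step: {name}");
    }
}
```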