build,tests: replace `go test` with a new test runner that splits output into different files #59045
Comments
cc @tbg to put this on your radar.
FWIW, @RaduBerinde reports a failure where a test (…). It's becoming clear that (…)
Could you be more specific?
But (…)
My biggest concern about the (…). On the whole, though, I think the bar for moving off (…). Mind sharing the link to Radu's build? I am noticing that this is a logic test which uses (…)
I already analyzed this: (…)
But that's the thing: we want something better. I reckon we do want test parallelism, and this could be achieved by providing a different test harness for the purpose of CI (I do not propose to replace …).
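To make the parallelism idea concrete, here is a minimal sketch (not a proposal for the actual harness) of a CI-side driver that runs `go test -json` once per Go package, in parallel, and writes each package's event stream to its own file. The `artifacts` directory, the parallelism of 4, and the file-naming scheme are illustrative choices, not anything from this thread.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
	"sync"
)

func main() {
	// Enumerate packages; each one gets its own test process below.
	out, err := exec.Command("go", "list", "./...").Output()
	if err != nil {
		panic(err)
	}
	pkgs := strings.Fields(string(out))

	if err := os.MkdirAll("artifacts", 0o755); err != nil {
		panic(err)
	}

	sem := make(chan struct{}, 4) // bound the number of concurrent test processes
	var wg sync.WaitGroup
	for _, pkg := range pkgs {
		wg.Add(1)
		go func(pkg string) {
			defer wg.Done()
			sem <- struct{}{}
			defer func() { <-sem }()

			// One machine-readable event stream per package instead of one
			// interleaved stdout for everything.
			name := strings.ReplaceAll(pkg, "/", "_") + ".json"
			f, err := os.Create(filepath.Join("artifacts", name))
			if err != nil {
				panic(err)
			}
			defer f.Close()

			cmd := exec.Command("go", "test", "-json", pkg)
			cmd.Stdout = f
			cmd.Stderr = f
			if err := cmd.Run(); err != nil {
				fmt.Printf("FAIL %s: %v\n", pkg, err)
				return
			}
			fmt.Printf("ok   %s\n", pkg)
		}(pkg)
	}
	wg.Wait()
}
```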
I took a look at that log, and indeed the output from the (…). This would typically indicate that either something in the code called (…)
I'm still not sure what exactly you're proposing here (if anything). My take-away is that our CI is janky and that this is a problem, which I agree with. Would you mind filing this into dev-inf and then referring to and closing this issue? The closest related issue on the dev-inf side that I found is https://github.com/cockroachdb/dev-inf/issues/53.
I would like to propose that (…). You've already put up so much with (…). To start with, I am expecting as much information about sub-processes run by (…). That everyone seems to think the status quo is acceptable is flabbergasting to me. Just the exit code would help distinguish between (…).
And for several of these conditions, a core file is dumped; I want the test harness to give me a link to the core file to investigate. Additionally, given that multiple processes are running side-by-side, at the very least the thing should print out the PIDs of the test sub-processes, so that if two or more encounter problems it is clear which processes the left-over files belong to (either core files or …).
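For the exit-code part specifically, the information is already available on the `os/exec` side; a harness only has to surface it. A hedged sketch (Unix-only, since it type-asserts to `syscall.WaitStatus`; the function name `reportExit` and the `./pkg.test` path are made up for illustration):

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
)

// reportExit prints a one-line classification of how a test sub-process
// ended: normal exit (and with which code), or death by signal (and whether
// a core file should exist for that PID).
func reportExit(cmd *exec.Cmd, runErr error) {
	st := cmd.ProcessState
	if st == nil {
		fmt.Printf("process never started: %v\n", runErr)
		return
	}
	ws, ok := st.Sys().(syscall.WaitStatus)
	if !ok {
		fmt.Printf("pid %d: %v\n", st.Pid(), runErr)
		return
	}
	switch {
	case ws.Exited():
		// Distinguishes an ordinary test failure (exit code 1) from e.g. a
		// stray os.Exit somewhere in the code under test.
		fmt.Printf("pid %d exited with code %d\n", st.Pid(), ws.ExitStatus())
	case ws.Signaled():
		// SIGSEGV, SIGABRT, SIGKILL (OOM killer), ... plus whether a core
		// file was dumped for this PID.
		fmt.Printf("pid %d killed by %s (core dumped: %v)\n",
			st.Pid(), ws.Signal(), ws.CoreDump())
	default:
		fmt.Printf("pid %d ended in an unexpected state\n", st.Pid())
	}
}

func main() {
	cmd := exec.Command("./pkg.test", "-test.v") // illustrative binary path
	err := cmd.Run()
	reportExit(cmd, err)
}
```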
I also want the thing to stop mixing the test runner's output with the stdout/stderr output produced by the tests themselves. The Unix streams should be redirected to separate files, one per package. The stdout of the test harness should contain exclusively reports about test progress.
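A sketch of what that separation could look like, assuming the harness builds each package's test binary with `go test -c` and then runs it directly; the `artifacts` directory, file names, and example package path are illustrative, and error handling is minimal:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// runPackage builds a package's test binary, runs it, and sends its stdout
// and stderr to per-package files. The harness's own stdout carries nothing
// but progress reports.
func runPackage(pkg, artifactsDir string) error {
	base := strings.ReplaceAll(pkg, "/", "_")
	bin := filepath.Join(artifactsDir, base+".test")

	// `go test -c` only compiles the test binary; nothing runs yet.
	if out, err := exec.Command("go", "test", "-c", "-o", bin, pkg).CombinedOutput(); err != nil {
		return fmt.Errorf("build %s: %v\n%s", pkg, err, out)
	}

	stdout, err := os.Create(filepath.Join(artifactsDir, base+".stdout.log"))
	if err != nil {
		return err
	}
	defer stdout.Close()
	stderr, err := os.Create(filepath.Join(artifactsDir, base+".stderr.log"))
	if err != nil {
		return err
	}
	defer stderr.Close()

	cmd := exec.Command(bin, "-test.v")
	cmd.Stdout = stdout // testing framework output and test prints, per package
	cmd.Stderr = stderr // stray logging, panics, etc., per package
	if err := cmd.Start(); err != nil {
		return err
	}
	fmt.Printf("=== RUN  %s (pid %d)\n", pkg, cmd.Process.Pid)
	if err := cmd.Wait(); err != nil {
		fmt.Printf("--- FAIL %s: %v\n", pkg, err)
		return err
	}
	fmt.Printf("--- PASS %s\n", pkg)
	return nil
}

func main() {
	if err := os.MkdirAll("artifacts", 0o755); err != nil {
		panic(err)
	}
	// The package path is just an example.
	if err := runPackage("./pkg/sql", "artifacts"); err != nil {
		os.Exit(1)
	}
}
```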
So one concrete proposal is to have (…)
I don't think so. I've spent tons of time on this stuff and would rather not have that toil every week. I am trying to funnel this discussion to dev-inf because that is where it belongs, both by team boundary and because of the interplay with the bazelization project. I do in general agree that we're outgrowing (…)
I don't disagree that bazel could help here. We can probably leverage bazel as a replacement for (…). Note that the issue at top was to replace (…)
I think bazel would still invoke (…)
Having bazel run `go test`, even just once per package, would not solve any of our problems here. The system must build the test executable using `go test` but then run it directly and get all its results -- all the things that `go test` unhelpfully forgets to report.
Bazel will help here, yes. You're able to define individual test targets within a package (so you could do something like defining 16 test targets for all the tests within pkg/kv/kvserver) and then have them execute in parallel. I think it does what Rafa is describing, which is building out the test binaries using (…)
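For reference, Bazel's test sharding is driven by environment variables defined in its test encyclopedia (`TEST_TOTAL_SHARDS`, `TEST_SHARD_INDEX`, `TEST_SHARD_STATUS_FILE`), and rules_go is expected to handle that protocol for `go_test` targets that set a `shard_count`. The helper below is only a hypothetical sketch of how a test could cooperate with that protocol by hand, not something the real setup requires:

```go
package testshard

import (
	"hash/fnv"
	"os"
	"strconv"
	"testing"
)

// maybeSkipForShard skips the calling test unless its name hashes to the
// shard this process was asked to run. The env var names are the ones
// Bazel's test encyclopedia defines; the helper itself is hypothetical.
func maybeSkipForShard(t *testing.T) {
	t.Helper()
	total, err := strconv.Atoi(os.Getenv("TEST_TOTAL_SHARDS"))
	if err != nil || total <= 1 {
		return // not running sharded
	}
	index, _ := strconv.Atoi(os.Getenv("TEST_SHARD_INDEX"))

	// Bazel expects a sharding-aware test to touch this file.
	if f := os.Getenv("TEST_SHARD_STATUS_FILE"); f != "" {
		_ = os.WriteFile(f, nil, 0o644)
	}

	h := fnv.New32a()
	_, _ = h.Write([]byte(t.Name()))
	if int(h.Sum32())%total != index {
		t.Skipf("assigned to another shard (this is shard %d of %d)", index, total)
	}
}
```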
Oh, sorry, I forgot about the distinction. I think it would build and invoke as we prefer. But this may happen through `go test` yet, via the `-c` and `-exec` flags. Not sure.
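Those flags do exist: `go test -c` builds the test binary without running it, and `go test -exec prog` runs the built binary through `prog`. A hedged sketch of such a wrapper, which would be a natural place to surface the PID and raw exit status discussed above (the name `testwrap` is made up; it would be invoked as, e.g., `go test -exec /path/to/testwrap ./pkg/kv/...`):

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// go test builds the test binary and hands it to this wrapper as os.Args[1],
// followed by the test binary's own arguments.
func main() {
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: testwrap <test-binary> [test args...]")
		os.Exit(2)
	}
	cmd := exec.Command(os.Args[1], os.Args[2:]...)
	cmd.Stdin = os.Stdin
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr

	if err := cmd.Start(); err != nil {
		fmt.Fprintf(os.Stderr, "testwrap: %v\n", err)
		os.Exit(1)
	}
	// The PID is what lets us match core files and left-over temp dirs to
	// this particular test process later.
	fmt.Fprintf(os.Stderr, "testwrap: running %s as pid %d\n", os.Args[1], cmd.Process.Pid)

	if err := cmd.Wait(); err != nil {
		fmt.Fprintf(os.Stderr, "testwrap: %s: %v\n", os.Args[1], err)
		os.Exit(1) // preserve the failure for go test to see
	}
}
```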
Yes, testing via Bazel is smarter than just dumbly wrapping (…). The build targets are currently constructed monolithically, one test target per Go package, since that's what Gazelle does by default, but there's nothing stopping us from overriding that behavior where appropriate. We could do this either on a case-by-case basis if certain tests are known to be particularly troublesome, or more broadly across the entire codebase if that's not possible. If the second is desired, a Gazelle feature request might be necessary to make that happen -- if that directive already exists, I can't find it in the docs. (Obviously, splitting tests up to that level of granularity will come with a perf impact, so ideally we wouldn't have to drill down that much.)

As stated, there doesn't seem to be any action to take on this issue immediately. I think as specific issues with Bazel tests and/or their integration with TeamCity come up, we can address them accordingly. Completing the Bazel migration for tests will result in this issue being closed as a matter of course, because we wouldn't use (…)
@rickystewart before closing this issue, please explain here how a bazel-powered run reports:

(…)

(We need all 3 things as discrete artifacts. If a bazel-powered run is missing them, we need to keep this issue open until all 3 get properly collected.)
#69378 should take care of the last bit here, so to review:

(…)

As we make more Bazelfied build configurations in TC, all of the above should remain true. If any of the above regresses, it should be considered a bug.
Note that I just looked at a recent bazel build to check what things look like now, and I do have a question. Take a look here: (…)

If a test failed, would that always happen while prominently displaying the shard? I'm asking because it would be tough to go through up to 16 directories on any failure of the (…).

I also noticed we're actually not retaining the server logs: we're having the test servers log to (…)
Good question. I'll take a look. We can always munge the logs e.g. to concatenate everything into a single file.
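A minimal sketch of that munging step, assuming the conventional layout where each shard writes a `test.log` somewhere under the `bazel-testlogs` tree; the root and output paths are illustrative:

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

func main() {
	out, err := os.Create("all-test-logs.txt")
	if err != nil {
		panic(err)
	}
	defer out.Close()

	root := "bazel-testlogs" // wherever CI copied the test logs
	err = filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
		if err != nil || info.IsDir() || info.Name() != "test.log" {
			return err
		}
		// The header records the package/target/shard the log came from.
		fmt.Fprintf(out, "\n===== %s =====\n", path)
		in, err := os.Open(path)
		if err != nil {
			return err
		}
		defer in.Close()
		_, err = io.Copy(out, in)
		return err
	})
	if err != nil {
		panic(err)
	}
}
```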
Yes (and again, if some logs are still missing after it's merged, please report it as a bug).
This fulfills a long-standing request to capture left-over artifacts in `TMPDIR` (see cockroachdb#59045 (comment)).

Bazel sets the `TEST_TMPDIR` environment variable for the temporary directory and expects all tests to write temporary files to that directory. In our Go tests, however, we consult the `TMPDIR` environment variable to find that directory. So we pull in a custom change to `rules_go` to copy `TEST_TMPDIR` to `TMPDIR`. Update `.bazelrc` to use `/artifacts/tmp` as the `TEST_TMPDIR`.

Closes cockroachdb#59045.
Closes cockroachdb#69372.

Release justification: Non-production code change
Release note: None
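For illustration, the resolution order that change produces boils down to something like the helper below. The function name is hypothetical; in the actual change the copying happens inside `rules_go`, so test code keeps reading `TMPDIR` / `os.TempDir()` unchanged.

```go
package testutil

import "os"

// ScratchDir returns the directory tests should use for temporary files.
func ScratchDir() string {
	// Under `bazel test`, TEST_TMPDIR points at the sandbox's temp dir
	// (configured as /artifacts/tmp in CI via .bazelrc), which gets
	// collected as a build artifact.
	if d := os.Getenv("TEST_TMPDIR"); d != "" {
		return d
	}
	// Outside Bazel, fall back to the usual location; os.TempDir already
	// honors TMPDIR on Unix.
	return os.TempDir()
}
```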
68983: backupccl: stop including historical databases in cluster backup Descs r=adityamaru a=pbardea

A previous commit attempted to fix a bug where cluster backup would not include tables in dropped databases between incremental backups. That fix aimed to find dropped databases and add them to the set of descriptors. However, this causes issues when a database is recreated with the same name. Rather than adding the dropped DBs to the Descriptors field on the backup manifest, this commit updates how DescriptorChanges are populated for cluster backups with revision history. Now, the initial scan of descriptors as of the start time will look for all descriptors in the cluster rather than just those that were resolved as of the end time of the backup.

Release note (bug fix): Fix a bug where cluster revision-history backups may have included dropped descriptors in the "current" snapshot of descriptors on the cluster.

Release justification: bug fix. Fix a bug where cluster revision-history backups may have included dropped descriptors in the "current" snapshot of descriptors on the cluster.

69378: bazel,ci: propagate `TEST_TMPDIR` down to go tests and capture artifacts r=jlinder a=rickystewart

This fulfills a long-standing request to capture left-over artifacts in `TMPDIR` (see #59045 (comment)). Bazel sets the `TEST_TMPDIR` environment variable for the temporary directory and expects all tests to write temporary files to that directory. In our Go tests, however, we consult the `TMPDIR` environment variable to find that directory. So we pull in a custom change to `rules_go` to copy `TEST_TMPDIR` to `TMPDIR`. Update `.bazelrc` to use `/artifacts/tmp` as the `TEST_TMPDIR`.

Closes #59045.
Closes #69372.

Release justification: Non-production code change
Release note: None

69612: colflow: propagate concurrency info from vectorized to FlowBase r=yuzefovich a=yuzefovich

**colflow: propagate concurrency info from vectorized to FlowBase**

We've recently merged a change to introduce concurrency in the local flows. Those new concurrent goroutines are started by the vectorized parallel unordered synchronizer, and `FlowBase` isn't aware of them; as a result, `FlowBase.Wait` currently might not wait for all goroutines to exit (which is an optimization when there are no concurrent goroutines). This commit fixes the problem by propagating the information from the vectorized flow to the FlowBase.

Addresses: #69419.
Release note: None (no stable release with this bug)
Release justification: bug fix to new functionality.

**sql: loosen up the physical planning of parallel scans**

This commit makes it so that in case we try to plan parallel TableReaders but encounter an error during planning, we fall back to having a single TableReader.

Release note: None
Release justification: bug fix to new functionality.

Co-authored-by: Paul Bardea <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Yahor Yuzefovich <[email protected]>
I did some initial investigation here; in case a test fails, Bazel does indeed tell you which log file to look in:

(…)
So in this case the failure is in shard 1. (This example is from a local run, but in CI the path to the log will start with (…).) Not saying the experience can't be improved at all, but the immediate concern isn't an issue.
The design of `go test` is defective in numerous ways which are costing a lot of time (and money) to CRL. By far the biggest two hurdles are:

(…)

Generally we need a new test engine able to run Go tests but without the jank and misplaced NIH decisions that went into the `go test` runner itself; something that runs each package in a different process with a different test controller and different output files for parallel tests.

See comment below for details.
Epic CRDB-8306