Check precise Windows GIX_TEST_IGNORE_ARCHIVES expectations on CI #1663
Conversation
This modifies the `test-fixtures-windows` job that tests on Windows with `GIX_TEST_IGNORE_ARCHIVES=1` so that, instead of checking that no more than 14 failures occur, it checks that the failing tests are exactly those documented in GitoxideLabs#1358 as expected to fail. The initial check that no tests have *error* status is preserved, with only stylistic changes, and kept separate from the subsequent logic so that the output is clearer.

The new steps are no longer conditional on `nextest` having exited with a failure status, since (a) that was probably unnecessary before and is definitely unnecessary now, (b) at least for now, the comparison is precise, so it would be strange to pass if the diff were to have changes on *all* lines, and (c) this makes it slightly less likely that GitoxideLabs#1358 will accidentally stay open even once fixed.

The current approach is to retrieve the list of tests expected to fail on Windows with `GIX_TEST_IGNORE_ARCHIVES=1` from the GitoxideLabs#1358 issue body. This has the advantage that it automatically keeps up to date with changes made to that issue description, but it is of course not the only possible approach for populating the expected value.

Two changes should be made before this is ready:

- As noted in the "FIXME" comment, the job should currently fail because the performance test reported to fail in GitoxideLabs#1358 is not being filtered out from the expected failures list. It's left in as of this commit, to verify that the job is capable of failing. (After that, the performance test should either be filtered out or removed from the list in GitoxideLabs#1358, but the former approach is currently preferable because I have not done diverse enough testing to check whether the failure on my main Windows system is due to that system being too slow rather than a performance bug.)
- The scratchwork file should be removed once no longer needed.
This is to fix the error:

    gh: To use GitHub CLI in a GitHub Actions workflow, set the GH_TOKEN environment variable. Example:
      env:
        GH_TOKEN: ${{ github.token }}
    InvalidOperation: D:\a\_temp\ba9d92b7-3a94-4f7c-b541-d19003f40e19.ps1:9
    Line |
       9 |  $expected_failures = $match_info.Matches.Groups[1].Value -split "`n" …
         |  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
         |  Cannot index into a null array.
    Error: Process completed with exit code 1.
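The root cause is that `gh` produces no output when it fails, after which the script indexes into a null match result. A minimal sketch of a guard against that failure mode (the variable name follows the error message; the rest is an assumption about the script's shape):

```powershell
# Without GH_TOKEN, `gh` exits nonzero and writes nothing to stdout; failing
# fast here gives a clearer error than "Cannot index into a null array".
$body = gh issue view 1358 --json body --jq .body | Out-String
if ($LASTEXITCODE -ne 0 -or [string]::IsNullOrWhiteSpace($body)) {
    throw 'gh failed; set GH_TOKEN (e.g. to the workflow''s github.token) in the step''s env'
}
```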
Including color in the diff of expected vs. actual failed tests makes the output easier to see (much as color helps in `nextest`). This commit also makes some stylistic changes to the command so it is easier to read.
This omits tests whose names contain `performance` as a standalone word (not as part of any larger `\w+` word, not even joined by `_`) from the set expected to fail on CI with `GIX_TEST_IGNORE_ARCHIVES=1` on Windows. Currently there is one such test listed in GitoxideLabs#1358, `gix-ref-tests::refs packed::iter::performance`.
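A minimal sketch of a filter implementing that rule (the variable name is illustrative, not necessarily what the job uses):

```powershell
# `\w` matches letters, digits, and `_`, so the lookarounds reject
# `performance` only when it stands alone: `packed::iter::performance`
# is filtered out, while a hypothetical `my_performance_test` is kept.
$expected_failures = $expected_failures |
    Where-Object { $_ -notmatch '(?<!\w)performance(?!\w)' }
```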
The relevant code is in the `test-fixtures-windows` CI job, and it works: it fails the job when there is a mismatch and lets the job succeed when the actual and expected failures agree.
Thanks a lot for making this happen!
I am very impressed by what can be done with PowerShell; it clearly is .NET in a shell :).
And indeed, as predicted, reading the text block from the issue breaks isolation too much, and is the reason I'd love to see these extracted into a file, for example, if that's OK with you.
Thanks again.
This stores the expected failures in a file, instead of retrieving them from GitoxideLabs#1358. (See discussion in GitoxideLabs#1663.)
Wonderful, thanks so much!
This modifies the CI `test-fixtures-windows` job introduced in #1657, making it so that, instead of checking that no more than the expected number of test cases fail, it checks that the exact test cases expected to fail are the ones that do fail. (The other check done before as part of this job, that no tests report errors, is still done.) The approach taken is:

To collect actual failures, parse the XML output saved by `cargo nextest` to get the package name and test name of each failing test, and save them, one per line, to a file.

Getting this information is pretty straightforward in PowerShell. When iterating through `testcase` nodes, which are filtered by the presence of a `failure` subnode, `$_.name` gives the name of the test case, including `::` qualifications within the containing package, and `$_.classname` gives the name of the containing package. Thus `"$($_.classname) $($_.name)"` identifies a test in the same way it is identified in human-readable `cargo nextest` output.

The unintuitive attribute name `classname` is because `cargo nextest` XML is in the JUnit format, which originated for Java tests where test cases are methods in classes. An alternative approach is to use `$_.ParentNode.name` instead of `$_.classname`. I've verified that this works too, but I don't know of any strong reason to prefer one over the other, so I went with the more compact `$_.classname`.
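For illustration, a minimal sketch of that parsing step, assuming the JUnit file is at a path like `target/nextest/default/junit.xml` (the path and the output filename are assumptions, not necessarily what the job uses):

```powershell
# Load nextest's JUnit XML and select test cases that have a <failure> child.
[xml]$junit = Get-Content -Raw target/nextest/default/junit.xml
$junit.SelectNodes('//testcase[failure]') |
    ForEach-Object { "$($_.classname) $($_.name)" } |
    Set-Content actual-failures.txt
```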
To collect expected failures, use `gh` (the GitHub CLI) to download the text of #1358, and parse the first `` ```text `` code block with a regex, extracting the list of failing tests and saving them, one per line, to a file.

One of the tests currently listed there, `gix-ref-tests::refs packed::iter::performance`, is a performance test that seems not to fail on CI (anymore?), though I still find it to fail locally. If it were to start failing on CI, we would want to know about it. So, before these changes, it was not included in the count. I've carried that over to the precise matching done here: tests that appear to be performance tests, due to having `performance` in their names (not as part of any longer `\w+` word), are not regarded as expected to fail.

It is not obvious that the general approach of downloading the information from an issue is the best one. Although I like this approach and I think it's not too cumbersome, this is admittedly the least elegant step. Other approaches, including hard-coding the expected failures in the workflow or another file, could be used if this approach is not wanted.
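A sketch of the extraction described above, assuming the first `` ```text `` block in the issue body lists one failing test per line (the filenames and exact regex are illustrative):

````powershell
# Download the issue body, then capture everything between the first
# ```text fence and the next closing fence; (?s) lets `.` span newlines.
$body = gh issue view 1358 --json body --jq .body | Out-String
$match = [regex]::Match($body, '(?s)```text\r?\n(.*?)```')
$match.Groups[1].Value -split "\r?\n" |
    Where-Object { $_ } |  # drop blank lines
    Set-Content expected-failures.txt
````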
To compare them, use `git diff --no-index` with various other options, printing the entire list of failures as context if they differ. The job fails if there are any differences.

Because this is neither a required check for PR auto-merge nor a dependency of one, I think failing not only if tests unexpectedly fail but also if any unexpectedly pass is unlikely to cause any significant problems, and that knowing whenever the set of failing tests changes in any way is worthwhile.
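A minimal sketch of such a comparison; the options besides `--no-index` are assumptions standing in for the "various other options" mentioned above:

```powershell
# --no-index lets git compare two arbitrary files and implies --exit-code,
# so the command exits 1 when they differ; a huge -U value prints the entire
# lists as context, and --color=always keeps color in the CI log.
git --no-pager diff --no-index --color=always -U1000 `
    expected-failures.txt actual-failures.txt
if ($LASTEXITCODE -ne 0) { exit 1 }  # fail the job on any difference
```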
I intentionally excluded performance tests late in the process, so that there would be a commit whose CI results could be inspected to verify that the check is capable of failing when the diff is nonempty (and that, in this situation, the output displays in a useful way, with effective colorization, etc.). That can be observed in this run output and compared to passing runs, such as the runs in this PR.