Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI conda: Ignore baseline test failures #36678

Merged
merged 9 commits into from
Dec 19, 2023

Conversation

mkoeppe
Copy link
Contributor

@mkoeppe mkoeppe commented Nov 8, 2023

This makes the CI conda pass by marking affected files as "known baseline failures", thus reducing the noise that this workflow makes on PRs (including #36666 (comment))

Less intrusive version of #36372.

In addition to the failure marked there, here we collect a number of more failures observed over the course of a week in https://github.com/sagemath/sage/actions/workflows/ci-conda.yml?query=is%3Acompleted

📝 Checklist

  • The title is concise, informative, and self-explanatory.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation accordingly.

⌛ Dependencies

Copy link
Contributor

@tobiasdiez tobiasdiez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are only a couple of test failures in these files. These ignore patterns are way to coarse.

(removing blocker label for that reason to not hide possible issues, and because it's not working anyway in this context)

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Nov 9, 2023

because it's not working anyway in this context

It's working as designed. Green checkmarks. Failures marked as "failed in baseline"
https://github.com/sagemath/sage/actions/runs/6804955526/job/18503541137?pr=36678#step:11:10205

@tobiasdiez
Copy link
Contributor

So if another new test in the same file fails, the conda ci is red?

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Nov 9, 2023

Perhaps you should clarify what you think the purpose of the ci-conda workflows is.
To my understanding they are being run so that the major breakage that we have seen in the past (from unmonitored package upgrades in conda-forge) does not go unnoticed.
I don't think it's really the job of this workflow to duplicate the test coverage done by build.yml

Marking these files as baseline failures is done so that on unrelated PRs, the red checkmarks on the conda workflows do not annoy developers.

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Nov 9, 2023

And obviously work is needed to actually diagnose, report, and fix these failures. The approach of #36372 ("skip_conda") hides these failures too well.

@mkoeppe mkoeppe added this to the sage-10.2 milestone Nov 9, 2023
@mkoeppe mkoeppe requested a review from kwankyu November 9, 2023 18:06
@tobiasdiez
Copy link
Contributor

Perhaps you should clarify what you think the purpose of the ci-conda workflows is. To my understanding they are being run so that the major breakage that we have seen in the past (from unmonitored package upgrades in conda-forge) does not go unnoticed. I don't think it's really the job of this workflow to duplicate the test coverage done by build.yml

They have the same purpose as the Build & Test workflow for people using the conda environment. So yes, they should duplicate the test coverage.

(In the long term, once the conda runs are a bit more stable and there are no more conda-specific issues, we can remove the Build & Test workflow).

@kwankyu
Copy link
Collaborator

kwankyu commented Nov 10, 2023

They have the same purpose as the Build & Test workflow for people using the conda environment. So yes, they should duplicate the test coverage.

B&T is mainly for checking sage library changes to help developers and reviewers. It is unlikely that some code change in the sage library works on a platform and fails on another. That is why B&T runs only on a specific decent platform and tests on all other platforms are done only for weekly release.

I do not use conda (linux neither). I regard conda as another platform. So it looks strange to me that currently 6 B&T workflows for conda platforms run for each PR. I think there is already too much duplication.

Perhaps conda is special, and separate B&T for conda is necessary for PRs (perhaps upgrading packages). Then why not just two B&T for conda (one for ubuntu and one for mac) for PRs and more tests for weekly release?

About #36372, I object to introducing conda-specific doctest tags into sage library. That is another cluttering of sage library, in addition to all doctest tags for modularization. We decided to live with modularization. But no platform-specific doctest tags, please.

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Nov 10, 2023

it looks strange to me that currently 6 B&T workflows for conda platforms run for each PR. I think there is already too much duplication.

FWIW, I approved the change to run these jobs on all PRs in #36373. Tobias has already implemented a mitigation (see #36373 (comment)) there on my request by making it "fail-fast" and setting a max-parallel 2.

But I think at least the macOS jobs need reconsideration / additional adjustment. Given that we only have 5 macOS runners, launching 3 jobs (each 1 hour) for every PR is clearly too much.

For example, at the time that 10.2.rc1 was released (over an 1 hour ago), at least 7 "ci-conda" workflows were already in the queue (https://github.com/sagemath/sage/actions/workflows/ci-conda.yml?query=is%3Aqueued), with at least 14 macOS jobs waiting for a runner. As a result, the CI-macos run of 10.2.rc1 (https://github.com/sagemath/sage/actions/runs/6827163283) as well as the cibuildwheel job for macOS (https://github.com/sagemath/sage/actions/runs/6827163257/job/18568938031) have not been able to start yet.

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Nov 11, 2023

I think at least the macOS jobs need reconsideration / additional adjustment. Given that we only have 5 macOS runners, launching 3 jobs (each 1 hour) for every PR is clearly too much.

Implemented in #36694

@tobiasdiez
Copy link
Contributor

[...] If you know the exact test that fails [...]

... then you open an Issue to report it.

There is no need for adding a record of the exact failing test to the source code.

That example was for the situation where I have 15min extra time and want to use that time to fix a conda bug. With your solution here, I already may need 2 hours to just find out which test is actually failing.

Another example: #36372 (comment) it was very easy to recognize that another PR fixed a certain failure on conda because the info was there directly in the code. With the coarse ignore rules proposed in this PR here, that would have been not that easy. (That also brings up the question who is maintaining the exclusion list, and how?)

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Dec 3, 2023

I don't think this adds anything to the discussion here. I'll explain again:

  1. Test failure is noticed
  2. You record it in a GitHub Issue. There you include the necessary information so that it does not take 2 hours to reconstruct it.
  3. One adds stuff to the exclusion list to eliminate noise in the tests. It's not for humans to read, and definitely not for humans to use 2 hours of guessing time to reconstruct the failure.
  4. The exclusion list is maintained like everything else in Sage -- by PRs, not by specific people in charge.

@kwankyu
Copy link
Collaborator

kwankyu commented Dec 3, 2023

  1. Test failure is noticed
  2. You record it in a GitHub Issue. There you include the necessary information so that it does not take 2 hours to reconstruct it.
  3. One adds stuff to the exclusion list to eliminate noise in the tests.

and items in the exclusion list are commented with the the GitHub Issue numbers...

Copy link
Collaborator

@kwankyu kwankyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@mkoeppe mkoeppe removed this from the sage-10.2 milestone Dec 3, 2023
@mkoeppe mkoeppe requested a review from tornaria December 4, 2023 04:50
@tobiasdiez
Copy link
Contributor

tobiasdiez commented Dec 8, 2023

Matthias, can you please just stop adding "positive review" labels on your PRs. Thanks!

Labeling it as "r: invalid" and "s: needs review" as a motion to close this PR. There are clear short comings with the approach of this PR as described above.

@mkoeppe
Copy link
Contributor Author

mkoeppe commented Dec 8, 2023

There are clear short comings with the approach of this PR as described above.

Nope, all your claims have been refuted.

@vbraun vbraun merged commit 0a69c2a into sagemath:develop Dec 19, 2023
33 checks passed
@mkoeppe mkoeppe added this to the sage-10.3 milestone Dec 19, 2023
tobiasdiez added a commit to tobiasdiez/sage that referenced this pull request Dec 19, 2023
@mkoeppe mkoeppe deleted the ci_conda_baseline branch December 19, 2023 16:40
@tobiasdiez tobiasdiez mentioned this pull request Dec 23, 2023
5 tasks
@mkoeppe mkoeppe added the disputed PR is waiting for community vote, see https://groups.google.com/g/sage-devel/c/IgBYUJl33SQ label Dec 23, 2023
vbraun pushed a commit to vbraun/sage that referenced this pull request Feb 19, 2024
…nd 1 Linux job

    
<!-- ^^^^^
Please provide a concise, informative and self-explanatory title.
Don't put issue numbers in there, do this in the PR body below.
For example, instead of "Fixes sagemath#1234" use "Introduce new method to
calculate 1+1"
-->
<!-- Describe your changes here in detail -->

<!-- Why is this change required? What problem does it solve? -->
as discussed in
sagemath#36678 (comment),
sagemath#36616 (comment)

On pushes to a tag, all 3 Linux and 3 macOS jobs are still run, as
demonstrated in https://github.com/mkoeppe/sage/actions/runs/6829619271

<!-- If this PR resolves an open issue, please link to it here. For
example "Fixes sagemath#12345". -->
<!-- If your change requires a documentation PR, please link it
appropriately. -->

After merging, the branch protection rule https://github.com/sagemath/sa
ge/settings/branch_protection_rules/33965567 will need to be updated. It
controls what workflows show as "Required" in the PR Checks.

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->
<!-- If your change requires a documentation PR, please link it
appropriately -->
<!-- If you're unsure about any of these, don't hesitate to ask. We're
here to help! -->
<!-- Feel free to remove irrelevant items. -->

- [x] The title is concise, informative, and self-explanatory.
- [ ] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation accordingly.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on
- sagemath#12345: short description why this is a dependency
- sagemath#34567: ...
-->

<!-- If you're unsure about any of these, don't hesitate to ask. We're
here to help! -->
    
URL: sagemath#36694
Reported by: Matthias Köppe
Reviewer(s): Kwankyu Lee, Tobias Diez
vbraun pushed a commit to vbraun/sage that referenced this pull request Feb 22, 2024
…nd 1 Linux job

    
<!-- ^^^^^
Please provide a concise, informative and self-explanatory title.
Don't put issue numbers in there, do this in the PR body below.
For example, instead of "Fixes sagemath#1234" use "Introduce new method to
calculate 1+1"
-->
<!-- Describe your changes here in detail -->

<!-- Why is this change required? What problem does it solve? -->
as discussed in
sagemath#36678 (comment),
sagemath#36616 (comment)

On pushes to a tag, all 3 Linux and 3 macOS jobs are still run, as
demonstrated in https://github.com/mkoeppe/sage/actions/runs/6829619271

<!-- If this PR resolves an open issue, please link to it here. For
example "Fixes sagemath#12345". -->
<!-- If your change requires a documentation PR, please link it
appropriately. -->

After merging, the branch protection rule https://github.com/sagemath/sa
ge/settings/branch_protection_rules/33965567 will need to be updated. It
controls what workflows show as "Required" in the PR Checks.

### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->
<!-- If your change requires a documentation PR, please link it
appropriately -->
<!-- If you're unsure about any of these, don't hesitate to ask. We're
here to help! -->
<!-- Feel free to remove irrelevant items. -->

- [x] The title is concise, informative, and self-explanatory.
- [ ] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [ ] I have created tests covering the changes.
- [ ] I have updated the documentation accordingly.

### ⌛ Dependencies

<!-- List all open PRs that this PR logically depends on
- sagemath#12345: short description why this is a dependency
- sagemath#34567: ...
-->

<!-- If you're unsure about any of these, don't hesitate to ask. We're
here to help! -->
    
URL: sagemath#36694
Reported by: Matthias Köppe
Reviewer(s): Kwankyu Lee, Tobias Diez
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c: scripts disputed PR is waiting for community vote, see https://groups.google.com/g/sage-devel/c/IgBYUJl33SQ p: critical / 2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants