CI Linux: Build the default platform in one job #36627

mkoeppe · 2023-11-02T02:31:11Z

This job combines the "default-pre" job and the first half of the "default" job (with make ptest removed). It takes about 4 hours, so it fits comfortably within the 6-hour job limit. (Test run: https://github.com/mkoeppe/sage/actions/runs/6727624213/job/18285698639)

Together with #36616, which deliberately clogs the pipeline with ~50 "standard-pre" jobs, this gives the following behavior when a release tag is pushed:

"default" is scheduled
~45 "standard-pre" jobs are scheduled (runtime 0.5 to 4 hours)
~45 "minimal-pre" jobs (with max-parallel = 20) are scheduled (3 to 4 hours)
This is calibrated (and can be recalibrated later) to keep CI jobs from PRs waiting until the Docker image built by "default" is available.

During the clogging time, developers can rebase their PRs to the latest develop.

📝 Checklist

The title is concise, informative, and self-explanatory.
The description explains in detail what this PR is about.
I have linked a relevant issue or discussion.
I have created tests covering the changes.
I have updated the documentation accordingly.

⌛ Dependencies

Depends on CI Linux: Increase max_parallel for standard-pre, decrease for standard, minimal-pre #36616 (merged here)

github-actions · 2023-11-02T03:51:30Z

Documentation preview for this PR (built with commit c1d97ca; changes) is ready! 🎉

kwankyu · 2023-11-02T08:57:04Z

I presume that the "standard-" jobs should finish as fast as possible for early checking, whereas "minimal-" and the others are not urgent. In that case, to give PR workflows more chances to get in, shouldn't we set max-parallel for "minimal-" and all other jobs to a small value, such as 10?

mkoeppe · 2023-11-02T15:36:15Z

The "minimal-pre" jobs are not urgent; in fact, before this PR, they were run after the "standard" jobs.

In this PR, I am running the "minimal-pre" earlier precisely so that PR jobs are blocked until the new Docker image is ready (~4 hours). The "standard-pre" jobs by themselves are not enough to do this: We have 60 runners, but only 45 "standard-pre" jobs (runtime 0.5 to 4 hours).
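The runner arithmetic in this comment can be sketched as follows. This is only an illustration under simplifying assumptions: it uses the figures quoted in this thread (60 runners, 45 "standard-pre" jobs, max-parallel = 20 for "minimal-pre"), treats all jobs as starting at once, and ignores both the runner taken by "default" and the fact that "standard-pre" jobs finish at different times. The function and variable names are hypothetical and do not come from the workflow file.

```python
# Figures quoted in the discussion; all names here are illustrative only.
TOTAL_RUNNERS = 60
STANDARD_PRE_JOBS = 45
MINIMAL_PRE_MAX_PARALLEL = 20  # max-parallel value from the PR description


def runners_left_for_prs(total, standard_pre, minimal_pre_cap):
    """Return how many runners stay idle once both job groups have started.

    Assumes every job starts at the same moment and none has finished yet.
    """
    free_after_standard = max(total - standard_pre, 0)
    taken_by_minimal = min(free_after_standard, minimal_pre_cap)
    return free_after_standard - taken_by_minimal


# "standard-pre" alone (cap of 0 for "minimal-pre") leaves runners idle.
without_minimal = runners_left_for_prs(TOTAL_RUNNERS, STANDARD_PRE_JOBS, 0)
# Starting "minimal-pre" at the same time fills the remaining runners.
with_minimal = runners_left_for_prs(
    TOTAL_RUNNERS, STANDARD_PRE_JOBS, MINIMAL_PRE_MAX_PARALLEL
)
print(without_minimal, with_minimal)
```

Under these assumptions, "standard-pre" alone leaves 15 runners open to PR jobs, and the "minimal-pre" jobs occupy all of them.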

kwankyu · 2023-11-02T22:59:49Z

The "minimal-pre" jobs are not urgent; in fact, before this PR, they were run after the "standard" jobs.

In this PR, I am running the "minimal-pre" earlier precisely so that PR jobs are blocked until the new Docker image is ready (~4 hours). The "standard-pre" jobs by themselves are not enough to do this: We have 60 runners, but only 45 "standard-pre" jobs (runtime 0.5 to 4 hours).

Ah, I missed that you removed needs: [standard]. So you fill 60 - 45 = 15 runners with "minimal-pre" jobs. OK.

What I suggested seems independent from what you are doing here and Tobias' PR may be used later for that, if this PR is not sufficient to solve the problem we are tackling.

OK.

kwankyu

LGTM.

mkoeppe · 2023-11-02T23:05:56Z

What I suggested seems independent from what you are doing here and Tobias' PR may be used later for that, if this PR is not sufficient to solve the problem we are tackling.

Yes, there are more knobs that we can adjust. We'll observe and iterate.

mkoeppe · 2023-11-02T23:06:11Z

Thanks for reviewing!

tobiasdiez · 2023-11-03T13:33:18Z

Combining the default jobs is a nice improvement.

Thinking more about this, I still don't get how the "we clog everything" strategy is supposed to work. You are blocking 5 * 60 = 300 hours of CPU time from jobs outside ci-linux. In return, you get a reduction of 4-5 hours of CPU time per PR (1-1.5 h x 3 jobs). So to recover your investment of CPU time, you would need 60 PRs that get updated in the first 5 hours. Such a high number is very unrealistic.

What you are optimizing is the overall CPU time spent, where you get a reduction of about 5 hours times the number of PRs updated in the first 5 hours. But what should be optimized (at least in my opinion) is the average time until a build & test workflow reports its results. I am not happy that the build & test workflow for my PR takes only 4 hours of execution time instead of 5 if I have to wait 20 hours until it starts.
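The break-even estimate in this comment can be reproduced with a few lines of arithmetic. This is a sketch that uses only the numbers stated in the comment (60 runners blocked for 5 hours, 4 to 5 hours of CPU time saved per PR); the names are hypothetical, and the model assumes every updated PR realizes the full saving.

```python
# Numbers taken from the comment; names are illustrative only.
RUNNERS = 60
BLOCKED_HOURS = 5

# CPU time withheld from jobs outside ci-linux during the clogging window.
blocked_cpu_hours = RUNNERS * BLOCKED_HOURS


def break_even_prs(blocked, saving_per_pr):
    """Number of updated PRs needed to recover the blocked CPU hours."""
    return blocked / saving_per_pr


# The comment quotes a saving of 4 to 5 CPU hours per PR.
for saving in (4, 5):
    print(f"saving {saving} h/PR -> {break_even_prs(blocked_cpu_hours, saving):.0f} PRs")
```

With the upper saving of 5 hours per PR, the result matches the 60 PRs mentioned in the comment; with 4 hours per PR the break-even point rises to 75.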

mkoeppe · 2023-11-05T17:50:09Z

Observations about how it is playing out in the 10.2.rc0 run: #36616 (comment)

Matthias Koeppe added 4 commits October 31, 2023 11:16

.github/workflows/ci-linux.yml (standard-pre): Increase max_parallel …

ada9c34

…to 50

.github/workflows/ci-linux.yml: Reduce 'standard', 'minimal-pre' to 2…

35910e2

…5 parallel jobs

.github/workflows/ci-linux.yml (default): Build in one job, do not test

9024197

.github/workflows/ci-linux.yml: Unleash minimal-pre jobs for more con…

af12b88

…structive clogging

mkoeppe self-assigned this Nov 2, 2023

mkoeppe added the c: scripts label Nov 2, 2023

Merge branch 'ci_linux_standard_pre_more_parallel' into ci_linux_defa…

c1d97ca

…ult_one_job

mkoeppe added the s: needs review label Nov 2, 2023

mkoeppe requested a review from kwankyu November 2, 2023 06:33

kwankyu approved these changes Nov 2, 2023

kwankyu added s: positive review and removed s: needs review labels Nov 2, 2023

vbraun merged commit 7354bdb into sagemath:develop Nov 5, 2023
20 of 21 checks passed

github-actions bot removed the s: positive review label Nov 5, 2023

mkoeppe added this to the sage-10.2 milestone Nov 5, 2023

mkoeppe deleted the ci_linux_default_one_job branch November 5, 2023 17:47
