Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add simplified critical path scheduler to improve build times #2177

Merged
merged 22 commits into from
Feb 29, 2024

Conversation

peterbell10
Copy link
Contributor

This is a simplified alternative to #2019.

It still adds a critical path scheduler based on weighted edges in the build graph, but instead of using historical runtime it simply assigns a priority of 1 to everything except phony edges. In doing so, it prioritizes jobs by their depth in the build graph. This is better than random scheduling because jobs with dependents will unlock more work to be done when they are completed, thus we get fewer instances of starving the command runner.

For example, if your build has some code-generated sources then the code generator is prioritized above compiling non-generated sources and so all sources files will be unblocked and able to compile in parallel.

Compared to weighting by historic runtime, this will be worse in cases where build times are limited by a single long-running compile job at a low depth and with no dependencies. However, it does avoid the problem of systematically launching the resource-intensive jobs all at once.

nico and others added 22 commits August 25, 2021 11:48
The existing algorithm doesn't work because it strictly requires that
all outputs are visited before updating an edge. So any task downstream
from a task with multiple out-edges may get ignored.

The fix is to always propagate your critical time to the next input
node, and only place it in the queue if you offer a higher critical
time.
1. Move EdgePriorityQueue to graph.h and inherit from priority_queue
2. Add comment about edge->critical_time()
AddTarget cannot add edges to the ready queue before the critical time
has been computed.
@Meinersbur
Copy link

I would have preferred the approach from #2019. The number of edges is a quite inaccurate estimate for the build time of the critical path. Also, I feel the mentioned problem1 is an orthogonal problem that should be solved differently, e.g. with pools. Both are symptoms that heavily depend on the particular project and do not necessarily generalize to other projects.

Have you considered only prioritizing the critical path (as if with infinite parallelism), and schedule other edges in parallel to it according to other priorities, such as minimizing peak resource usage2?

Footnotes

  1. https://github.com/ninja-build/ninja/pull/2019#issuecomment-1076502406

  2. I think ninja currently does not measure resource usage, but a simple peak rss could recorded?

@kaspar030
Copy link

The number of edges is a quite inaccurate estimate for the build time of the critical path.

The historic build time is inaccurate, too, if e.g., ccache is used.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants