Add simplified critical path scheduler to improve build times #2177
Conversation
The existing algorithm doesn't work because it strictly requires that all outputs are visited before updating an edge, so any task downstream from a task with multiple out-edges may get ignored. The fix is to always propagate your critical time to the next input node, and re-queue that node only if the offered critical time is higher than its current one.
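The propagation described above can be sketched as follows. This is a minimal illustration, not ninja's actual code: the `Edge` struct, `run_time_estimate`, and `ComputeCriticalTimes` are hypothetical stand-ins, and `inputs` here points directly at producer edges rather than going through ninja's `Node` type.

```cpp
#include <cstdint>
#include <queue>
#include <vector>

// Hypothetical minimal model of the fix: each edge carries a critical
// time; we propagate critical times backwards from the requested
// outputs, re-queueing a producer edge only when a consumer offers it
// a higher critical time than it already has.
struct Edge {
  int64_t run_time_estimate = 1;  // 1 for normal edges, 0 for phony
  int64_t critical_time = -1;     // -1 means "not yet computed"
  std::vector<Edge*> inputs;      // producer edges of this edge's inputs
};

void ComputeCriticalTimes(std::vector<Edge*>& requested_outputs) {
  std::queue<Edge*> work;
  for (Edge* e : requested_outputs) {
    e->critical_time = e->run_time_estimate;
    work.push(e);
  }
  while (!work.empty()) {
    Edge* e = work.front();
    work.pop();
    for (Edge* in : e->inputs) {
      // Offer our critical time to the producer. Only update and
      // re-queue it if the offer raises its critical time; this is
      // what lets tasks downstream of multi-output edges be revisited.
      int64_t offered = e->critical_time + in->run_time_estimate;
      if (offered > in->critical_time) {
        in->critical_time = offered;
        work.push(in);
      }
    }
  }
}
```

On a diamond-shaped graph (one generator feeding two compiles feeding one link), this assigns the generator the highest critical time, so it is scheduled first.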
1. Move `EdgePriorityQueue` to graph.h and inherit from `priority_queue`
2. Add comment about `edge->critical_time()`
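A queue like the one named in the commit message might look like this. This is a sketch under assumptions, not the PR's actual code: the `Edge` type here is a stand-in with only a `critical_time` field, and the `Clear()` helper is illustrative.

```cpp
#include <cstdint>
#include <queue>
#include <vector>

// Stand-in edge type; ninja's real Edge lives in graph.h.
struct Edge {
  int64_t critical_time = 0;
};

struct EdgePriorityLess {
  bool operator()(const Edge* a, const Edge* b) const {
    // std::priority_queue pops the *largest* element first, so
    // comparing "less by critical time" yields a max-heap: the edge
    // with the highest critical time is scheduled first.
    return a->critical_time < b->critical_time;
  }
};

// Inheriting from std::priority_queue gives access to the protected
// underlying container `c`, which allows extras such as Clear().
struct EdgePriorityQueue
    : std::priority_queue<Edge*, std::vector<Edge*>, EdgePriorityLess> {
  void Clear() { c.clear(); }
};
```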
AddTarget cannot add edges to the ready queue before the critical time has been computed.
I would have preferred the approach from #2019. The number of edges is a quite inaccurate estimate of the build time of the critical path. Also, I feel the mentioned problem¹ is orthogonal and should be solved differently, e.g. with pools. Both are symptoms that depend heavily on the particular project and do not necessarily generalize to other projects. Have you considered prioritizing only the critical path (as if with infinite parallelism), and scheduling other edges in parallel to it according to other priorities, such as minimizing peak resource usage²?

Footnotes:
1. The historic build time is inaccurate, too, if e.g. ccache is used.
This is a simplified alternative to #2019.
It still adds a critical path scheduler based on weighted edges in the build graph, but instead of using historical runtime it simply assigns a priority of 1 to everything except phony edges. In doing so, it prioritizes jobs by their depth in the build graph. This is better than random scheduling because jobs with dependents unlock more work when they complete, so the command runner is starved less often.
For example, if your build has some code-generated sources, then the code generator is prioritized above compiling non-generated sources, so all source files are unblocked and able to compile in parallel.
Compared to weighting by historic runtime, this will be worse in cases where build times are limited by a single long-running compile job at a low depth and with no dependencies. However, it does avoid the problem of systematically launching the resource-intensive jobs all at once.