Better scheduler for parallelized linker #2029

Open
rui314 opened this issue Sep 22, 2021 · 10 comments

@rui314

rui314 commented Sep 22, 2021

Parallelized linkers are gaining popularity these days. lld, which is multi-threaded, is very popular for large-scale programs. Another linker, mold (https://github.com/rui314/mold), is more parallelized than lld. (Disclaimer: I'm the original author of both linkers.)

The problem we observe with ninja + a parallelized linker is that ninja spawns more subprocesses than necessary. In theory, spawning more subprocesses cannot make a build faster once CPU usage is saturated. In fact, doing so is likely to slow the build down due to higher memory pressure.

I believe ninja could do better by taking multi-threaded linkers into account in its scheduling decisions. Currently, I guess ninja assumes all subprocesses are single-threaded. Is there any effort to improve ninja in this area?
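
To make the oversubscription concrete, here is a rough back-of-the-envelope sketch. The function name and the numbers (a 16-job limit, a linker using 16 threads) are assumptions for illustration only, not measurements from ninja, lld, or mold.

```python
def worst_case_threads(jobs: int, linker_threads: int, concurrent_links: int) -> int:
    """Upper bound on runnable threads when `concurrent_links` of the `jobs`
    slots hold a multi-threaded linker and the rest hold single-threaded
    compilers. Purely illustrative arithmetic."""
    if not 0 <= concurrent_links <= jobs:
        raise ValueError("concurrent_links must be between 0 and jobs")
    single_threaded = jobs - concurrent_links
    return single_threaded + concurrent_links * linker_threads


# Hypothetical setup: -j 16, each link step using 16 threads, 4 links at once.
print(worst_case_threads(jobs=16, linker_threads=16, concurrent_links=4))  # 76
```

Under these assumed numbers, a scheduler that counts every subprocess as one CPU would believe it is using 16 cores while up to 76 threads compete for them.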

rui314/mold#117

@ukai

ukai commented Sep 22, 2021

@rui314

rui314 commented Sep 23, 2021

Pools might work, but to use them, it looks like users have to manually edit an auto-generated build.ninja file. I would like ninja to detect a parallelized linker and adjust its scheduling decisions accordingly.

@ilyapopov

See also #991

@hadrielk

hadrielk commented Jun 2, 2022

@rui314 I think many people don't write build.ninja by hand, but use CMake to generate it for them. For such cases, a solution does exist: JOB_POOL_COMPILE and JOB_POOL_LINK.

Basically it lets them limit how many simultaneous compiler and/or linker jobs there are.

For example, this will limit the number of compilation jobs to 64 and linker jobs to 4, globally for the project:

set_property(GLOBAL PROPERTY JOB_POOLS "comp_jobs=64" "link_jobs=4")
set(CMAKE_JOB_POOL_COMPILE comp_jobs)
set(CMAKE_JOB_POOL_LINK link_jobs)

One can also set the pools per target.

(Note that this is still limited further by the -j <N> jobs argument given to Ninja, or by its auto-detected jobs value if the argument is not given.)

@ilyapopov

ilyapopov commented Jun 2, 2022

@hadrielk

Unfortunately, that does not solve the problem. The setup you suggest would allow up to 64 compile jobs and up to 4 link jobs at the same time (but no more than -j N jobs in total, counting a link job as one job and ignoring the fact that each link job occupies many CPUs). What is requested is a way to specify that we want up to 64 compile jobs or 4 link jobs. There is currently no way to specify that.
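
One way to picture the "64 compile jobs or 4 link jobs" request is weighted admission: every job declares how many CPU slots it occupies, and the scheduler caps the total weight rather than the job count. The sketch below is only an illustration of that idea; `Job`, `admit`, and the weights 1 and 16 are hypothetical and do not correspond to anything in Ninja today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    weight: int  # CPU slots the job is assumed to occupy (hypothetical)


def admit(ready, running, capacity):
    """Return the jobs from `ready`, in order, that can start without the
    total weight of running plus started jobs exceeding `capacity`."""
    used = sum(job.weight for job in running)
    started = []
    for job in ready:
        if used + job.weight <= capacity:
            started.append(job)
            used += job.weight
    return started


CAPACITY = 64  # assumed number of CPU slots
compiles = [Job(f"cc{i}", 1) for i in range(100)]  # single-threaded compiles
links = [Job(f"ld{i}", 16) for i in range(10)]     # links assumed to use 16 slots

print(len(admit(compiles, [], CAPACITY)))         # 64 compiles on an idle machine
print(len(admit(links, [], CAPACITY)))            # 4 links on an idle machine
print(len(admit(compiles, links[:2], CAPACITY)))  # 32 compiles next to 2 links
```

With separate pools, the last case would still allow 64 compiles alongside the 2 links; with a shared weighted budget, the two kinds of job trade off against each other.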

@hadrielk

hadrielk commented Jun 2, 2022

@ilyapopov yeah, you're not wrong.

But in practice I think it happens to work out that way anyway, or at least it does at my day job. That is because Ninja spawns the compiler rules first, building all or most of the .o files at the beginning, and then has a long tail in the back half of the overall build, linking stuff.

Or at least that's how it appears to us monitoring it at my day job. But that may be very specific to our setup. (~1,400 targets being built, mostly unity files, of both .so libs and execs, using distcc, ~3 hour build time)

We've actually been looking into how to smooth out the workload, to force a more even/balanced scheduling algorithm.

@hadrielk

hadrielk commented Jun 2, 2022

BTW, while on this topic... if you look through the various open issues for Ninja, you'll find a bunch asking for different knobs for controlling what gets built, when.

I think part of the reason for that is that one of the most important and interesting aspects of build systems is not how fast and how well they can load and build a DAG and determine what changed and needs rebuilding, but rather the scheduling decisions they make once they have that information.

Or at least it is for us at my employer, where a full build only takes seconds to build the DAG and determine changes, but takes hours to actually build, and better scheduling can make a drastic difference.

The problem is I don't think there's one "best" algorithm. There are too many different resource constraints and needs for different users.

So we're thinking of forking Ninja and adding python plugin support, so that Ninja can let a loaded python callback decide the scheduling once the set of edges is chosen (...or really, within CommandRunner::CanRunMore() and Plan::FindWork(), so for each iteration of the Builder's while-loop).

That way we could prototype different scheduling ideas. Invoking python for this step would probably be fast enough to even just use permanently, rather than just for prototyping.
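
As a rough idea of what such a callback might look like, here is a pure-Python sketch. Everything in it is hypothetical: Ninja has no Python plugin API, and `Edge`, `Policy`, `fill`, and the duration estimates are made up for illustration. It only shows how different policies could be swapped behind one callback signature.

```python
from typing import Callable, List, NamedTuple, Optional


class Edge(NamedTuple):
    name: str
    kind: str        # "compile" or "link" (illustrative labels)
    est_secs: float  # assumed duration estimate, e.g. from a previous build


# A policy sees the ready and running edges and picks one to start, or None.
Policy = Callable[[List[Edge], List[Edge]], Optional[Edge]]


def fifo(ready, running):
    return ready[0] if ready else None


def longest_first(ready, running):
    return max(ready, key=lambda e: e.est_secs) if ready else None


def limit_links(max_links: int, inner: Policy) -> Policy:
    """Wrap a policy so that at most `max_links` link edges run at once."""
    def policy(ready, running):
        links_running = sum(1 for e in running if e.kind == "link")
        if links_running >= max_links:
            ready = [e for e in ready if e.kind != "link"]
        return inner(ready, running)
    return policy


def fill(ready, running, slots, policy):
    """Start edges chosen by `policy` until `slots` are full or the policy
    declines. Returns the names of the edges started, in order."""
    ready, running = list(ready), list(running)
    started = []
    while len(running) < slots:
        edge = policy(ready, running)
        if edge is None:
            break
        ready.remove(edge)
        running.append(edge)
        started.append(edge.name)
    return started


ready = [
    Edge("a.o", "compile", 5),
    Edge("libx.so", "link", 60),
    Edge("app", "link", 90),
    Edge("b.o", "compile", 8),
]
print(fill(ready, [], 3, fifo))                           # ['a.o', 'libx.so', 'app']
print(fill(ready, [], 3, limit_links(1, longest_first)))  # ['app', 'b.o', 'a.o']
```

In a real fork, `fill` would correspond loosely to one iteration of the builder loop, with the policy consulted wherever the scheduler currently decides whether it can run more work.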

Has anyone already done this type of thing?

@ilyapopov

I have not seen anyone adding scripting for scheduling, but there is a PR #2019 to add critical path scheduling.
