Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ninja: default pool depth for link.exe #13937

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

lb90
Copy link
Contributor

@lb90 lb90 commented Nov 21, 2024

link.exe doesn't quite benefit from parallel invocations. It's internally threaded and consumes LOTS of memory, so it makes sense to limit the link stage concurrency with a ninja pool. As always, the value can be overriden by the user with max_backend_link

Before that, I could easily run out of memory on a laptop with 8 logical CPUs and 16GB of RAM. The linking stage was also very slow due to frequent swapping

See also:

@lb90 lb90 requested a review from jpakkane as a code owner November 21, 2024 13:51
@lb90 lb90 force-pushed the default-pool-link-exe branch 2 times, most recently from 6f50df3 to f1c4984 Compare November 21, 2024 15:35
@eli-schwartz
Copy link
Member

Fixes #13573

It certainly does not! You propose to reset the existing default value of the one pool meson currently exposes. That ticket was asking for user-defined pools.

@eli-schwartz
Copy link
Member

Your quotes from the ninja issue tracker don't support the position you are taking here. One of them is about ninja's default -j value being processors + 2 resulting in too much parallelism for the compile stage of heavy C++ sources, and the other ticket is by the author of mold, who strenuously objects to the use of link pools in ninja "because it just results in mold dying of memory issues when running in parallel with compiling C++ code".

Your Microsoft links are very sparse on information, one of them is a problem report about CL.exe, not LINK.exe, and the other is an article about how CL.exe (again) can/should be made to operate in batched mode where one CL.exe compiles multiple files (or possibly a server mode where running CL.exe submits jobs to a daemon).

Copy link
Member

@eli-schwartz eli-schwartz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole approach is misguided, see how GCC's -flto=jobserver mode works for how to correctly do link-time multithreading that actually has the intended effect.

Linkers that currently perform fully uncontrolled resource exhaustion because they hardcode the assumption that they are the only process on the system that runs at the same time (and cannot even fathom the notion of another process performing heavy C++ compilation in the same build !!!) need to simply give up on their notion of automatically spawning $(nproc) threads, and collaborate via the jobserver instead.

Meantime, you can <highly opinionated comment>use ld.bfd<highly opinionated comment/>.

mesonbuild/backend/ninjabackend.py Outdated Show resolved Hide resolved
mesonbuild/backend/ninjabackend.py Outdated Show resolved Hide resolved
@eli-schwartz
Copy link
Member

As an aside, mold run by hand, with no other processes running in the background or in another terminal, will exhaust the open file descriptors count. This ticket has been open for over 2 years and the official response is:

I don't think we can fix it on our side, as I believe LTO uses that many files.

Despite frequent requests to add a very obvious solution, and despite the fact that other linkers handle this just fine, the mold author appears to be very worried that the problem may be unsolvable. At this point it's just become a betting game how long the ticket will stay open.

@lb90
Copy link
Contributor Author

lb90 commented Nov 21, 2024

Hi @eli-schwartz, thanks for the feedback! :)

Yeah, I linked some issues that provide relevant informations, though are not specifically about link.exe. That said, the author of mold suggests that modern linkers do not benefit from parallel invocations, and link.exe is one of those. MS link uses memory in the order of gigabytes even without LTO, while cl.exe is generally in the order of a few megabytes (about 20-30 MB when compiling GTK - based on C). I experience OOM errors and severe slowdowns while building GStreamer

The whole approach is misguided, see how GCC's -flto=jobserver mode works for how to correctly do link-time multithreading that actually has the intended effect.

indeed having link.exe in "server mode" would be ideal, but I dont' know if such interface is supported

@eli-schwartz
Copy link
Member

eli-schwartz commented Nov 21, 2024

indeed having link.exe in "server mode" would be ideal, but I dont' know if such interface is supported

To be clear, link.exe wouldn't run in server mode, and gcc doesn't run in server mode either.

-flto=jobserver means that when you link with gcc -flto=jobserver foo.o bar.o -o myprog, it checks to see if a parent Make process exists, then asks GNU Make how many available -j slots are available (presuming that when you run make -j8 it's frequently the case that e.g. two other compile jobs are running, for a total of three jobs -- that means that LTO is free to use five parallel LTO processes).

link.exe would need to be a client. It would also need to implement the desired functionality, likely by having Microsoft's corporate office decide they care about this.

You can read about it at https://www.gnu.org/software/make/manual/html_node/Job-Slots.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants