-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wrap libraries in linker groups, allowing backwards/circular references #85805
Conversation
r? @jackh726 (rust-highfive has picked a reviewer for you, use r? to override) |
This comment has been minimized.
This comment has been minimized.
b8ca6c6
to
9508726
Compare
@bors try |
⌛ Trying commit 95087264ce73f6ba1e177dbfde29b3099cc986f2 with merge e1d60f651d7c6ca5c639dc1482eab9ca708d4162... |
☀️ Try build successful - checks-actions |
I definitely don't know enough here to review. @joshtriplett do you know anyone better? |
I would expect the motivation of a PR like this to just generally make things easier than they are today. Are there other motivations though? Originally I tried to avoid the
I never quantified that performance cost myself nor did I ever dig around to figure out what that clause was added in the first place. My best guess is that it requires that more symbols are kept in memory, but that didn't seem like a reason to put such a scary warning in the man page... That all being said if there isn't actually really much of a performance issue from this, I think that this would improve consistency across platforms a little. MSVC I believe behaves as-if A lot of the linker args and stuff in rustc is generally set up to specify things in a topologically-sorted fashion. There's a ton of stuff in various places that tries to be extremely careful about the order of linked libraries. I think all of this though is in the category of "this would generate an error otherwise", not something like "if this is in the other order bad things happen". Mostly this is just me realizing that there's probably a lot of tangential stuff that could be simplified after landing a change like this (probably no need to do it inline here of course). Anyway, overall I think this would be a nice thing to land. It's just one less thing to worry about with the linker which AFAIK has little cost. I do think it would be worthwhile to figure out that performance claim though in the man page and see if it's something that big projects will eventually run into or whether it's something Rust doesn't run into. Once we land this I doubt there's really any turning back. The best we could do in the future would be to add an option to go back to the old behavior, but that'd be a bummer to maintain that... |
@alexcrichton wrote:
Yes, I did have a specific motivation beyond just making it much easier to handle static libraries. In the course of handling aarch64 outline-atomics and adding the symbols to compiler-builtins for that, it turns out that that creates a circular dependency between compiler-builtins and libc. So, this is actually needed on aarch64 now, and rather than do a special-case fix I wanted to address the general problem. Apart from that, I also think this will help us along the path towards standardizing the handling of dynamic vs static linking of native libraries. Right now, numerous Rust crates handle that via feature flags; I'd love to standardize a general "please link your native libraries statically" flag for native crates to give the top-level crate more control about the linking of native libraries in dependencies. I think that will be much easier if those crates don't suddenly have to worry about library ordering when they didn't for dynamic linking.
LLVM's lld behaves this way as well; it doesn't need these options either.
Discussion in #76992 included some performance analysis that could not detect any appreciable performance hit at all. |
I would prefer not to give this guarantee now, but stabilize linking modifiers first (#81490), then migrate target specs / libc / libunwind to Even with groups, linker argument order still remains significant in presence of duplicate / conflicting symbols (see e.g. #84254). |
As a precedent, CMake does a pretty similar job of building a dependency tree of native libraries and turning it into a linker invocation, and it doesn't add grouping options implicitly, but rather gives users ability to add them manually. |
@petrochenkov So, I do want to see link modifiers, but that seems orthogonal to this. And I agree that argument order still matters if you have duplicate symbols, and I added comments to that effect here. But I think it's still reasonable to make order matter less, so that people who have the more unusual case of duplicate symbols have to deal with it, but people who just have ordinary dependencies don't. That would affect the degree to which we need to add facilities to enable such ordering. This is a problem we'd have to solve in order to deduplicate libraries, but I also think this will help us avoid cases where people would otherwise need to duplicate libraries (since circular dependencies wouldn't require that anymore). To be explicit: I don't want to rip out the code that carefully maintains topological ordering, and I think any PR proposing to do so would need to address all the potential compatibility and future-compatibility concerns. In this PR, I've carefully preserved the topological ordering logic, and added a comment that should make it clear that it shouldn't be removed lightly.
To the best of my knowledge, all the linkers we currently support either have these options (ld) or don't need them because they behave that way by default (lld, MSVC). I'm not aware of any current linker we support that errors out on dependencies from a later library to an earlier one and doesn't support this option. |
Non- |
One additional thing that I don't like is that For example, in crate |
The linkers typically used with wasm don't need this. My understanding is that Apple's ld (and zld) don't care either. I also found explicit documentation on Solaris's linker saying that it resolves symbol references that go backwards in order, as well. |
One additional note: there's one more case where link order matters, namely the accumulation of symbols into sections. For instance, the order of |
Oh apparently I wrote the same comment as above some time ago as well... Well in any case those are my thoughts on this feature and I'd personally be ok with it so long as someone else is willing to back up the claims of peformance. (the testing in the thread doesn't seem to be super rigorous about big projects and performance there, I'd expect a performance difference for something like librustc.so or libllvm.so if it actually mattered) |
@alexcrichton I'll do some additional performance testing and follow up. |
I built I used hyperfine to benchmark the resulting invocation:
I then deleted the
Then I wrapped the entire set of objects and libraries in
I also tried wrapping just std and below, plus the native libraries, in a group:
So, this does have a measurable impact with ld. Not an earth-shattering one, but a measurable one. I still think this is worthwhile, though. But I can also try narrowing the scope of the group to see if I can reduce the impact. |
Can you take a look at the impact of a larger project like rustc as well? (rustc's large dynamic library that is, not the executable which should be very small). Ideally you'd also statically link LLVM so there's a lot of C++ code in the mix as well. |
The easiest way to do that might be a rust-timer invocation: @rust-timer queue |
Awaiting bors try build completion. @rustbot label: +S-waiting-on-perf |
If this turns up an excessive performance impact, I'm going to try narrowing the scope of the group. It's not necessarily required to allow backreferences from native libraries to Rust libraries. |
Also, I confirmed that wrapping only std and below and the native libraries into a group has less impact:
|
We'll revisit this PR this Fall, the team agrees with this comment, see Zulip discussion |
@apiraino Sounds good to me; no rush. I'm happy to update this to apply again at that time. |
When we do revisit, perhaps it should be as an MCP? |
Ok, I guess my main take on this is that backward references are exceptional, therefore they should be explicit. If some symbol is satisfied through a backward referece, the it's more likely that it's some kind of missing unspecified dependency, rather than the intent. If we look at Rust dependecies as an example, cargo and rustc try very carefully to have all crate dependencies specified. What specific way can be used to specify backreferences is probably not too important right now, if the main users are libc and libgcc like mentioned in the PR description, so it can be unstable and anything will go (an attribute or maybe a command line option, mentioning specific symbols or maybe whole libraries). |
The dynamic library order does matter because we are using |
Based on discussions with @pnkfelix, I'm going to rework this patch to provide a separate option that enables linker groups, so that people can turn it on and experiment with it, but leave them disabled by default for now. |
@joshtriplett any updates on this? |
Discussed this with @pnkfelix today, and he's going to update the PR to add a |
triage: @joshtriplett @pnkfelix what's the status of this PR? |
I'm going to close this for now; I don't have the bandwidth to pursue broader static linking reform at this time. |
change stdlib circular dependencies handling Remove group_start and group_end, add dependencies to symbols.o Implements the suggestion from rust-lang#85805 (comment) r? `@petrochenkov`
Some native library sets, such as components of libc, libgcc, or
compiler-builtins, require backwards or circular references.
Additionally, because link order doesn't matter for native libraries
when using dynamic linking, many crates don't take it into account for
static linking either; this can introduce platform-specific issues.
Rather than make the order of
#[link]
lines orcargo:rustc-link-lib
lines semantically significant (and potentially even require specifying
some libraries twice), wrap all libraries in linker groups. This ensures
that symbols in all libraries resolve correctly regardless of library
order.
Linker groups of reasonable sizes don't have a substantial cost, but
putting the entirey of a large link line inside a linker group can
add enough cost to be noticeable. To minimize the cost of linker groups,
use separate groups for local native libraries, upstream crates, and
upstream native libraries. This doesn't allow backreferences between
those groups, but in most cases we won't want such backreferences.