Formal support for linking rlibs using a non-Rust linker #73632

Open
adetaylor opened this issue Jun 22, 2020 · 97 comments
Labels
A-linkage Area: linking into static, shared libraries and binaries

Comments

@adetaylor
Contributor

I'm working on a major existing C++ project which hopes to dip its toes into the Rusty waters. We:

  • Use a non-Cargo build system with static dependency rules (and 40000+ targets)
  • Sometimes build a single big binary; sometimes lots of shared objects, unit test executables, etc. - each containing various parts of our dependency tree.
  • Perform final linking using an existing C++ toolchain (based on LLVM 11 as it happens)
  • Want to have a few Rust components scattered throughout a very deep dependency tree, which may eventually roll up into one or multiple binaries

We can't:

  • Switch from our existing linker to rustc for final linking. C++ is the boss in our codebase; we're not ready to make the commitment to put Rust in charge of our final linking.
  • Create a Rust staticlib for each of our Rust components. This works if we're using Rust in only one place. For any binary containing several Rust components, there would be binary bloat and potentially violations of the one-definition-rule, by duplication of the Rust stdlib and any diamond dependencies.
  • Create a single Rust staticlib containing all our Rust components, then link that into every binary. That monster static library would depend on many C++ symbols, which wouldn't be present in some circumstances.

We can either:

  1. Create a Rust staticlib for each of our output binaries, using rustc and an auto-generated .rs file containing lots of extern crate statements. Or,
  2. Pass the rlib for each Rust component directly into the final C++ linking procedure.
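As a rough illustration of option 1, the auto-generated .rs file could be produced by a small generator along these lines. This is only a sketch: `umbrella_source` and its `crate_names` input are hypothetical names, and it assumes the build system already knows which Rust crates sit in a given binary's dependency tree. The resulting file would then be compiled with `--crate-type=staticlib`.

```rust
/// Renders the source of an "umbrella" crate that names every Rust
/// component one C++ binary depends on.
///
/// `crate_names` is a hypothetical input: the build system is assumed to
/// know which rlibs sit in the binary's dependency tree.
pub fn umbrella_source(crate_names: &[&str]) -> String {
    // Package names may contain '-', but crate names use '_'.
    let mut names: Vec<String> = crate_names
        .iter()
        .map(|name| name.replace('-', "_"))
        .collect();
    // Sort and deduplicate so the output is stable across rebuilds.
    names.sort();
    names.dedup();

    let mut source = String::from("// @generated umbrella crate; do not edit.\n");
    for name in &names {
        source.push_str("extern crate ");
        source.push_str(name);
        source.push_str(";\n");
    }
    source
}
```

Calling `umbrella_source(&["imm2", "imm1", "imm2"])` yields one `extern crate` line per distinct crate, so a diamond dependency is named only once.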

The first approach is officially supported, but is hard because:

  • We need to create a Rust staticlib as part of our C++ tool invocations. This is awkward in our build system. Our C++ targets don't keep track of Rust compiler flags (--target, etc.) and in general it just feels weird to be doing Rust stuff in C++ targets.
  • Specifically, we need to invoke a Python wrapper script to consider invoking rustc to make a staticlib for every single one of our C++ link targets. For most of our targets (especially unit test targets) there will be no rlibs in their dependency tree, so it will be a no-op. But the presence of this wrapper script will make Rust adoption appear intrusive, and of course will have some small actual performance cost.
  • For those link targets which do include Rust code, we'll delay invocation of the main linker whilst we build a Rust static library.

The second approach is not officially supported. An rlib is an internal implementation format within Rust, and its only client is rustc. It is naughty to pass them directly into our own linker command line.

But it does, currently, work. It makes our build process much simpler and makes use of Rust less disruptive.

Because external toolchains are not expected to consume rlibs, some magic is required:

  • The final C++ linker needs to pull in all the Rust stdlib rlibs, which would be easy apart from the fact they contain the symbol metadata hash in their names.
  • We need to remap __rust_alloc to __rdl_alloc etc.
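To make the first point concrete, here is a minimal sketch of how a build script might recover the crate name from a hashed stdlib rlib file name. It assumes the files follow a `lib<name>-<hash>.rlib` pattern with a hexadecimal hash; the helper name and the example hashes are made up for illustration, and the naming scheme is not a documented guarantee.

```rust
/// Returns the crate name encoded in an rlib file name such as
/// `libstd-0123456789abcdef.rlib`, or `None` if the name does not fit
/// the assumed `lib<name>-<hash>.rlib` pattern.
pub fn crate_name_from_rlib(file_name: &str) -> Option<&str> {
    let stem = file_name.strip_prefix("lib")?.strip_suffix(".rlib")?;
    // The hash is assumed to be the part after the last '-'.
    let (name, hash) = stem.rsplit_once('-')?;
    let hash_is_hex = !hash.is_empty() && hash.chars().all(|c| c.is_ascii_hexdigit());
    if name.is_empty() || !hash_is_hex {
        return None;
    }
    Some(name)
}
```

A build system could use something like this to match `std`, `core`, `alloc` and friends without hard-coding the hash, which changes between toolchain versions.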

But obviously the bigger concern is that this is not a supported model, and Rust is free to break the rlib format at any moment.

Is there any appetite for making this a supported model for those with mixed C/C++/Rust codebases?

I'm assuming the answer may be 'no' because it would tie Rust's hands for future rlib format changes. But just in case: how's about the following steps?

  1. The Linkage section of the Rust reference is enhanced to list the two current strategies for linking C++ and Rust. Either:
    • Use rustc as the final linker; or
    • Build a Rust staticlib or cdylib then pass that to your existing final linker
      (I think this would be worth explicitly explaining anyway, so unless anyone objects, I may raise a PR)
  2. A new rustc --print stdrlibs (or similar) which will output the names of all the standard library rlibs (not just their directory, which is already possible with target-libdir)
  3. Some kind of new rustc option which generates a rust-dynamic-symbols.o file (or similar) containing the codegen which is otherwise done by rustc at final link-time (e.g. symbols to call __rdl_alloc from __rust_alloc, etc.)
  4. The Linkage section of the book is enhanced to list this as a third supported workflow. (You can use whatever linker you want, but make sure you link to rust-dynamic-symbols.o and everything output by rustc --print stdrlibs)
  5. Somehow, we add some tests to ensure this workflow doesn't break.
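Until something like the proposed `--print stdrlibs` exists, step 2 can be approximated by listing the `.rlib` files in the directory that `rustc --print target-libdir` reports. The sketch below does that; the function names are hypothetical, and the directory may contain rlibs that are not part of the standard library (for example if extra toolchain components are installed), so treat the result as a starting point rather than an authoritative list.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::Command;

/// Asks rustc for the directory holding the target's libraries.
/// `--print target-libdir` is an existing rustc option.
pub fn target_libdir() -> io::Result<PathBuf> {
    let output = Command::new("rustc")
        .arg("--print")
        .arg("target-libdir")
        .output()?;
    if !output.status.success() {
        return Err(io::Error::new(io::ErrorKind::Other, "rustc failed"));
    }
    let text = String::from_utf8_lossy(&output.stdout);
    Ok(PathBuf::from(text.trim()))
}

/// Lists every `*.rlib` file directly inside `dir`, sorted by path.
pub fn rlibs_in_dir(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        let is_rlib = path.extension().and_then(|ext| ext.to_str()) == Some("rlib");
        if is_rlib && path.is_file() {
            found.push(path);
        }
    }
    found.sort();
    Ok(found)
}
```

A caller would combine the two, roughly `rlibs_in_dir(&target_libdir()?)`, and pass the resulting paths to the final linker.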

A few related issues:

@japaric @alexcrichton @retep998 @dtolnay I believe this may be the sort of thing you may wish to comment upon! I'm sure you'll come up with reasons why this is even harder than I already think. Thanks very much in advance.

@nagisa
Member

nagisa commented Jun 22, 2020

The second approach is not officially supported. An rlib is an internal implementation format within Rust, and its only client is rustc. It is naughty to pass them directly into our own linker command line.

But it does, currently, work. It makes our build process much simpler and makes use of Rust less disruptive.

rlib is not a complete package you may need to link. It could very well just be a section containing just rmeta (serialized "metadata" representing checked Rust code) from which rustc would generate code at the time it is "forced" by generating a binary output (link/staticlib/cdylib). The fact that rlib contains any machine code at all is just an optimisation.

What you really want is a staticlib-nobundle (not an actual thing) or something similar that would not bundle the dependencies into the produced .a.

Related reading at #38913

@petrochenkov
Contributor

I still don't understand exactly what the requirements are, but if rlibs (as they exist now) work for you and staticlibs don't, then perhaps you need some variation of staticlibs with different bundling behavior for dependencies, rather than rlibs.

@adetaylor
Contributor Author

The fact that rlib contains any machine code at all is just an optimisation.

Yep.

What you really want is a staticlib-nobundle (not an actual thing) or something similar that would not bundle the dependencies into the produced .a.

I agree - this is exactly what we need.

@petrochenkov
Contributor

@adetaylor
Could you elaborate a bit?
Suppose you have a Rust crate root with some tree of dependencies

  imm1 - dup
 /
root
 \ 
  imm2 - dup
   \
    indirect - native

where imm1, imm2 and dup are other rust crates and native is some native dependency like C static library.

What is the expected result of compiling root supposed to be?

@adetaylor
Contributor Author

Right now, it would be a staticlib and that's fine.

The problem comes in this scenario:

  imm1 - dup
 /
root
 \ 
  imm2 - dup

root2
 \ 
  imm2 - dup

c_plus_plus_binary_a
 \ 
  root2 - [...]

c_plus_plus_binary_b
 \ 
  root1 - [...]

  root1 - [...]
 /
c_plus_plus_binary_c
 \ 
  root2 - [...]

In this case, we can't build root and root2 as staticlibs, because they will conflict when we come to build c_plus_plus_binary_c.
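The conflict can be made concrete with a small model of the graph above. The sketch below assumes, as described earlier in the thread, that a staticlib bundles the whole transitive closure of its root crate; the type and function names are hypothetical, and the standard library (which would also be duplicated) is left out for brevity.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Crate name -> direct dependencies.
pub type DepGraph<'a> = BTreeMap<&'a str, Vec<&'a str>>;

/// The Rust part of the graph from the comment above.
pub fn example_graph() -> DepGraph<'static> {
    let mut graph = DepGraph::new();
    graph.insert("root", vec!["imm1", "imm2"]);
    graph.insert("root2", vec!["imm2"]);
    graph.insert("imm1", vec!["dup"]);
    graph.insert("imm2", vec!["dup"]);
    graph.insert("dup", vec![]);
    graph
}

/// Every crate a staticlib built from `top` would bundle (its closure).
pub fn bundled_crates<'a>(graph: &DepGraph<'a>, top: &'a str) -> BTreeSet<&'a str> {
    let mut seen = BTreeSet::new();
    let mut pending = vec![top];
    while let Some(name) = pending.pop() {
        // Only expand a crate the first time it is reached.
        if seen.insert(name) {
            if let Some(deps) = graph.get(name) {
                pending.extend(deps.iter().copied());
            }
        }
    }
    seen
}

/// Crates that would be defined twice if both staticlibs were linked together.
pub fn duplicated_between<'a>(graph: &DepGraph<'a>, first: &'a str, second: &'a str) -> Vec<&'a str> {
    let in_first = bundled_crates(graph, first);
    let in_second = bundled_crates(graph, second);
    in_first.intersection(&in_second).copied().collect()
}
```

For this graph, `duplicated_between(&example_graph(), "root", "root2")` reports `dup` and `imm2`, which is exactly what c_plus_plus_binary_c would receive twice.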

@petrochenkov
Contributor

I don't mean right now, I mean what compilation of root and root2 should produce to fit into your use case.

libroot.a + libimm1.a + libimm2.a + libdup.a + libroot2.a + some file in json or other machine-readable format describing the dependency trees?

@adetaylor
Contributor Author

Exactly! Except that the JSON file probably isn't necessary. Our build system (gn + ninja) works extremely hard to establish static dependency relationships in advance, so that a rebuild builds the minimal required set. We would expect to use those relationships in the subsequent linker invocation unless there were some reason that we needed to query it dynamically.

@alexcrichton
Member

This seems like a reasonable feature to me to support. The goal of rlibs was to reserve space for rustc to "do its thing" so we intentionally stabilize some format rather than accidentally do so. I don't think there's a need for a "staticlib-nobundle" because that's basically what an rlib is. While @nagisa is right in that rlibs can have varying contents, I think that it's fine to say for a use case like this you'd pass some flag to rustc saying "make sure I can pass this to the linker". Rustc, after all, goes to great lengths to pass rlibs directly to the linker to make linking fast.

I think this could work well perhaps with -Z binary-dep-depinfo (I forget the exact flag name). That'd basically tell you "here's all the other rlibs you need to link for this target" which would locate the standard library for you and other deps (which you'd probably ignore).

The gotchas that I could think of are:

  • For allocators we'd probably just need to say "no custom allocator" or something similar. You'd probably run rustc with no arguments saying "generate me the allocator *.a file to pass to the linker", and that'd just run as part of your build system. That's also where we could shove anything else if it comes up in the future.

  • You'll be on your own for deduplicating Rust dependencies across dynamic libraries. In your example of c_plus_plus_binary_c it's generally asking for trouble if root1 and root2 are dynamic libraries and try to share dependencies. The compiler has internal support for this but you'd just be on your own. (this is totally fine, just wanted to point out!)

Overall I think we could get away with this by carefully designing how a "stable" pass-things-to-the-linker process would work. We could make sure to solve the constraints here while also giving ourselves some leeway so rustc can still change things in the future if needed. It seems worthwhile to me to integrate into other build systems like this, and it's definitely a pain point with rlibs since inception.

@Alexendoo Alexendoo added the A-linkage Area: linking into static, shared libraries and binaries label Jun 24, 2020
@nagisa
Member

nagisa commented Jun 25, 2020

I think that it's fine to say for a use case like this you'd pass some flag to rustc saying "make sure I can pass this to the linker".

My suspicion is that the best and most holistic implementation of this kind of feature is represented by having a different crate-type alongside rlib. Implementing functionality that reliably satisfies the requirements outlined by the OP can require a fair number of adjustments to how monomorphizations are generated, where they end up and what their visibility ends up being. I think the answers to these and many other similar questions are meaningfully different enough for rlib and the hypothetical libstatic-nobundle that the two shouldn't be conflated.

EDIT: Regardless of which way it goes, this feels to me like a fairly major change, at least in the stance on what rlib is. It would require extensive design work, an RFC and a dedicated implementer willing to implement said RFC.

@adetaylor
Contributor Author

Just to say: thanks all for the comments. I'm delighted that this doesn't seem like a completely idiotic request. From our (user's) point of view it is equally easy if this is a crate-type=static-nobundle as opposed to crate-type=rlib, either's fine.

We will bear in mind that this seems like a plausible way forward as we continue to think about our linking strategies. We'll go quiet for a bit, but it's possible that in a few months we will pop up here and say "hi, we'd like to implement this, here's an RFC...". Watch this space. Obviously if anyone else gets there first that's even better :)

@bbjornse
Contributor

bbjornse commented Jun 6, 2021

@adetaylor may I ask: have you found a workable solution? Or did you have to abandon Rust because of this issue?

In my experimentation I have used --emit obj to produce an object file, and create an archive from this (and other C++ object files part of the same library) using ar. I then use the system linker to produce the final artifacts. Is this not a viable solution?

@bjorn3
Member

bjorn3 commented Jun 6, 2021

In my experimentation I have used --emit obj to produce an object file, and create an archive from this (and other C++ object files part of the same library) using ar. I then use the system linker to produce the final artifacts. Is this not a viable solution?

That doesn't work when using liballoc as the allocator shim is only generated when compiling a crate that needs to be linked and even then doesn't get included in the --emit obj object file, but a separate object file that gets deleted after linking. It can't be included in the main object file because, with --crate-type lib --crate-type bin, the allocator shim must be included for the bin but must not be included for the lib.

@adetaylor
Contributor Author

@bbjornse We haven't abandoned Rust, but we also haven't made any real progress in this direction.

I know several other teams at various organizations are using the approach of building rlibs then passing them all into the final C++ linker. This still works. It still requires nasty hacks as described in this issue description, and could break at any time.

@bbjornse
Contributor

bbjornse commented Jun 7, 2021

That doesn't work when using liballoc as the allocator shim is only generated when compiling a crate that needs to be linked and even then doesn't get included in the --emit obj object file, but a separate object file that gets deleted after linking.

Interesting, thanks. Could you please explain: what exactly is the allocator shim, and why is it generated rather than part of a library? Does this relate to the __rust_alloc => __rdl_alloc remapping mentioned in the issue description?

Also, do you know if this is the only challenge with the object file approach? Are there other pieces which might be missing?

@bbjornse We haven't abandoned Rust, but we also haven't made any real progress in this direction.

I know several other teams at various organizations are using the approach of building rlibs then passing them all into the final C++ linker. This still works. It still requires nasty hacks as described in this issue description, and could break at any time

Thank you for your reply. This is somewhat disheartening. I'm still not abandoning hope for the --emit obj approach though: as far as I could tell the output was similar to that of the rlib, sans rmeta sections, and as you say using the rlibs directly can be made to work. Although passing rlibs directly to the native linker might never be officially supported, it sounds reasonable to me to expect that object files generated with --emit obj can be used in this way.

@bjorn3
Member

bjorn3 commented Jun 7, 2021

Interesting, thanks. Could you please explain: what exactly is the allocator shim, and why is it generated rather than part of a library?

The allocator shim is what forwards the allocator calls in liballoc to either a #[global_allocator] or the default allocator in libstd. Expanding it in place for #[global_allocator] won't work because, if the crate with the #[global_allocator] and a crate linking to libstd get linked together, the libstd default allocator wins, as the dylib needs to have an allocator shim to link and the only allocator available when linking the dylib is libstd's default allocator.

Does this relate to the __rust_alloc => __rdl_alloc remapping mentioned in the issue description?

Correct

Also, do you know if this is the only challenge with the object file approach? Are there other pieces which might be missing?

Honestly I am not sure.

Although passing rlibs directly to the native linker might never be officially supported, it sounds reasonable to me to expect that object files generated with --emit obj can be used in this way.

Neither rlibs nor --emit obj contain the allocator shim. In fact they can't. For rlibs this is the reason mentioned earlier. For --emit obj this is because, with --crate-type lib --crate-type dylib, --emit obj will generate a single object file, but the rlib must not have the allocator shim and the dylib must have it.

@bjorn3
Member

bjorn3 commented Jun 7, 2021

I have been thinking about the Rust compilation model lately because of this and certain other deficiencies. No concrete actionable improvement yet though.

@bbjornse
Contributor

bbjornse commented Jun 8, 2021

Thank you for your explanation. Have I understood correctly that

  • If any crate used as part of a program or dynamic library uses #[global_allocator], then that crate defines symbols __rg_alloc etc, and all references to __rust_alloc etc should be forwarded to __rg_* equivalents.
  • Otherwise, all references to __rust_alloc etc should be forwarded to __rdl_* equivalents.
  • No allocator shims are generated unless rustc is producing some kind of "final" artifact (--crate-type {bin,dylib,cdylib,staticlib} --emit link). In particular, they are not generated for --crate-type lib --emit rlib,obj.
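The forwarding rule in the first two bullets can be written down as a tiny lookup. This is purely a model of the rule as summarized here, not rustc's implementation: `shim_target` is a hypothetical helper, and the list of shimmed symbols is limited to the four named elsewhere in this thread. These symbols are unstable, and a later comment in this issue notes that the arrangement changed in 1.71.

```rust
/// Allocator entry points named in this thread (not an exhaustive list).
const SHIMMED: [&str; 4] = [
    "__rust_alloc",
    "__rust_dealloc",
    "__rust_realloc",
    "__rust_alloc_zeroed",
];

/// Models which symbol a `__rust_*` allocator call is forwarded to,
/// following the rule summarized above:
///   - with a #[global_allocator] somewhere in the link: `__rg_*`
///   - otherwise (libstd's default allocator):           `__rdl_*`
/// Returns `None` for symbols that are not part of the shim.
pub fn shim_target(symbol: &str, has_global_allocator: bool) -> Option<String> {
    if !SHIMMED.iter().any(|known| *known == symbol) {
        return None;
    }
    let suffix = symbol.strip_prefix("__rust_")?;
    let prefix = if has_global_allocator { "__rg_" } else { "__rdl_" };
    Some(format!("{}{}", prefix, suffix))
}
```

The key point the model captures is that the choice depends on `has_global_allocator`, which is a property of the whole link rather than of any single crate.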

I realize I don't understand why the rlib cannot include the shim. I see that it's only necessary when producing the final artifact, but I don't see how it would hurt to generate it immediately? (Unless I've misunderstood and #[global_allocator] is crate-local, of course.)

@bbjornse
Contributor

bbjornse commented Jun 8, 2021

To be clear, the relation between my questions and this issue is whether --emit obj can do the trick instead of formal support for linking with rlibs or a hypothetical new static-nobundle crate type. Although --emit obj suffers from the same allocator shim problems and standard library dependency discovery problems as noted in the OP, it does not suffer from the problem that rlibs aren't intended to be passed to the system linker. Rather, the object file output strikes me as something intended to be passed to the system linker. And I'm still optimistic that the two other issues can be solved.

I'm not sure what the object file is guaranteed to include and not include, though. Crucially, is it e.g. guaranteed to not include dependencies, thus avoiding diamond dependency problems and being an improvement over the use of staticlib?

@bjorn3
Member

bjorn3 commented Jun 8, 2021

I realize I don't understand why the rlib cannot include the shim. I see that it's only necessary when producing the final artifact, but I don't see how it would hurt to generate it immediately? (Unless I've misunderstood and #[global_allocator] is crate-local, of course.)

If you link to a dylib that already has an allocator shim, the rlib must not define an allocator shim, but the allocator shim of the dylib must be used. At the time of generating the rlib it is not yet known if it will be linked against a dylib. Only when compiling the final artifact.

@bbjornse
Contributor

I see, thanks. Is this only when the rlib shim would use the default allocator? Would it hurt to immediately generate the shim for crates which include a #[global_allocator]?

If crates require a shim at all, this means they use liballoc, right? And the prebuilt libstd already includes an allocator shim forwarding to the default allocator. How come this works, when another dylib might be providing a global allocator? Could a similar technique, linking with a "default allocator shim" dylib, work in a liballoc +no-std setting?

@bjorn3
Member

bjorn3 commented Jun 17, 2021

When you dynamically link against libstd or any other dylib that uses liballoc, #[global_allocator] will be silently ignored. Instead the allocator shim embedded in said dylib will be used. This wouldn't be possible if rlibs that use #[global_allocator] immediately created the allocator shim.

@anforowicz
Contributor

RE: --emit=obj

It seems to me that the --crate-type switch controls the main output type of rustc (e.g. rlib, staticlib, bin, etc.) while the --emit switch can optionally ask rustc to retain additional, intermediate file types (e.g. obj, but also mir, llvm-bc, dep-info, etc.). It feels that the --emit switch can be useful for debugging what is happening during compilation, but should not be depended upon for end-to-end results, since the intermediate/internal files produced when building an rlib crate may change in the future. From this perspective, depending on the --emit=obj switch doesn't seem significantly different or less risky than depending on the details of what --crate-type=rlib produces.

RE: monomorphization concerns

I see that @nagisa has raised concerns that monomorphization within rlib might be fundamentally incompatible with the requirements here. FWIW, I’ve tried to explore this a bit and didn’t find anything broken so far - please shout if I missed anything.

In particular, I tried building the following things:

  • crate1: pub fn sub_in_crate1<T: Sub<Output = T>>(a: T, b: T) -> T { a - b }
  • crate2a: #[no_mangle] pub fn sub_from_7_in_crate2a(x: i32) -> i32 { crate1::sub_in_crate1(7, x) }
  • crate2b: #[no_mangle] pub fn sub_from_42_in_crate2b(x: i32) -> i32 { crate1::sub_in_crate1(42, x) }
  • C++ executable that depends on crate2a and crate2b

There would indeed be 2 copies of the monomorphized sub_in_crate1::<i32>:

$ nm out/rel/rust_odr_test | grep crate1
00000000001f23d7 T _ZN20rust_odr_test_crate113sub_in_crate117h56a8a4a969227185E
00000000001f2422 T _ZN20rust_odr_test_crate113sub_in_crate117hf5a81c2f183f301cE

But this seems fine:

  • The linker should hopefully be able to identify and eliminate the duplicated, identical binary code (and reduce binary bloat)
  • The behavior of the compiled code seems unaffected:
    • I can’t see how having 2 copies might change the behavior of the compiled+linked code. (e.g. I don’t see a way to introduce a singleton that becomes bifurcated/duplicated because of the above)
    • Rust seems okay with having duplicated monomorphisations in 2 rlibs when linking the dependent rlibs into the final executable. (Maybe there is a deduplication step somewhere, but a cursory look didn’t find anything so far.) And so it should be hopefully okay for C++ linker as well.
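For readers who want to see the shape of the experiment in one place, here is a single-file stand-in in which modules play the role of the three crates. It is only a sketch: because everything lives in one crate, it will not reproduce the two separately monomorphized copies seen in the nm output, and `#[no_mangle]` is omitted since nothing here is called from C++.

```rust
// Stand-in for crate1: a generic function that each downstream
// crate instantiates for itself.
mod crate1 {
    use std::ops::Sub;

    pub fn sub_in_crate1<T: Sub<Output = T>>(a: T, b: T) -> T {
        a - b
    }
}

// Stand-in for crate2a: instantiates sub_in_crate1::<i32>.
mod crate2a {
    pub fn sub_from_7_in_crate2a(x: i32) -> i32 {
        super::crate1::sub_in_crate1(7, x)
    }
}

// Stand-in for crate2b: instantiates sub_in_crate1::<i32> a second time.
mod crate2b {
    pub fn sub_from_42_in_crate2b(x: i32) -> i32 {
        super::crate1::sub_in_crate1(42, x)
    }
}
```

In the real setup each of `crate2a` and `crate2b` is compiled as its own rlib, which is why each ends up carrying its own copy of the `i32` instantiation.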

(More concrete and complete example is in a Chromium-specific code review tool: see here)

RE: RFC

I guess the next step would be introducing an RFC (as suggested above), but before doing that it is probably desirable to

  • Make sure there are no issues with the general approach
  • Try prototyping the changes (tweaking link_staticlib to skip adding each_linked_rlib in the presence of a new, hypothetical -Z staticlib-with-no-transitive-deps switch)

WDYT?

@jsgf
Contributor

jsgf commented Jul 24, 2021

(--crate-type vs --emit)

One problem right now is that --emit is currently hashed into the crate hash, so, for example, --emit metadata and --emit link will generate different crate hashes for otherwise identical command lines. I'd really prefer this weren't the case, but I think this particular thing isn't relevant to this issue since by the time we get to linking crate hashes and rmeta should be irrelevant.

But aside from that, I'd agree that crate-type and emit have somewhat overlapping responsibilities (broadly, what kind of file(s) do I want?), but I'd characterize it as "crate-type sets the intrinsic properties of the output file" and "emit selects a number of auxiliary outputs which can be cheaply generated along the way". But --emit obj and --crate-type staticlib are very close to each other really, ignoring the whole "and link in all the dependants" thing. (Does --emit obj also force codegen-units = 1, or is that just --emit asm?)

00000000001f23d7 T _ZN20rust_odr_test_crate113sub_in_crate117h56a8a4a969227185E
00000000001f2422 T _ZN20rust_odr_test_crate113sub_in_crate117hf5a81c2f183f301cE

I'm curious how this looks with -Zsymbol-mangling-version=v0, so that whatever's causing the difference isn't hidden in the hash.

I'm super interested in this effort. Currently Buck uses staticlib for C++ -> Rust dep edges, and I'd really prefer to have this be rlib/staticlib-with-no-transitive-deps and set up the dependencies to arrange all the other stuff appear on the final (C++) link line.

@bbjornse
Contributor

(--crate-type vs --emit)

I agree that depending on --emit obj, without formal or documented guarantees about what the output will contain, feels fragile (like depending on linking with rlib). Case in point, when experimenting with this I found that --emit obj works differently with --crate-type lib and with --crate-type staticlib.

I don't feel that --emit is just a debugging option, though; rather, I understand it as an option to make rustc a powerful stand-alone tool usable in a non-Cargo context.

Reasons I still think --emit obj is a reasonable alternative:

  • In contrast to rlib, it makes sense to me that the output from --emit obj is guaranteed to include object code. This is an improvement over rlib, where the presence of object code is an implementation detail (as noted above.)
  • As @jsgf notes, it's staticlib without "link in all the dependants" - which is (almost) exactly what we want (i.e. an improvement over --crate-type staticlib)
  • It makes it easy to create static libraries from a mix of C++ and Rust code. This is useful for bundling C++ bridges for a Rust library; bundling Rust bridges with a C++ library; or splitting a single library in Rust and C++ components.
  • The capability to generate an object file is natural to expect when coming from a C or C++ world, and so is a useful feature in itself. In fact, it already is a feature in itself; it's just unclear to me at least how the output depends on --crate-type, and in what contexts the output will be usable.

@tmandry
Member

tmandry commented Aug 17, 2021

Case in point, when experimenting with this I found that --emit obj works differently with --crate-type lib and with --crate-type staticlib.

Isn't this expected if staticlib links in all dependencies and lib doesn't? Or were there other differences?

@bjorn3
Member

bjorn3 commented Aug 17, 2021

--emit obj doesn't link in dependencies as it doesn't link at all. It just emits an intermediate object file that would be given to the linker otherwise.

GregBowyer added a commit to GregBowyer/rules_rust that referenced this issue Jul 27, 2022
Previous work has made `rust_library` operate as a `CcInfo` provider. As far as I can tell this lets the compiler directly consume an rlib rather than a formal static or shared library crate type.

This is a fine experiment but a poor recommendation in the documentation. I would suggest that until rust-lang/rust#73632 is in a better place we don't recommend people use a `rust_library` target directly where `CcInfo` is expected. I found that this failed for me in ways similar to bazelbuild#1238 even without stating a custom allocator. After fixing this for Linux + GCC this then failed on macOS *and* Windows.

It seems that we are jumping the gun slightly and encouraging a behavior that only works with a specific configuration as the default, rather than as something to trial.
@petrochenkov
Contributor

petrochenkov commented Jul 28, 2022

I wanted to confirm one thing.
Scenarios in which people want to use rlibs as "regular static libraries" also assume -Z link-native-libraries=false, right?
I.e. the linking of libraries is managed entirely by some outer build system rather than rustc.

In #99429 we may want to change representation of native libraries bundled into .rlib files, and it would be easier to do if we knew that third parties do not need to rely on the currently used representation (all individual object files from native libraries are copied to the rlib).

UPD: Hmm, it looks like -Z link-native-libraries is not actually respected when bundling native libraries into rlibs, not sure whether it's intentional or not.

@bsilver8192

I wanted to confirm one thing. Scenarios in which people want to use rlibs as "regular static libraries" also assume -Z link-native-libraries=false, right? I.e. the linking of libraries is managed entirely by some outer build system rather than rustc.

+1 for my use case being like this, I want to manage those via Bazel.

@jsgf
Contributor

jsgf commented Jul 29, 2022

@petrochenkov Yes, that's what I'd expect - the plan here is to make the build system deal with all dependencies uniformly, Rust and non-Rust.

UPD: Hmm, it looks like -Z link-native-libraries is not actually respected when bundling native libraries into rlibs, not sure whether it's intentional or not.

No, I think this is a bug, and I think I got this the wrong way around when I last mentioned it. Right now it has the effect of ignoring the bundled native library references at link time, but they're still included. I think it would be much more useful to make this skip embedding on a crate-by-crate basis, and have a separate option to ignore any bundled libraries at link time.

This would allow a more incremental approach of skipping bundled native libraries on a case-by-case basis, but still allow them for libstd (until it has fully specified dependencies).

@pcwalton
Contributor

Hello! I've posted a pre-RFC for minimal stabilization of the rlib format on internals. Please feel free to comment there. Thanks!

@bjorn3
Member

bjorn3 commented May 25, 2023

As of #86844 (scheduled for the 1.71 release) if you are directly linking the rlibs of the standard library rather than letting rustc handle linking, you will now need to define a static named __rust_no_alloc_shim_is_unstable which is at least 1 byte big. In addition if you are using #[global_allocator], you must stop defining __rust_alloc, __rust_dealloc, __rust_realloc and __rust_alloc_zeroed as they are now directly defined by the #[global_allocator] expansion rather than as part of the allocator shim. If you are using the default allocator in libstd you will need to keep defining them though.

aarongable pushed a commit to chromium/chromium that referenced this issue May 26, 2023
rust-lang/rust#73632 (comment):
```
As of #86844 (scheduled for the 1.71 release) if you are directly
linking the rlibs of the standard library rather than letting rustc
handle linking, you will now need to define a static named
`__rust_no_alloc_shim_is_unstable` which is at least 1 byte big. In
addition if you are using `#[global_allocator]`, you must stop defining
`__rust_alloc`, `__rust_dealloc`, `__rust_realloc` and
`__rust_alloc_zeroed` as they are now directly defined by the
`#[global_allocator]` expansion rather than as part of the allocator
shim. If you are using the default allocator in libstd you will need
to keep defining them though.
```

We use the default allocator, so we can keep using our C++ shims as
long as we define this new symbol.

[email protected]

Bug: 1292038
Change-Id: Ib21c48d8b656155acc0afc6941ed85f819bc9bea
Cq-Include-Trybots: luci.chromium.try:android-rust-arm32-rel,android-rust-arm64-dbg,android-rust-arm64-rel,linux-rust-x64-rel,linux-rust-x64-dbg,win-rust-x64-dbg,mac-rust-x64-dbg,win-rust-x64-rel
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4569204
Reviewed-by: Adrian Taylor <[email protected]>
Commit-Queue: danakj <[email protected]>
Cr-Commit-Position: refs/heads/main@{#1149904}
@Lupus

Lupus commented Jun 26, 2023

Is there a somewhat future-proof workaround available as part of some open-source project that one could leverage? There are some recipes in this discussion, but it's not clear how one could reconstruct the required "ugly hacks" to get going while we are waiting for the right solution to make it upstream.

I'm building some bindings from Rust to OCaml. Everything worked great until I tried to link two such bindings libraries into one binary, which led me here with a bunch of linker errors in hand...

@keith
Contributor

keith commented Jun 26, 2023

Folks using bazel and rules_rust work around this today, so you can trace back through that code, or look at an example link command there, to see the result.

@danakj
Contributor

danakj commented Jun 26, 2023

The current workaround is to build Rust rlibs, not staticlibs, and link those as you would .a files. You must explicitly link the stdlib rlibs as well, though.

@Lupus

Lupus commented Jun 26, 2023

Is compiling an empty crate as a staticlib still a decent approach to avoid hunting for individual stdlib rlibs? The Bazel rules around that are quite wordy, it seems...

Other than that, does one need to parse the cargo manifest, build a dependency graph, build a list of the rlibs required for a particular crate, and pass that list to the linker?
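One possible reading of the graph-to-list step in that question is sketched below. It assumes the dependency graph has already been extracted from the manifest and contains no cycles; the `DepGraph` type, the `RlibsFor` function and the `lib<name>.rlib` naming are all hypothetical simplifications (real rlib file names often carry a hash suffix), and the stdlib rlibs would still have to be added separately.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical dependency graph: crate name -> names of its direct deps.
using DepGraph = std::map<std::string, std::vector<std::string>>;

namespace detail {
// Depth-first walk that records each crate once, after all of its deps.
void Visit(const DepGraph& graph, const std::string& crate,
           std::set<std::string>& seen,
           std::vector<std::string>& post_order) {
  if (!seen.insert(crate).second) return;  // Already handled (diamond dep).
  auto it = graph.find(crate);
  if (it != graph.end()) {
    for (const std::string& dep : it->second) {
      Visit(graph, dep, seen, post_order);
    }
  }
  post_order.push_back(crate);
}
}  // namespace detail

// Returns one rlib file name per crate reachable from `root`, with each
// crate listed before the crates it depends on (reverse post-order).
std::vector<std::string> RlibsFor(const DepGraph& graph,
                                  const std::string& root) {
  std::set<std::string> seen;
  std::vector<std::string> post_order;
  detail::Visit(graph, root, seen, post_order);

  std::vector<std::string> rlibs;
  for (auto it = post_order.rbegin(); it != post_order.rend(); ++it) {
    rlibs.push_back("lib" + *it + ".rlib");  // Assumed naming scheme.
  }
  return rlibs;
}
```

For a hypothetical graph in which `app` depends on `a` and `b`, and both of those depend on `common`, the returned list contains `libcommon.rlib` exactly once, in last position. Listing dependents before their dependencies matches what traditional static-archive linkers expect.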

@durin42
Contributor

durin42 commented Jun 26, 2023

AIUI, it's a decent approach if-and-only-if you can consolidate all your Rust bits into a single unified target. Otherwise you run the risk of pulling stuff in twice, which either bloats your binary or causes linker errors depending on how you do it.

UI-RayanWang pushed a commit to ubiquiti/ubnt_libjingle_component_src_build that referenced this issue Aug 16, 2023
@tgross35
Contributor

@pcwalton did you ever move forward from pre-rfc to rfc with that?

I don't think I've seen this yet (it's a long thread...), but could rustc maybe gain the ability to turn an rlib into a .a? It would be a pretty fast operation that would let us do something before we get to the point of stabilizing rlib (still +1 for the RFC, of course), and without needing to figure out a non-rlib distribution of std.

@bjorn3
Member

bjorn3 commented Sep 14, 2023

We already have the staticlib crate type for bundling everything into a single .a file. Turning individual rlibs into .a files won't work, as it would either duplicate the allocator shim and such between every such .a, or omit them from all and produce unlinkable .a files. One idea I have is to introduce a new crate type which is to staticlib as dylib is to cdylib. It would act like a dylib with respect to producing the allocator shim and bundling multiple crates, but would produce a .a file instead of a .so file and add the necessary crate metadata to allow consuming it like a regular crate from rustc. See also the end of #111594 (comment).

facebook-github-bot pushed a commit to facebook/buck2-prelude that referenced this issue Oct 31, 2023
Summary: This diff ports the `native_unbundle_deps` feature from buck1 to buck2. The effect of this change is to make all `cxx_library` -> `rust_library` edges in the dep graph result in linking `rlib` artifacts rather than `staticlib` artifacts. The benefit being that we can then avoid issues with duplicate deps across libraries. rust-lang/rust#73632 has much more detail about this approach.

Reviewed By: zertosh

Differential Revision: D50523279

fbshipit-source-id: 16ee22db9140dba1e882d86aa376a03e010ef352
facebook-github-bot pushed a commit to facebook/ocamlrep that referenced this issue Oct 31, 2023
facebook-github-bot pushed a commit to facebook/buck2 that referenced this issue Oct 31, 2023
copybara-service bot pushed a commit to chromeos/adhd that referenced this issue Dec 6, 2023
This aligns with ChromeOS Rust.

Also update rules_rust to latest release to pick up
bazelbuild/rules_rust@aa4b3a8
Which includes bazelbuild/rules_rust@4bd44d0
for __rust_no_alloc_shim_is_unstable which is required since Rust 1.71.0
as described in rust-lang/rust#73632 (comment)

BUG=None
TEST=cop

Change-Id: Ib579ee725f0adc233021fb07ba55046930f71f9c
Reviewed-on: https://chromium-review.googlesource.com/c/chromiumos/third_party/adhd/+/5095413
Tested-by: [email protected] <[email protected]>
Reviewed-by: Chih-Yang Hsia <[email protected]>
Commit-Queue: Li-Yu Yu <[email protected]>