-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LLVM Bitcode Linker: A self contained linker for nvptx and other targets #117458
Conversation
r? @oli-obk (rustbot has picked a reviewer for you, use r? to override) |
@rustbot label +O-NVPTX |
This comment has been minimized.
This comment has been minimized.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
Cc: @workingjubilee as I remember we talked about this in this issue denzp/rust-ptx-linker#34 (comment) |
Can we require fat LTO for nvptx? When rustc does fat LTO it already combines all crates into a single object file. |
I'm really excited to hear that PTX might gain a maintained linker again! Two years ago I was using Rust to write single-source kernels for NVIDIA GPUs, where LTO and inlining were crucial. I at some point had to fork the ptx-linker crate to keep LTO working with newer LLVM versions. It would be awesome if linking would "just work" instead again, so that the linker's is kept in sync and supports fat LTO. Thank you so much for working on this ❤️ |
It would certainly be possible, rust is capable of calling the right functions in llvm to make it work. But it doesn't work "out of the box" and I'm not certain it's the correct thing to do long term eiteher. If we decided to use only When it comes to compiler change we would still need to add When I tested it just now it seems like As an addition, there might be future benefits an approach as this embedded-linker can have to other targets than ptx as well. Maybe it can be used to bring thin lto to embedded targets where the native linker doesn't support it. It can also provide compilation without requiring to install an external target specific linker. Is this an approach we would like to explore further even with the known deficiencies listed above? Is the main advantage that we avoid yet another tool that needs to be maintained? Or that we get more control using libLLVM compared to calling the llvm tools? |
That will be fixed soon. There is an open PR for making compiler-builtins participate in LTO. |
5a9b1dc
to
282b8d3
Compare
This comment has been minimized.
This comment has been minimized.
This comment was marked as resolved.
This comment was marked as resolved.
There are a few questions in my original post, but I think the one that stops me from moving forward is figuring out whether this is an approach that can be accepted. Are there any processes required to get something like this MR approved (MCP?) or is it just a matter of completing the implementation? Insight like "this is probably not going to be accepted, but bjorn3's alternative probably will be" would also be very helpful to help me avoid putting effort into something that is probably not going to go anywhere. |
I planned to review it in a couple of weeks. |
I think the general approach is fine, we can ship a small additional tool, especially if it's not enabled by default. |
0995767
to
843dd28
Compare
@petrochenkov Yet another conflict in @rustbot label -S-waiting-on-author +S-waiting-on-review |
@bors r- |
@bors r+ |
…kingjubilee Rollup of 15 pull requests Successful merges: - rust-lang#116791 (Allow codegen backends to opt-out of parallel codegen) - rust-lang#116793 (Allow targets to override default codegen backend) - rust-lang#117458 (LLVM Bitcode Linker: A self contained linker for nvptx and other targets) - rust-lang#119385 (Fix type resolution of associated const equality bounds (take 2)) - rust-lang#121438 (std support for wasm32 panic=unwind) - rust-lang#121893 (Add tests (and a bit of cleanup) for interior mut handling in promotion and const-checking) - rust-lang#122080 (Clarity improvements to `DropTree`) - rust-lang#122152 (Improve diagnostics for parenthesized type arguments) - rust-lang#122166 (Remove the unused `field_remapping` field from `TypeLowering`) - rust-lang#122249 (interpret: do not call machine read hooks during validation) - rust-lang#122299 (Store backtrace for `must_produce_diag`) - rust-lang#122318 (Revision-related tweaks for next-solver tests) - rust-lang#122320 (Use ptradd for vtable indexing) - rust-lang#122328 (unix_sigpipe: Replace `inherit` with `sig_dfl` in syntax tests) - rust-lang#122330 (bootstrap readme: fix, improve, update) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of rust-lang#117458 - kjetilkjeka:embedded-linker, r=petrochenkov LLVM Bitcode Linker: A self contained linker for nvptx and other targets This PR introduces a new linker named `llvm-bitcode-linker`. It is a `self-contained` linker that can be used to link programs in `llbc` before optimizing and compiling to native code. It will first be used internally in the Rust compiler to enable tests for the `nvptx64-nvidia-cuda` target as the original `rust-ptx-linker` is deprecated. It will then be provided to users of the `nvptx64-nvidia-cuda` target with the purpose of linking ptx. More targets than nvptx will also be supported eventually. The PR introduces a new unstable `LinkerFlavor` for the compiler. The compiler will also not be shipped with rustc but most likely instead be shipped in it's own unstable component (a follow up PR will be opened for this). This means that merging this PR should not add any stability guarantees. When more details of `self-contained` is implemented it will only be possible to use the linker when `-Clink-self-contained=+linker` is passed. <details> <summary>Original Description</summary> **When this PR was created it was focused a bit differently. The original text is preserved here in case there's some interests in it** I have experimenting with approaches to replace the ptx-linker and enable the nvptx target tests again. I think it's time to get some feedback on the approach. ### The problem The only useful linker for the nvptx target is [this crate](https://github.com/denzp/rust-ptx-linker). Since this linker performs linking on llvm bitcode it needs to track the llvm version of rustc and use the same format. It has not been maintained for 3+ years and must be considered abandoned. Over the years rust have upgraded LLVM while the linker has been left to bitrot. It is no longer in a usable state. Due to the difficulty of keeping the ptx-linker up to date outside of tree the nvptx tests was [disabled a long time ago](rust-lang@f8f9a28). It was [previously discussed](rust-lang#96842 (comment)) if adding the ptx-linker to the rust repo would be a possibility. My efforts in doing this stopped at getting an answered if the license would prohibit it from inclusion in the [Rust repo](rust-lang#96842 (comment)). I therefore concluded that a re-write would be necessary. ### The possible solution presented here The llvm tools know perfectly well how to link and optimize llvm bitcode. Each of them only perform a single task, and are therefore a bit cumbersome to call with the current linker approach rustc takes. This PR adds a simple tool (current name `embedded-linker`) which can link self contained (often embedded) programs in llvm bitcode before compiling to the target format. Optimization will also be performed if lto is enabled. The rust compiler will make a single invocation to this tool, while the tool will orchestrate the many calls to the llvm tools. ### The questions - Is having control over the nvptx linking and therefore also tests worth it to add such tool? or should the tool live outside the rust repo? - Is the approach of calling llvm tools acceptable? Or would we want to keep the ptx-linker approach of using the llvm library? The tools seems to provide more simplicity and stability, but more intermediate files are being written. Perhaps there also are some performance penalty for the calling tools approach. - What is the process for adding such tool? MCP? - Does adding `llvm-link` to the llvm-tool component require any process? - Does it require some sort of FCP to remove ptx-linker as the default linker for ptx? Or is it sufficient that using the upstream ptx-linker is broken in its current state. it is possible to use a somewhat patched version of ptx-linker. </details>
…mponent, r=Mark-Simulacrum Distribute LLVM bitcode linker as a preview component The self-contained LLVM bitcode linker was recently added in rust-lang#117458. It is currently only in use to link the Nvidia ptx assembly tests when running rustc tests (local or CI). In fact, the linker itself is currently only usable for the `nvptx64-nvidia-cuda` target, but more targets will be supported in the future. The reason a new linker was needed for the ptx format is that the [old one](https://github.com/denzp/rust-ptx-linker) has not been updated the last few years. It worked fine for a while, but as LLVM changed it broke and the nvptx tests was [disabled in rustc back in 2019](rust-lang@f8f9a28). It was ad-hoc patched and have been used in a sub-optimal state by the community until now. If this PR is merged, the LLVM bitcode linker will be distributed as a preview component that can be used as a replacement for the old ptx-linker for development in addition to rustc tests. In addition to installing the `llvm-bitcode-linker` component, also the `llvm-tools` component must be installed as the `llvm-bitcode-linker` works by calling llvm tools. Even though the LLVM bitcode linker is in its early stages it already now provides a lot of value over the old ptx-linker just by working and using up-to-date llvm tooling. By shipping it as a component it will be easier to gather user experience and improving it. `@petrochenkov` when installing as a component it will be installed in the self-contained folder and will not work with `-Clink-self-contained=no` (although for some reason I expect to be a bug `-Clink-self-contained=-linker` doesn't properly disable it). However, when building using `x.py build` it will be placed in the `lib/rustlib/<target>/bin` directory and will be available for internal tests even if `-Clink-self-contained=no` is passed. CC: `@Mark-Simulacrum` as I very briefly discussed it with you some months ago https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/.E2.9C.94.20How.20to.20ship.20a.20new.20tool.20.28embedded.20linker.29.20to.20users.3F
…mponent, r=Mark-Simulacrum Distribute LLVM bitcode linker as a preview component The self-contained LLVM bitcode linker was recently added in rust-lang#117458. It is currently only in use to link the Nvidia ptx assembly tests when running rustc tests (local or CI). In fact, the linker itself is currently only usable for the `nvptx64-nvidia-cuda` target, but more targets will be supported in the future. The reason a new linker was needed for the ptx format is that the [old one](https://github.com/denzp/rust-ptx-linker) has not been updated the last few years. It worked fine for a while, but as LLVM changed it broke and the nvptx tests was [disabled in rustc back in 2019](rust-lang@f8f9a28). It was ad-hoc patched and have been used in a sub-optimal state by the community until now. If this PR is merged, the LLVM bitcode linker will be distributed as a preview component that can be used as a replacement for the old ptx-linker for development in addition to rustc tests. In addition to installing the `llvm-bitcode-linker` component, also the `llvm-tools` component must be installed as the `llvm-bitcode-linker` works by calling llvm tools. Even though the LLVM bitcode linker is in its early stages it already now provides a lot of value over the old ptx-linker just by working and using up-to-date llvm tooling. By shipping it as a component it will be easier to gather user experience and improving it. ``@petrochenkov`` when installing as a component it will be installed in the self-contained folder and will not work with `-Clink-self-contained=no` (although for some reason I expect to be a bug `-Clink-self-contained=-linker` doesn't properly disable it). However, when building using `x.py build` it will be placed in the `lib/rustlib/<target>/bin` directory and will be available for internal tests even if `-Clink-self-contained=no` is passed. CC: ``@Mark-Simulacrum`` as I very briefly discussed it with you some months ago https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/.E2.9C.94.20How.20to.20ship.20a.20new.20tool.20.28embedded.20linker.29.20to.20users.3F
Rollup merge of rust-lang#123423 - kjetilkjeka:llvm_bitcode_linker_component, r=Mark-Simulacrum Distribute LLVM bitcode linker as a preview component The self-contained LLVM bitcode linker was recently added in rust-lang#117458. It is currently only in use to link the Nvidia ptx assembly tests when running rustc tests (local or CI). In fact, the linker itself is currently only usable for the `nvptx64-nvidia-cuda` target, but more targets will be supported in the future. The reason a new linker was needed for the ptx format is that the [old one](https://github.com/denzp/rust-ptx-linker) has not been updated the last few years. It worked fine for a while, but as LLVM changed it broke and the nvptx tests was [disabled in rustc back in 2019](rust-lang@f8f9a28). It was ad-hoc patched and have been used in a sub-optimal state by the community until now. If this PR is merged, the LLVM bitcode linker will be distributed as a preview component that can be used as a replacement for the old ptx-linker for development in addition to rustc tests. In addition to installing the `llvm-bitcode-linker` component, also the `llvm-tools` component must be installed as the `llvm-bitcode-linker` works by calling llvm tools. Even though the LLVM bitcode linker is in its early stages it already now provides a lot of value over the old ptx-linker just by working and using up-to-date llvm tooling. By shipping it as a component it will be easier to gather user experience and improving it. ``@petrochenkov`` when installing as a component it will be installed in the self-contained folder and will not work with `-Clink-self-contained=no` (although for some reason I expect to be a bug `-Clink-self-contained=-linker` doesn't properly disable it). However, when building using `x.py build` it will be placed in the `lib/rustlib/<target>/bin` directory and will be available for internal tests even if `-Clink-self-contained=no` is passed. CC: ``@Mark-Simulacrum`` as I very briefly discussed it with you some months ago https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/.E2.9C.94.20How.20to.20ship.20a.20new.20tool.20.28embedded.20linker.29.20to.20users.3F
…ct, r=davidtwco NVPTX: Avoid PassMode::Direct for args in C abi Fixes rust-lang#117480 I must admit that I'm confused about `PassMode` altogether, is there a good sum-up threads for this anywhere? I'm especially confused about how "indirect" and "byval" goes together. To me it seems like "indirect" basically means "use a indirection through a pointer", while "byval" basically means "do not use indirection through a pointer". The return used to keep `PassMode::Direct` for small aggregates. It turns out that `make_indirect` messes up the tests and one way to fix it is to keep `PassMode::Direct` for all aggregates. I have mostly seen this PassMode mentioned for args. Is it also a problem for returns? When experimenting with `byval` as an alternative i ran into [this assert](https://github.com/rust-lang/rust/blob/61a3eea8043cc1c7a09c2adda884e27ffa8a1172/compiler/rustc_codegen_llvm/src/abi.rs#L463C22-L463C22) I have added tests for the same kind of types that is already tested for the "ptx-kernel" abi. The tests cannot be enabled until something like rust-lang#117458 is completed and merged. CC: `@RalfJung` since you seem to be the expert on this and have already helped me out tremendously CC: `@RDambrosio016` in case this influence your work on `rustc_codegen_nvvm` `@rustbot` label +O-NVPTX
…ct, r=davidtwco NVPTX: Avoid PassMode::Direct for args in C abi Fixes rust-lang#117480 I must admit that I'm confused about `PassMode` altogether, is there a good sum-up threads for this anywhere? I'm especially confused about how "indirect" and "byval" goes together. To me it seems like "indirect" basically means "use a indirection through a pointer", while "byval" basically means "do not use indirection through a pointer". The return used to keep `PassMode::Direct` for small aggregates. It turns out that `make_indirect` messes up the tests and one way to fix it is to keep `PassMode::Direct` for all aggregates. I have mostly seen this PassMode mentioned for args. Is it also a problem for returns? When experimenting with `byval` as an alternative i ran into [this assert](https://github.com/rust-lang/rust/blob/61a3eea8043cc1c7a09c2adda884e27ffa8a1172/compiler/rustc_codegen_llvm/src/abi.rs#L463C22-L463C22) I have added tests for the same kind of types that is already tested for the "ptx-kernel" abi. The tests cannot be enabled until something like rust-lang#117458 is completed and merged. CC: ``@RalfJung`` since you seem to be the expert on this and have already helped me out tremendously CC: ``@RDambrosio016`` in case this influence your work on `rustc_codegen_nvvm` ``@rustbot`` label +O-NVPTX
Rollup merge of rust-lang#117671 - kjetilkjeka:nvptx_c_abi_avoid_direct, r=davidtwco NVPTX: Avoid PassMode::Direct for args in C abi Fixes rust-lang#117480 I must admit that I'm confused about `PassMode` altogether, is there a good sum-up threads for this anywhere? I'm especially confused about how "indirect" and "byval" goes together. To me it seems like "indirect" basically means "use a indirection through a pointer", while "byval" basically means "do not use indirection through a pointer". The return used to keep `PassMode::Direct` for small aggregates. It turns out that `make_indirect` messes up the tests and one way to fix it is to keep `PassMode::Direct` for all aggregates. I have mostly seen this PassMode mentioned for args. Is it also a problem for returns? When experimenting with `byval` as an alternative i ran into [this assert](https://github.com/rust-lang/rust/blob/61a3eea8043cc1c7a09c2adda884e27ffa8a1172/compiler/rustc_codegen_llvm/src/abi.rs#L463C22-L463C22) I have added tests for the same kind of types that is already tested for the "ptx-kernel" abi. The tests cannot be enabled until something like rust-lang#117458 is completed and merged. CC: ``@RalfJung`` since you seem to be the expert on this and have already helped me out tremendously CC: ``@RDambrosio016`` in case this influence your work on `rustc_codegen_nvvm` ``@rustbot`` label +O-NVPTX
This PR introduces a new linker named
llvm-bitcode-linker
. It is aself-contained
linker that can be used to link programs inllbc
before optimizing and compiling to native code. It will first be used internally in the Rust compiler to enable tests for thenvptx64-nvidia-cuda
target as the originalrust-ptx-linker
is deprecated. It will then be provided to users of thenvptx64-nvidia-cuda
target with the purpose of linking ptx. More targets than nvptx will also be supported eventually.The PR introduces a new unstable
LinkerFlavor
for the compiler. The compiler will also not be shipped with rustc but most likely instead be shipped in it's own unstable component (a follow up PR will be opened for this). This means that merging this PR should not add any stability guarantees.When more details of
self-contained
is implemented it will only be possible to use the linker when-Clink-self-contained=+linker
is passed.Original Description
When this PR was created it was focused a bit differently. The original text is preserved here in case there's some interests in it
I have experimenting with approaches to replace the ptx-linker and enable the nvptx target tests again. I think it's time to get some feedback on the approach.
The problem
The only useful linker for the nvptx target is this crate. Since this linker performs linking on llvm bitcode it needs to track the llvm version of rustc and use the same format. It has not been maintained for 3+ years and must be considered abandoned. Over the years rust have upgraded LLVM while the linker has been left to bitrot. It is no longer in a usable state.
Due to the difficulty of keeping the ptx-linker up to date outside of tree the nvptx tests was disabled a long time ago. It was previously discussed if adding the ptx-linker to the rust repo would be a possibility. My efforts in doing this stopped at getting an answered if the license would prohibit it from inclusion in the Rust repo. I therefore concluded that a re-write would be necessary.
The possible solution presented here
The llvm tools know perfectly well how to link and optimize llvm bitcode. Each of them only perform a single task, and are therefore a bit cumbersome to call with the current linker approach rustc takes.
This PR adds a simple tool (current name
embedded-linker
) which can link self contained (often embedded) programs in llvm bitcode before compiling to the target format. Optimization will also be performed if lto is enabled. The rust compiler will make a single invocation to this tool, while the tool will orchestrate the many calls to the llvm tools.The questions
llvm-link
to the llvm-tool component require any process?