Add specification for TLS descriptors #373

Merged: 2 commits, Sep 13, 2023

@ishitatsuyuki ishitatsuyuki commented Mar 27, 2023

This is a draft to propose a TLS descriptor scheme as requested in #94.

I'm currently working on a prototype on the GNU toolchain to iron out potential issues, and until that's done this will remain a draft.

Rendered

@ishitatsuyuki
Contributor Author

Main changes since #94 (comment):

  • Scheduling / interleaving is allowed.
  • The compiler may choose the temporary registers used in the instruction sequence.
  • The descriptor structure is now four words long, which is enough space to accommodate the resolver, module_id, module_offset, and module_generation fields as used in glibc. This makes the GOT entries larger but avoids the need to separately allocate descriptor memory for dlopened modules. (Thanks Rui for the idea.)
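
As a rough illustration of the four-word layout described in the last bullet, the sketch below models the descriptor with Python's `ctypes`. The field names are taken from the bullet; the class name `TlsDescriptor` and the choice of `c_size_t` for the three data words are assumptions made for illustration, not spec text.

```python
import ctypes


class TlsDescriptor(ctypes.Structure):
    """Hypothetical model of the four-word descriptor stored in the GOT."""

    _fields_ = [
        ("resolver", ctypes.c_void_p),           # word 0: resolver entry point
        ("module_id", ctypes.c_size_t),          # words 1-3: data owned by the
        ("module_offset", ctypes.c_size_t),      # dynamic linker (names taken
        ("module_generation", ctypes.c_size_t),  # from the bullet above)
    ]


WORD_SIZE = ctypes.sizeof(ctypes.c_void_p)
DESCRIPTOR_WORDS = ctypes.sizeof(TlsDescriptor) // WORD_SIZE

if __name__ == "__main__":
    size = ctypes.sizeof(TlsDescriptor)
    print(f"{DESCRIPTOR_WORDS} words, {size} bytes on this host")
```

Because all four words live in the GOT entry itself, a dynamic linker following this layout would not need a separate allocation per descriptor, which is the trade-off the bullet describes.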

@enh-google

@rprichard --- any thoughts from the bionic implementation perspective?

riscv-elf.adoc Outdated

Description:: This relaxation can relax a sequence of the load address of a symbol or load/store with a thread-local symbol reference into a thread-pointer-relative instruction.

**TODO:** How do we handle load/stores? Should we annotate the load/store instruction with an additional relocation so that the low offset can be fused into the load/store?
Collaborator

I don't think we need to try to handle load/stores for TLSDESC because we don't handle even usual GOT-indirect load/stores. I think we can just forget about it.

Contributor Author

Removed the TODO, although I'll keep this thread open in case other reviewers want to share their opinion.

riscv-elf.adoc Outdated
Comment on lines 1604 to 1627
auipc a0, 0 // R_RISCV_TLSDESC_HI20 (symbol), R_RISCV_RELAX
lw t0, a0, 0 // R_RISCV_TLSDESC_LOAD_LO12_I (label), R_RISCV_RELAX
addi a0, a0, 0 // R_RISCV_TLSDESC_ADD_LO12_I (label), R_RISCV_RELAX
jalr t0, t0 // R_RISCV_TLSDESC_CALL (label), R_RISCV_RELAX
Collaborator

Don't you want to generalize registers with tX and tY?

Contributor Author

The pc-rel values in the examples are replaced with a concrete value, so I've also opted for concrete registers here. I don't have a strong opinion on this though.

Collaborator

I think it's useful to generalize the register as it's important for relaxation.

riscv-elf.adoc Outdated
Comment on lines 988 to 989
instruction scheduling, but the order of the 4 instructions must not change. The linker can use
the relocations to recognize the sequence and to perform relaxations.
Collaborator

I don't think we want to add a constraint on the order of these four instructions unless it is technically required. The linker can perform optimizations even if these instructions are given in an arbitrary order, no?

Contributor Author

I've reworded the constraints and explicitly stated that lw and addi may be reordered with respect to each other.
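
To make the relaxed ordering rule concrete, here is a minimal sketch of a checker for one TLSDESC sequence. The reworded spec text is not quoted in this thread, so the rule encoded here is an assumption inferred from the data dependencies in the listing: `auipc` produces the base register, `lw` and `addi` both consume it in either order, and `jalr` consumes their results. The function name is hypothetical, and the sketch only looks at the four instructions themselves, ignoring any unrelated instructions scheduled between them.

```python
def is_valid_tlsdesc_order(mnemonics):
    """Check one TLSDESC sequence, given its four mnemonics in program order."""
    # Exactly one of each instruction must be present.
    if sorted(mnemonics) != ["addi", "auipc", "jalr", "lw"]:
        return False
    # auipc must come first and jalr last; lw and addi may swap places.
    return mnemonics[0] == "auipc" and mnemonics[-1] == "jalr"


if __name__ == "__main__":
    print(is_valid_tlsdesc_order(["auipc", "addi", "lw", "jalr"]))
```

The listing under review uses `lw`; a 64-bit target would use a doubleword load instead, which this sketch does not model.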

riscv-elf.adoc Outdated
----

The `resolver` function is called with the address to the `tls_descriptor` itself,
and returns the address the TLS variable resolves to.
Collaborator

Doesn't it return the offset from TP?

Contributor Author

The intention was to save a redundant subtraction and addition when DTV is used.

But I now realize that the relaxation only works if the return value is specified as the TP offset, so I've changed the text to say offset from TP.
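
A small model may help show why returning the TP offset matters for relaxation. In the sketch below the resolver returns an offset and the caller adds `tp` (the LLD commit message later in this thread shows an `add a0, a0, tp` after the call). When the offset is known at link time, the call can be replaced by a constant and the trailing add stays unchanged. All names here (`Descriptor`, `static_resolver`, `tp_offset`, and so on) are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Descriptor:
    resolver: Callable[["Descriptor"], int]
    tp_offset: int = 0  # hypothetical slot used by the static-TLS resolver


def static_resolver(desc: Descriptor) -> int:
    """Fast path: the variable lives in static TLS, so the offset is fixed."""
    return desc.tp_offset


def tls_address(tp: int, desc: Descriptor) -> int:
    """Model of the resolver call followed by `add a0, a0, tp`."""
    return tp + desc.resolver(desc)


def relaxed_tls_address(tp: int, link_time_offset: int) -> int:
    """Model of the relaxed form: the call is replaced by a constant offset."""
    return tp + link_time_offset
```

If the resolver returned an absolute address instead, the relaxed sequence would have to drop or rewrite the trailing add as well, which is the problem this comment describes.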

riscv-elf.adoc Outdated
auipc tX, %tlsdesc_hi(symbol) // R_RISCV_TLSDESC_HI20 (symbol)
lw tY, tX, %tlsdesc_lo_load(label) // R_RISCV_TLSDESC_LOAD_LO12_I (label)
addi a0, tX, %tlsdesc_lo_add(label) // R_RISCV_TLSDESC_ADD_LO12_I (label)
jalr t0, tY // R_RISCV_TLSDESC_CALL (label)
Contributor Author

Need to specify the assembly syntax for this.

(Reminder to self)

Contributor Author

Changed to .tlsdesc_call, following AArch64. Let me know in case you prefer something else, syntax or naming wise.

Contributor Author

Received a comment from @jrtc27 (during meeting) that a directive might be error-prone when considering interactions with R_RISCV_ALIGN and its padding bytes. Let me see if there's a consistent way to incorporate the % syntax into the jalr line.

Contributor Author

Changed back to %tlsdesc_call and it uses a syntax consistent with what we're already doing in %tprel_add.

@ishitatsuyuki
Contributor Author

Thanks Paul for the spec text suggestions; it reads a lot more naturally now.

riscv-elf.adoc Outdated
Comment on lines 963 to 976
Up to 3 `unsigned long` may be stored inline within the descriptor. Dynamic linker
implementations may use this to avoid a separate allocation to store data associated
with the symbol.
Contributor Author

To address a question from the last meeting (https://github.com/riscv-admin/psabi/blob/master/MINUTES/meeting-20230403.adoc):

glibc, bionic and BSD libc use (ti_module, ti_offset, gen_count). musl uses (ti_module, ti_offset). Both cases should happily fit in the 3-slot specification.

riscv-elf.adoc Outdated
Comment on lines 967 to 969
The TLS descriptor `resolver` is called with a special calling convention where all
registers are callee-saved, except `a0` which is used to pass the argument and `t0`
which is used as the alternate link register.
Contributor Author

Question: Should vector registers be caller or callee saved? Callee save is better for the fast path, but for the slow path it can suffer from over-conservative register saving.

On x86, SSE/AVX registers follow standard calling convention instead.
On AArch64, Q (NEON) registers are callee saved, but SVE registers follow standard calling convention instead.

If we're gonna follow suit, then vector registers will be caller saved as in the provisional Vector ABI docs.

Collaborator

Is there a guarantee that the TLS resolver will be a leaf function? Vector registers should all be caller-saved if it is not a leaf function, since mem* or str* might use a vector implementation and you never know which functions call them. In that case you would still need to save all vector registers before any function call in the TLS resolver.

Contributor Author

The fast path is a leaf function while the slow path almost definitely is not. The resolver is written in assembly and conditionally saves registers if it needs to enter the slow path. In that sequence, we could save vector registers as well, if callee-saved vector registers are desired.

In glibc the slow path is only hit once per (module / thread combination) to allocate the storage, so actually it might make more sense to make vector regs callee saved as well.

Collaborator

Sounds like it is worth having a special calling convention. I guess one challenge here is how to detect the presence of vector extensions, but that should be addressed in upstream glibc soon via hardware probing.

Contributor Author

Tentatively clarified the text to say that vector registers are also callee-saved if vector extension is supported. Opinions welcome.

@ishitatsuyuki
Contributor Author

I wonder if the use of labels might pose an inconvenience for compiler implementations. I'm currently trying to implement this in GCC, but AFAIU the way to introduce labels in GCC also creates a jump target, which in turn results in more blocks in the control flow graph (and inhibits many optimizations).

Label-utilizing relocations also exist for the la family of instructions, but they can be handled by the assembler in those cases since scheduling is not needed there. Doing the same for TLSDESC means giving up on scheduling opportunities as well as regalloc flexibility in some cases, but both might be negligible after all.

If anyone here is familiar with LLVM or other compilers, what do you think?

  • Does the compiler have a way to introduce labels that do not affect control flow?
  • If labels are un-ergonomic, is there a better way to convey information necessary for relaxation?

@jrtc27
Collaborator

jrtc27 commented May 22, 2023

GCC's opaque blocks of textual assembly for instructions is pretty terrible if that's what you're referring to, yes. LLVM does not suffer from such a limitation. It will gladly split up things like LLA into its constituent instructions and optimise them (https://godbolt.org/z/53xobnvxz). I don't know if there are mechanisms to do similar things in GCC, but if not it's long overdue.

@ishitatsuyuki
Contributor Author

Rebased. As a note for anyone who is potentially implementing, the relocation magic numbers have been shifted since some numbers have been taken by other spec updates.

Other changes:

  • Clarified that vector registers are also callee-saved.
  • .tlsdesc_call uses a directive instead of an assembler function, since it does not relate to an immediate constant within the instruction.

@ishitatsuyuki
Contributor Author

ishitatsuyuki commented May 22, 2023

> GCC's opaque blocks of textual assembly for instructions is pretty terrible if that's what you're referring to, yes. LLVM does not suffer from such a limitation. It will gladly split up things like LLA into its constituent instructions and optimise them (godbolt.org/z/53xobnvxz). I don't know if there are mechanisms to do similar things in GCC, but if not it's long overdue.

It indeed looks like GCC falls short in this regard. It looks like LLVM handles the label stuff fine, so in that case I think we can proceed with the current way of specifying (relaxation-related) relocations.

@ishitatsuyuki
Contributor Author

One small spec update: the %tlsdesc_lo family of functions is now unified, since the assembler can determine the relocation to use from the opcode. This also fixes some errors in the spec where %tlsdesc_lo_load and %tlsdesc_load_lo were both appearing in examples.

I have my WIP glibc tree here: https://github.com/ishitatsuyuki/glibc/tree/rv-tlsdesc. It doesn't handle vector calling conventions yet (I'm generally unsure about the state of vector integration in the toolchain). For testing, I hand-crafted some assembly, compiled it with a quickly hacked-up GNU assembler, and linked it with mold (linker support is required to generate the GOT entries; the code is at rui314/mold#1041). It should work with both static TLS and dlopen TLS, but if there are bugs please let me know (I should also be able to do in-depth tests after I finish work on the GCC side).

netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Sep 28, 2023
New features:

- We now use BLAKE3 as a cryptographic hash function instead of SHA256.
  This change has made --build-id a few percent faster. libssl is no longer
  a build dependency.
- mold is now a few percent faster than the previous version due to an
  optimization of string merging code path.
- mold now emits slightly optimized code for thread-local variable accesses.
- [RISC-V] mold now supports TLSDESC relocations. TLSDESC is a new mechanism
  for faster thread-local variable access. We (@ishitatsuyuki) actually led
  the effort to ratify the specification (riscv-non-isa/riscv-elf-psabi-doc#373)
  and implement it to compiler toolchain including GCC, GNU binutils and,
  of course, mold.

Bug fixes and compatibility improvements:

- mold no longer marks an as-needed .so as "needed" if the .so file is not
  directly used by the output file. Previously, mold marked a .so file as
  "needed" if the .so file was used by another "needed" .so file.
- [PPC64] --execute-only now works on 64-bit PowerPC.
ishitatsuyuki added a commit to ishitatsuyuki/gcc that referenced this pull request Nov 20, 2023
This implements TLS Descriptors (TLSDESC) as specified in [1].

The 4-instruction sequence is implemented as a single RTX insn for
simplicity, but this can be revisited later if instruction scheduling or
more flexible RA is desired.

The default remains to be the traditional TLS model, but can be configured
with --with-tls={trad,desc}. The choice can be revisited once toolchain
and libc support ships.

[1]: riscv-non-isa/riscv-elf-psabi-doc#373.

gcc/Changelog:
	* config/riscv/riscv.opt: Add -mtls-dialect to configure TLS flavor.
	* config.gcc: Add --with-tls configuration option to change the
	default TLS flavor.
	* config/riscv/riscv.h: Add TARGET_TLSDESC determined from
	-mtls-dialect and with_tls defaults.
	* config/riscv/riscv-opts.h: Define enum riscv_tls_type for the
	two TLS flavors.
	* config/riscv/riscv-protos.h: Define SYMBOL_TLSDESC symbol type.
	* config/riscv/riscv.md: Add instruction sequence for TLSDESC.
	* config/riscv/riscv.cc (riscv_symbol_insns): Add instruction
	sequence length data for TLSDESC.
	(riscv_legitimize_tls_address): Add lowering of TLSDESC.
	* doc/install.texi: Document --with-tls for RISC-V.
	* doc/invoke.texi: Document -mtls-dialect for RISC-V.
ishitatsuyuki added a commit to ishitatsuyuki/gcc that referenced this pull request Dec 4, 2023
ilovepi added a commit to ilovepi/llvm-project that referenced this pull request Jan 4, 2024
This patch adds support for RISC-V TLSDESC relocations, as described in
riscv-non-isa/riscv-elf-psabi-doc#373.

It does not attempt to handle relaxation for these cases, which will be
handled separately.
ilovepi added a commit to ilovepi/llvm-project that referenced this pull request Jan 23, 2024
This patch adds basic TLSDESC support for the global dynamic case in the
RISC-V backend by adding new relocation types for TLSDESC, as prescribed
in riscv-non-isa/riscv-elf-psabi-doc#373.

We also add a new pseudo instruction to simplify code generation.

Possible improvements for the local dynamic case will be addressed in separate
patches.

The current implementation is only enabled when passing the
-riscv-enable-tlsdesc flag.
MaskRay pushed a commit to MaskRay/llvm-project that referenced this pull request Jan 23, 2024
ilovepi added a commit to ilovepi/llvm-project that referenced this pull request Jan 23, 2024
This patch adds basic TLSDESC support for the global dynamic case in the
RISC-V backend by adding new relocation types for TLSDESC, as prescribed
in riscv-non-isa/riscv-elf-psabi-doc#373.

We also add a new pseudo instruction to simplify code generation.

Possible improvements for the local dynamic case will be addressed in separate
patches.

The current implementation is only enabled when passing the -enable-tlsdesc
codegen flag.
ilovepi added a commit to llvm/llvm-project that referenced this pull request Jan 24, 2024
This patch adds basic TLSDESC support in the RISC-V backend.

Specifically, we add new relocation types for TLSDESC, as prescribed in 
riscv-non-isa/riscv-elf-psabi-doc#373, and add a
new pseudo instruction to simplify code generation.

This patch does not try to optimize the local dynamic case, which can be
improved in separate patches. 

Linker side changes will also be handled separately.

The current implementation is only enabled when passing the new
`-enable-tlsdesc` codegen flag.
MaskRay added a commit to llvm/llvm-project that referenced this pull request Jan 25, 2024
Support
R_RISCV_TLSDESC_HI20/R_RISCV_TLSDESC_LOAD_LO12/R_RISCV_TLSDESC_ADD_LO12/R_RISCV_TLSDESC_CALL.
LOAD_LO12/ADD_LO12/CALL relocations reference a label at the HI20
location, which requires special handling. We save the value of HI20 to
be reused. Two interleaved TLSDESC code sequences, which compilers do
not generate, are unsupported.

For -no-pie/-pie links, TLSDESC to initial-exec or local-exec
optimizations are eligible. Implement the relevant hooks
(R_RELAX_TLS_GD_TO_LE, R_RELAX_TLS_GD_TO_IE): the first two instructions
are converted to NOP while the latter two are converted to a GOT load or
a lui+addi.

The first two instructions, which would be converted to NOP, are removed
instead in the presence of relaxation. Relaxation is eligible as long as
the R_RISCV_TLSDESC_HI20 relocation has a pairing R_RISCV_RELAX,
regardless of whether the following instructions have a R_RISCV_RELAX.
In addition, for the TLSDESC to LE optimization (`lui a0,<hi20>; addi a0,a0,<lo12>`),
`lui` can be removed (i.e. use the short form) if hi20 is 0.

```
// TLSDESC to LE/IE optimization
.Ltlsdesc_hi2:
  auipc a4, %tlsdesc_hi(c)                      # if relax: remove; otherwise, NOP
  load  a5, %tlsdesc_load_lo(.Ltlsdesc_hi2)(a4) # if relax: remove; otherwise, NOP
  addi  a0, a4, %tlsdesc_add_lo(.Ltlsdesc_hi2)  # if LE && !hi20 {if relax: remove; otherwise, NOP}
  jalr  t0, 0(a5), %tlsdesc_call(.Ltlsdesc_hi2)
  add   a0, a0, tp
```
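
The `lui`/`addi` short-form rule mentioned before the listing can be sketched as follows. This is an illustrative model, not LLD's code: the function names are hypothetical, the split uses the usual RISC-V convention of rounding the upper part so that the low 12 bits can be sign-extended, and the `zero` source register in the short form is an assumption.

```python
def split_hi20_lo12(offset):
    """Split a signed offset into the lui (hi20) and addi (lo12) immediates."""
    hi20 = ((offset + 0x800) >> 12) & 0xFFFFF  # round so lo12 can be negative
    lo12 = offset & 0xFFF
    if lo12 >= 0x800:                          # sign-extend the low 12 bits
        lo12 -= 0x1000
    return hi20, lo12


def tlsdesc_to_le(offset):
    """Return the instructions left after a TLSDESC to LE rewrite."""
    hi20, lo12 = split_hi20_lo12(offset)
    if hi20 == 0:
        # Short form: no lui needed (register operands are an assumption).
        return [f"addi a0, zero, {lo12}"]
    return [f"lui a0, {hi20:#x}", f"addi a0, a0, {lo12}"]


if __name__ == "__main__":
    for off in (0x10, 0x1800):
        print(hex(off), tlsdesc_to_le(off))
```

For example, an offset of 0x10 needs only the `addi`, while 0x1800 splits into hi20 = 2 and lo12 = -2048 and keeps both instructions.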

The implementation carefully ensures that an instruction unrelated to
the current TLSDESC code sequence is not converted to NOP, even if it
immediately follows a removable instruction (HI20, LOAD_LO12, or the
LE-specific ADD_LO12).
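
The per-instruction comments in the listing above can be read as a small decision table. The sketch below encodes only what those comments state; the function name and the returned strings are hypothetical, and "rewrite" stands in for the conversion to a GOT load or `lui`+`addi`, which the commit message does not spell out instruction by instruction.

```python
def tlsdesc_action(reloc, relax, to_le, hi20_is_zero):
    """Decide what happens to one instruction of an optimized TLSDESC sequence.

    reloc is one of "HI20", "LOAD_LO12", "ADD_LO12", "CALL".
    """
    if reloc in ("HI20", "LOAD_LO12"):
        # The first two instructions are always dead after the optimization.
        return "remove" if relax else "nop"
    if reloc == "ADD_LO12" and to_le and hi20_is_zero:
        # LE short form: the instruction that would become lui is not needed.
        return "remove" if relax else "nop"
    # Everything else is rewritten in place.
    return "rewrite"
```

Here `relax` models whether the `R_RISCV_TLSDESC_HI20` relocation has a paired `R_RISCV_RELAX`, which per the commit message is what makes removal eligible.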

* `riscv64-tlsdesc.s` is inspired by `i386-tlsdesc-gd.s` (https://reviews.llvm.org/D112582).
* `riscv64-tlsdesc-relax.s` tests linker relaxation.
* `riscv-tlsdesc-gd-mixed.s` is inspired by `x86-64-tlsdesc-gd-mixed.s` (https://reviews.llvm.org/D116900).

Link: riscv-non-isa/riscv-elf-psabi-doc#373

Reviewed By: ilovepi

Pull Request: #79239
llvmbot pushed a commit to llvmbot/llvm-project that referenced this pull request Jan 25, 2024
(cherry picked from commit 1117fdd)
tstellar pushed a commit to llvm/llvm-project that referenced this pull request Jan 27, 2024
(cherry picked from commit 1117fdd)
tstellar pushed a commit to tstellar/llvm-project that referenced this pull request Feb 14, 2024
(cherry picked from commit 1117fdd)
ishitatsuyuki added a commit to ishitatsuyuki/gcc that referenced this pull request Mar 27, 2024
This implements TLS Descriptors (TLSDESC) as specified in [1].

The 4-instruction sequence is implemented as a single RTX insn for
simplicity, but this can be revisited later if instruction scheduling or
more flexible RA is desired.

The default remains the traditional TLS model, but it can be configured
with --with-tls={trad,desc}. The choice can be revisited once toolchain
and libc support ships.

[1]: riscv-non-isa/riscv-elf-psabi-doc#373.

gcc/ChangeLog:
	* config/riscv/riscv.opt: Add -mtls-dialect to configure TLS flavor.
	* config.gcc: Add --with-tls configuration option to change the
	default TLS flavor.
	* config/riscv/riscv.h: Add TARGET_TLSDESC determined from
	-mtls-dialect and with_tls defaults.
	* config/riscv/riscv-opts.h: Define enum riscv_tls_type for the
	two TLS flavors.
	* config/riscv/riscv-protos.h: Define SYMBOL_TLSDESC symbol type.
	* config/riscv/riscv.md: Add instruction sequence for TLSDESC.
	* config/riscv/riscv.cc (riscv_symbol_insns): Add instruction
	sequence length data for TLSDESC.
	(riscv_legitimize_tls_address): Add lowering of TLSDESC.
	* doc/install.texi: Document --with-tls for RISC-V.
	* doc/invoke.texi: Document -mtls-dialect for RISC-V.
ishitatsuyuki added a commit to ishitatsuyuki/gcc that referenced this pull request Mar 29, 2024
This implements TLS Descriptors (TLSDESC) as specified in [1].

The 4-instruction sequence is implemented as a single RTX insn for
simplicity, but this can be revisited later if instruction scheduling or
more flexible RA is desired.

The default remains the traditional TLS model, but it can be configured
with --with-tls={trad,desc}. The choice can be revisited once toolchain
and libc support ships.

[1]: riscv-non-isa/riscv-elf-psabi-doc#373.

gcc/ChangeLog:
	* config/riscv/riscv.opt: Add -mtls-dialect to configure TLS flavor.
	* config.gcc: Add --with-tls configuration option to change the
	default TLS flavor.
	* config/riscv/riscv.h: Add TARGET_TLSDESC determined from
	-mtls-dialect and with_tls defaults.
	* config/riscv/riscv-opts.h: Define enum riscv_tls_type for the
	two TLS flavors.
	* config/riscv/riscv-protos.h: Define SYMBOL_TLSDESC symbol type.
	* config/riscv/riscv.md: Add instruction sequence for TLSDESC.
	* config/riscv/riscv.cc (riscv_symbol_insns): Add instruction
	sequence length data for TLSDESC.
	(riscv_legitimize_tls_address): Add lowering of TLSDESC.
	* doc/install.texi: Document --with-tls for RISC-V.
	* doc/invoke.texi: Document -mtls-dialect for RISC-V.
	* testsuite/gcc.target/riscv/tls_1.x: Add TLSDESC GD test case.
	* testsuite/gcc.target/riscv/tlsdesc.c: Same as above.
vathpela pushed a commit to vathpela/gcc that referenced this pull request Apr 8, 2024
This implements TLS Descriptors (TLSDESC) as specified in [1].

The 4-instruction sequence is implemented as a single RTX insn for
simplicity, but this can be revisited later if instruction scheduling or
more flexible RA is desired.

The default remains the traditional TLS model, but it can be configured
with --with-tls={trad,desc}. The choice can be revisited once toolchain
and libc support ships.

[1]: riscv-non-isa/riscv-elf-psabi-doc#373.

gcc/ChangeLog:

	* config/riscv/riscv.opt: Add -mtls-dialect to configure TLS flavor.
	* config.gcc: Add --with-tls configuration option to change the
	default TLS flavor.
	* config/riscv/riscv.h: Add TARGET_TLSDESC determined from
	-mtls-dialect and with_tls defaults.
	* config/riscv/riscv-opts.h: Define enum riscv_tls_type for the
	two TLS flavors.
	* config/riscv/riscv-protos.h: Define SYMBOL_TLSDESC symbol type.
	* config/riscv/riscv.md: Add instruction sequence for TLSDESC.
	* config/riscv/riscv.cc (riscv_symbol_insns): Add instruction
	sequence length data for TLSDESC.
	(riscv_legitimize_tls_address): Add lowering of TLSDESC.
	* doc/install.texi: Document --with-tls for RISC-V.
	* doc/invoke.texi: Document -mtls-dialect for RISC-V.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/tls_1.x: Add TLSDESC GD test case.
	* gcc.target/riscv/tlsdesc.c: Same as above.