Add R_RISCV_SET_ULEB128 and R_RISCV_SUB_ULEB128. #124

Closed

kuanlinchentw

In order to implement uleb128 subtraction, we need two relocations for the
linker to re-calculate the value after relaxing.
Please reference the commit: https://sourceware.org/ml/binutils/2019-11/msg00393.html

@MaskRay
Collaborator

MaskRay commented Dec 3, 2019

Can you give some examples where R_RISCV_SET_ULEB128 and R_RISCV_SUB_ULEB128 may be used? For BFD_RELOC_RISCV_SUB_ULEB128, how does the linker know the length of the value?

@Nelson1225
Collaborator

This seems good to me :)

I think this should be similar to R_RISCV_ADD[8|16|32|64] plus R_RISCV_SUB[8|16|32|64]. If we have .word L1 - L2, and L1 and L2 may change because of linker relaxation, then we need to fix up the value with the R_RISCV_ADD32 and R_RISCV_SUB32 relocations. So the new relocations R_RISCV_SET_ULEB128 and R_RISCV_SUB_ULEB128 are presumably used to fix up the value for .uleb128 L1 - L2.

For the GNU linker, we can use the existing APIs to get the length of a uleb128, and to read and write the uleb128 value. For linker relocation, the key point may be that we cannot reduce the code size when relocating, so even if the value L1 - L2 is reduced, we still have to keep the original length of the uleb128.
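
As a rough sketch of the length question (plain Python, not the BFD API; `read_uleb128` is a hypothetical helper name), a reader can recover both the value and the encoded length by following the continuation bits, and a zero-padded encoding decodes to the same value as the minimal one:

```python
def read_uleb128(buf, offset=0):
    """Decode one ULEB128 value; return (value, bytes_consumed)."""
    value = 0
    shift = 0
    length = 0
    while True:
        byte = buf[offset + length]
        length += 1
        value |= (byte & 0x7F) << shift   # low 7 bits carry the payload
        shift += 7
        if byte & 0x80 == 0:              # a clear high bit marks the last byte
            break
    return value, length


# 300 in its minimal 2-byte form, and padded out to 3 bytes.
minimal = bytes([0xAC, 0x02])
padded = bytes([0xAC, 0x82, 0x00])
print(read_uleb128(minimal))  # (300, 2)
print(read_uleb128(padded))   # (300, 3)
```

The padded form is what lets a linker keep the byte count fixed while the value shrinks.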

Currently we can already fix up the value of some data directives (.byte, .word, ...) after relaxation. So I think it would be nice if we could also handle .uleb128 data directives correctly after relaxation :)

Best regards
Nelson

@MaskRay
Collaborator

MaskRay commented Dec 5, 2019

How does the assembler represent .uleb128 L1 - L2? R_RISCV_SET_ULEB128 L1 + R_RISCV_SUB_ULEB128 L2 at the same offset?

When the linker sees R_RISCV_SUB_ULEB128, it is supposed to read the original value to get the length, perform the subtraction, then write the new value (which cannot be shorter than the value written by R_RISCV_SET_ULEB128). Then does more than one R_RISCV_SUB_ULEB128 work? I think not, because by then the value is no longer the original one, so you can't infer its length. These details should probably be noted down.

riscv-elf.md Outdated
@@ -904,4 +906,4 @@ wint_t | 4 | 4
The following definitions apply for all ABIs defined in this document. Here
there is no differentiation between ILP32 and LP64 abis.

`wchar_t` is signed. `wint_t` is unsigned.
Contributor

I can't actually figure out what the diff is here, but presumably it's accidental.

Contributor

This is a "newline at the end of file" change, I think. Either adding or removing it.

@jim-wilson
Collaborator

Fangrui Song asked for an example. Consider this testcase:
#include <iostream>
int
main (void)
{
  try
    {
      std::cout << "Hello world\n";
    }
  catch(...)
    {
      std::cout << "Caught exception\n";
    }
  return 0;
}
Compiling with -O -S with current rv32 tools, in the .gcc_except_table section, I see
.LLSDA1528:
.byte 0xff
.byte 0x9b
.byte 0x3d
.byte 0x3
.byte 0x34
.4byte .LEHB0-.LFB1528
.4byte .LEHE0-.LEHB0
We have to use .4byte for the text section label subtractions, because with linker relaxation these sizes could change. With Kuan-Lin's binutils patch, I get
.LLSDA1528:
.byte 0xff
.byte 0x9b
.uleb128 .LLSDATT1528-.LLSDATTD1528
.LLSDATTD1528:
.byte 0x1
.uleb128 .LLSDACSE1528-.LLSDACSB1528
.LLSDACSB1528:
.uleb128 .LEHB0-.LFB1528
.uleb128 .LEHE0-.LEHB0
and now we can use uleb128 for both the size of the gcc exception table, and for the text section label subtractions.

In the object file there are pairs of set/sub relocations in the patched output.
RELOCATION RECORDS FOR [.gcc_except_table]:
OFFSET TYPE VALUE
00000002 R_RISCV_SET_ULEB128 .LLSDATT1528
00000002 R_RISCV_SUB_ULEB128 .LLSDATTD1528
00000004 R_RISCV_SET_ULEB128 .LLSDACSE1528
00000004 R_RISCV_SUB_ULEB128 .LLSDACSB1528
00000005 R_RISCV_SET_ULEB128 .LEHB0
00000005 R_RISCV_SUB_ULEB128 .LFB1528
00000006 R_RISCV_SET_ULEB128 .LEHE0
00000006 R_RISCV_SUB_ULEB128 .LEHB0
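
To illustrate how such pairs might be resolved, here is a minimal Python sketch. It is not the binutils implementation: the helper names, the relocation tuple layout, and the symbol addresses are all made up for illustration. For each offset it computes the SET symbol minus the SUB symbol and rewrites the value in place, keeping the width the assembler reserved.

```python
def uleb128_width(buf, offset):
    """Length in bytes of the ULEB128 currently stored at offset."""
    n = 0
    while buf[offset + n] & 0x80:
        n += 1
    return n + 1


def write_uleb128_fixed(buf, offset, value, width):
    """Write value as ULEB128 in exactly `width` bytes, zero-padded."""
    if value < 0 or value >> (7 * width):
        raise ValueError("value does not fit in the reserved ULEB128 width")
    for i in range(width):
        byte = (value >> (7 * i)) & 0x7F
        if i < width - 1:
            byte |= 0x80              # continuation bit on all but the last byte
        buf[offset + i] = byte


def apply_uleb128_pairs(buf, relocs, symbols):
    """relocs: list of (offset, "SET" or "SUB", symbol); symbols: name -> address.

    Hypothetical helper; assumes every offset has exactly one SET and one SUB.
    """
    pending = {}
    for offset, kind, name in relocs:
        pending.setdefault(offset, {})[kind] = symbols[name]
    for offset, pair in pending.items():
        width = uleb128_width(buf, offset)   # read before overwriting
        write_uleb128_fixed(buf, offset, pair["SET"] - pair["SUB"], width)


section = bytearray([0xC8, 0x01])            # assembler-time value 200, two bytes
relocs = [(0, "SET", ".LEHB0"), (0, "SUB", ".LFB1528")]
symbols = {".LEHB0": 0x1064, ".LFB1528": 0x1000}   # invented post-relaxation addresses
apply_uleb128_pairs(section, relocs, symbols)
print(section.hex())  # e400
```

The result decodes to 100 but still occupies two bytes, so nothing after it in the section has to move.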

The use of uleb here then results in a smaller executable. Without the binutils patch I get
gamma05:2323$ size a.out
text data bss dec hex filename
587272 149435 6832 743539 b5873 a.out
and with the binutils patch I get
gamma05:2246$ size a.out
text data bss dec hex filename
587272 131721 6832 725825 b1341 a.out
The size decrease is entirely due to the smaller gcc_except_table section. This is using an embedded elf static libstdc++, so most of the size is coming from libstdc++.
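
A small Python sketch of where the saving comes from, using the `size` figures quoted above. The sample label differences are arbitrary and `uleb128_len` is a hypothetical helper: small differences need one or two ULEB128 bytes where `.4byte` always spends four.

```python
def uleb128_len(value):
    """Bytes in the minimal ULEB128 encoding of a non-negative value."""
    n = 1
    while value >= 0x80:
        value >>= 7
        n += 1
    return n


data_before = 149435   # data size without the binutils patch
data_after = 131721    # data size with the patch
print(data_before - data_after)  # 17714

# Arbitrary sample label differences: ULEB128 bytes versus a fixed .4byte.
for diff in (0x10, 0x7F, 0x80, 0x3FFF, 0x4000):
    print(hex(diff), uleb128_len(diff), "vs 4")
```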

On the other hand, object files get larger, because there are more relocations, and linker relaxation is going to be slower, as there is more stuff to relax.

@jim-wilson
Collaborator

Now that I've reviewed Kuan-Lin's binutils patch, I see that there is no actual relaxation of the uleb128 values. The uleb128 size is set at assembler time based on assembly time values, and there is an assumption that values can only decrease at link time due to relaxation. If a value does decrease enough to reduce the uleb128 size, then we add leading zeros to the value to ensure that the uleb128 size doesn't change. This trick only works for uleb128, not sleb128. But gcc only uses leb128 subtraction for pointer values which are unsigned, so all we need is uleb128 support.

Kuan-Lin's patch does require that there are matching SET_ULEB128 and SUB_ULEB128 relocs, in either order, and the linker gives an error if it sees one but not the other.

The assumption that the uleb128 address size can only decrease with relaxation is true only if both addresses are in the same section. If we are subtracting a text section address from a data section address, then that value could increase due to alignment padding between sections/segments. So I think these relocs should only be allowed when the symbols are in the same section. I think the binutils patch should be fixed to enforce this.
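
The pairing requirement and the proposed same-section restriction could be checked along these lines. This is only a sketch in Python: `check_uleb128_relocs`, the tuple layout, and the error strings are hypothetical and do not reflect what the binutils patch actually does.

```python
from collections import defaultdict


def check_uleb128_relocs(relocs, symbol_section):
    """relocs: iterable of (offset, kind, symbol) with kind "SET" or "SUB".
    symbol_section: symbol name -> section name.
    Returns a list of error strings, empty if everything is acceptable."""
    by_offset = defaultdict(dict)
    errors = []
    for offset, kind, symbol in relocs:
        if kind in by_offset[offset]:
            errors.append(f"{offset:#x}: duplicate {kind}_ULEB128")
        by_offset[offset][kind] = symbol
    for offset, pair in sorted(by_offset.items()):
        if set(pair) != {"SET", "SUB"}:
            errors.append(f"{offset:#x}: unpaired {next(iter(pair))}_ULEB128")
            continue
        if symbol_section[pair["SET"]] != symbol_section[pair["SUB"]]:
            errors.append(f"{offset:#x}: symbols are in different sections")
    return errors


sections = {".LEHE0": ".text", ".LEHB0": ".text", "var": ".data"}
print(check_uleb128_relocs(
    [(6, "SUB", ".LEHB0"), (6, "SET", ".LEHE0")], sections))   # []
print(check_uleb128_relocs(
    [(8, "SET", "var"), (8, "SUB", ".LEHB0")], sections))
```

Grouping by offset accepts the two relocations in either order, as described above.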

If Kuan-Lin's binutils patch goes in, gcc will automatically start emitting code using the new relocs due to a configure test. I don't want gcc to emit code by default that won't work with LLVM, so I think we should have buy in from LLVM folks before the binutils patch goes in.

@palmer-dabbelt
Contributor

I have a post or two on the binutils mailing list about this, but the current binutils approach isn't sufficient. There's at least two issues I know of:

  • Assuming we want to maintain two-byte alignment in unlinked object files, we can't handle alignment with R_RISCV_ALIGN unless we know the byte alignment of the
  • It's possible to produce invalid linker output with the current patch set, which happens for something like .uleb128 SYMBOL (ie, no subtraction).

IIRC there's examples on the mailing list. While I don't think there's anything fundamentally wrong with the ABI proposed here, we need at least one valid implementation (and ideally an implementation in both GCC and LLVM) for me to be happy taking it.

@jim-wilson
Collaborator

I don't see an alignment problem, because the size is fixed at assembly time. Maybe you are confusing this with the separate .p2align patch for binutils? Or maybe the issue here is that if someone puts a .uleb128 in the text section between instructions then we have a potential problem? I'd argue that people shouldn't do that, and we can warn/error for that, or we could disable the use of the set/sub relocs in that unusual case, or we could require an explicit align after the uleb128 directives. Or maybe you are saying if we later add relaxation we have a potential problem?

I didn't see a problem with uleb128 symbol when running the gcc testsuite, but that is no guarantee that there is no problem.

I will have to recheck your email again.

@palmer-dabbelt
Contributor

No, the .byte stuff is fine because the last bit of the instruction alignment is always statically known. The problem is trying to relax LEB128s, at which point we don't statically know that last alignment bit. The issue here is that we're trying to maintain two constraints: that instructions are two-byte aligned before linking (so they can be disassembled), and that instructions are at least two-byte aligned after linking.

Imagine we had the following code, in rv64:

.text
address:
.uleb128 address
.balign 2
c.nop

We'd need to emit a 10-byte sequence (presumably full invalid instructions) for the ULEB, as the address may be that big (it's rv64). R_RISCV_ALIGN is defined to align up to the next largest power of two, which means two-byte alignment can insert either 0 bytes or 1 byte. If we insert 0 bytes then pre-linking instruction decode works but the ULEB can't be relaxed to an odd number of bytes, while if we insert 1 byte then pre-linking instruction decode breaks.
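
The arithmetic behind this can be sketched in Python. The helper names are hypothetical, and this models only byte counts, not how R_RISCV_ALIGN is actually processed. A 64-bit value needs up to 10 ULEB128 bytes, and the padding needed to restore two-byte alignment depends on whether the final encoded length is odd or even.

```python
def max_uleb128_len(bits):
    """Bytes needed for the largest `bits`-wide unsigned value (7 payload bits per byte)."""
    return -(-bits // 7)          # ceiling division


def pad_to_align(offset, align):
    """Padding bytes needed to round offset up to a multiple of align."""
    return -offset % align


print(max_uleb128_len(64))  # 10
print(max_uleb128_len(32))  # 5

# Assume the ULEB starts at an even offset; list the padding that the
# following c.nop would need for each possible encoded length.
for length in range(1, max_uleb128_len(64) + 1):
    print(length, pad_to_align(length, 2))
```

The full 10-byte reservation needs no padding, while any odd relaxed length needs one byte, which is the conflict described above.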

We could probably cobble something together in the disassembler to make this work, but IIRC there were some other issues with the "round up to the next power of two" behavior of R_RISCV_ALIGN related to mixing RVC and no-RVC code. IIRC there were ways to fix those issues, but I don't think we ever got around to it.

Instead, I think we should introduce some sort of "R_RISCV_ALIGN_DOWN" relocation that aligns to the next lower power of two. This would allow us to relax ULEBs, but also allow us to avoid the correctness requirement around handling R_RISCV_ALIGN -- essentially the idea would be to keep the alignment correct before linking, while still allowing enough bytes to align after linking. That would allow simple linkers (ie, Linux's module loader) to ignore R_RISCV_ALIGN_DOWN, which would avoid baking "this loader doesn't support relaxation" into the ABI.

@Nelson1225
Collaborator

Nelson1225 commented Dec 18, 2019

According to Kuan-Lin's binutils patch,

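@@ -1512,6 +1532,25 @@ perform_relocation (const reloc_howto_type *howto,
      value = ENCODE_RVC_LUI_IMM (RISCV_CONST_HIGH_PART (value));
      break;

    case R_RISCV_SET_ULEB128:
    case R_RISCV_SUB_ULEB128:
      {
        unsigned int len = 0;
        bfd_byte *endp, *p;

        /* Read the existing value only to learn its encoded length. */
        _bfd_read_unsigned_leb128 (input_bfd, contents + rel->r_offset, &len);

        /* Clean the contents value to zero. Do not reduce the length. */
        p = contents + rel->r_offset;
        endp = p + len -1;
        memset (p, 0x80, len);
        *(endp) = 0;
        /* write_uleb128 is assumed to return a pointer just past the last
           byte written, so p ends up on the last byte of the new value. */
        p = write_uleb128 (p, value) - 1;
        /* If the new value is shorter, set its continuation bit so that
           decoding runs on through the zero padding up to endp. */
        if (p < endp)
          *p |= 0x80;
        return bfd_reloc_ok;
      }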

At first, I wondered how we can preserve the size of the uleb128 by adding only one leading zero (0x80). It turns out to be reasonable, so I am recording the reason here in case someone has the same question. Consider the following example,

.text
foo:
... code can be relaxed ...
bar:
.uleb128 bar - foo

The value of bar - foo can be written as "x * (2^7)^y" bytes. If x and y are both 1, we need one leading zero (0x80) for any code size reduction. However, to need more than one leading zero after relaxation, we would have to relax away at least (1 - (1/2^7)) = (1 - (1/128)) = 99.22% of the code size. I think this is impossible with the current relaxations. Besides, we have considered the large alignment cases for all relaxations, so a large alignment won't cause a large code reduction. Therefore, adding one leading zero (0x80) is enough to cover all possible uleb128 subtraction cases.

@kito-cheng
Collaborator

Hi @kuanlinchentw:
Some questions about this new relocation:

  • What's the main purpose of this relocation? I assume it is for code size reduction, right?
  • Which sections will use this relocation in the future?
  • Do we need to support it in glibc/the Linux kernel for the dynamic linker/kernel module loader? Or is it enough to implement it in the assembler and ld.bfd + lld?

@kuanlinchentw
Author

  • Assuming we want to maintain two-byte alignment in unlinked object files, we can't handle alignment with R_RISCV_ALIGN unless we know the byte alignment of the
  • It's possible to produce invalid linker output with the current patch set, which happens for something like .uleb128 SYMBOL (ie, no subtraction).

I think the form .uleb128 SYMBOL (ie, no subtraction) is a misunderstanding.
It's impossible to know the final address of SYMBOL at assembly time.
I tried x86 and ARM, and both of them just fill in the assembly-time address and leave no relocation.
Therefore, my patch only tries to handle .uleb128 "subtraction".

@kuanlinchentw
Author

The assumption that the uleb128 address size can only decrease with relaxation is true only if both addresses are in the same section. If we are subtracting a text section address from a data section address, then that value could increase due to alignment padding between sections/segments. So I think these relocs should only be allowed when the symbols are in the same section. I think the binutils patch should be fixed to enforce this.

I agree with this. I'll fix it in the patch.

@kuanlinchentw
Author

  • What's the main purpose of this relocation, I assume is for code size reduction, right?

Yes, the original motivation for ULEB128 is code size reduction.

  • Which section will use this relocation in future?

I think .gcc_except_table and some DWARF sections use ULEB128 most often.

  • Does we need support that on glibc/linux kernel for dynamic linker/kernel module loader? Or all we need is implement in assembler and ld.bfd + lld?

It only needs to be implemented in the assembler and ld.bfd + lld, because the subtraction value is fixed after linking.

@kito-cheng
Collaborator

@asb FYI https://sourceware.org/ml/binutils/2019-12/msg00024.html, another patch, about the alignment between code and data in the same section.

@MaskRay
Collaborator

MaskRay commented Sep 25, 2020

Apologies for making a comment without reading all previous discussions. I am a bit concerned about a relocation type without a clear width. See http://sourceware.org/PR4029 for an example where an oscillating .uleb128 caused a bug. I haven't really read through Alan Modra's workaround, but I think it is about overaligning the label immediately preceding the Type Tables. It works because type tables are accessed by adding a negative offset to TTBase.
