From b1adeb02293bc73d87d38eb26b360198371af1c2 Mon Sep 17 00:00:00 2001
From: Kito Cheng
Date: Mon, 30 Aug 2021 14:39:35 +0800
Subject: [PATCH] Add mapping symbol

- What is a Mapping Symbol?

This proposal adds a new special symbol class, called mapping symbols.
These symbols help a disassembler understand a binary better; in
particular, they let it distinguish code regions from data regions.

There are two kinds of mapping symbol, data and instruction, and both
may carry an optional extra part in the symbol name that conveys
additional information.

Symbol Name | Meaning
:---------- | :-----------------------------------------------------------
$d          | Start of a sequence of data.
$x          | Start of a sequence of instructions.
$x<ISA>     | Start of a sequence of instructions with <ISA> extension.

Mapping symbols are also used by other ISAs for the same purpose, such as
ARM, AArch64, C-SKY and nds32.

- Data Mapping Symbol

A data mapping symbol may carry extra length information to preserve the
original data layout, e.g.

```
.foo:
    .word 10
    .word 20
```

Without mapping symbol:

```
$ riscv64-unknown-elf-gcc foo.s -c
$ riscv64-unknown-elf-objdump -d foo.o

Disassembly of section .text:

0000000000000000 <.foo>:
   0:   000a        c.slli  zero,0x2
   2:   0000        unimp
   4:   0014        0x14
        ...
```

With mapping symbol:

```
Disassembly of section .text:

00000000 <.foo>: # $d inserted here.
   0:   0000000a    .word 0x0000000a
   4:   00000014    .word 0x00000014
```

- Instruction Mapping Symbol

An instruction mapping symbol with extra ISA information is also useful
for ifunc. For example, a library is built with `rv64gc`, but a few
functions such as memcpy provide two versions, one built with `rv64gc`
and one built with `rv64gcv`, and the ifunc mechanism selects between
them at run time. However, the arch attribute records only the minimal
execution environment requirement, so the ISA information from the arch
attribute is not enough for a disassembler to disassemble the `rv64gcv`
version correctly.
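
As a rough illustration of how a disassembler might consume mapping
symbols, the sketch below splits one section into data and instruction
regions. The `regions` helper, its tuple layout, the `default_isa`
parameter and the sample ISA string are all hypothetical and not part of
this proposal; it also assumes the optional ISA string is appended
directly to `$x`.

```python
def regions(symbols, section_size, default_isa):
    """Split a section into (start, end, kind, isa) regions.

    symbols: iterable of (offset, name) pairs for one section.
    default_isa: ISA string taken from the arch attribute (assumed input).
    Bytes before the first mapping symbol are not covered by this sketch.
    """
    # Keep only mapping symbols, ordered by offset.
    marks = sorted((off, name) for off, name in symbols
                   if name == "$d" or name.startswith("$x"))
    out = []
    for i, (off, name) in enumerate(marks):
        # A region runs until the next mapping symbol or the section end.
        end = marks[i + 1][0] if i + 1 < len(marks) else section_size
        if end == off:
            continue  # zero-length region, nothing to decode
        if name == "$d":
            out.append((off, end, "data", None))
        else:
            # A bare "$x" falls back to the ISA from the arch attribute.
            isa = name[2:] or default_isa
            out.append((off, end, "code", isa))
    return out


# Illustrative symbol list; ".foo" is an ordinary symbol and is ignored.
syms = [(0, "$d"), (8, "$x"), (0, ".foo"), (40, "$xrv64i2p0_v1p0")]
for region in regions(syms, 64, "rv64gc"):
    print(region)
```

A real disassembler would read the symbols from the ELF symbol table and
would also have to decide how to treat bytes that precede the first
mapping symbol; both are left out here.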
---
 riscv-elf.adoc | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/riscv-elf.adoc b/riscv-elf.adoc
index 06eff4b1..95ae9fde 100644
--- a/riscv-elf.adoc
+++ b/riscv-elf.adoc
@@ -1164,6 +1164,44 @@ Merge policy:::
 The linker should report errors if object files of different privileged
 specification versions are merged.
 
+
+=== Mapping Symbol
+
+A section can contain a mixture of code and data, or code using different ISAs.
+A number of symbols, named mapping symbols, describe the boundaries.
+
+[%autowidth]
+|===
+| Symbol Name | Meaning
+| $d          | Start of a sequence of data.
+| $x          | Start of a sequence of instructions.
+| $x<ISA>     | Start of a sequence of instructions with <ISA> extension.
+|===
+
+A mapping symbol should have its type set to `STT_NOTYPE`, its binding set to
+`STB_LOCAL`, and its size set to zero.
+
+The mapping symbol for data (`$d`) indicates the start of a sequence of data bytes.
+
+The mapping symbol for instructions (`$x`) indicates the start of a sequence of
+instructions.
+It has an optional ISA string, which means that the following code region uses
+an ISA different from the ISA recorded in the arch attribute;
+that ISA information is used until the next instruction mapping symbol;
+an instruction mapping symbol without an ISA string means using the ISA
+configuration from the ELF attribute.
+
+The format and rules of the optional ISA string are the same as for `Tag_RISCV_arch`,
+and it must have explicit versions; for more detailed rules, please refer to <>.
+
+NOTE: One use case for an instruction mapping symbol (`$x`) with ISA information
+is ifunc, e.g. a library is built with `rv64gc`, but a few functions
+like memcpy provide two versions, one built with `rv64gc` and one built with
+`rv64gcv`, selected by the ifunc mechanism at run time; however, the arch
+attribute records the minimal execution environment requirements, so the
+ISA information from the arch attribute is not enough for the disassembler to
+disassemble the `rv64gcv` version correctly.
+
 == Linker Relaxation
 
 At link time, when all the memory objects have been resolved, the code sequence