From b74bc8dc35ec65d7696cb699b96283437edb1f51 Mon Sep 17 00:00:00 2001
From: Kito Cheng
Date: Mon, 30 Aug 2021 14:39:35 +0800
Subject: [PATCH] Add mapping symbol

- What is Mapping Symbol?

This proposal adds a new special symbol class, the mapping symbol. Mapping
symbols help a disassembler understand a binary better; they can be used to
distinguish code regions from data regions.

There are two kinds of mapping symbol, data and instruction, and both may
carry an optional extra part in the symbol name that holds additional
information.

Symbol Name | Meaning
:---------- | :-----------------------------------------------------------
$d          | Start of a sequence of data.
$x          | Start of a sequence of instructions.
$x<ISA>     | Start of a sequence of instructions with <ISA> extension.

Mapping symbols are also used by other ISAs for the same purpose, such as
ARM, AArch64, C-SKY and nds32.

- Data Mapping Symbol

A data mapping symbol may carry extra length information to preserve the
original data layout.

e.g.
```
.foo:
    .word 10
    .word 20
```

Without mapping symbol:
```
$ riscv64-unknown-elf-gcc foo.s -c
$ riscv64-unknown-elf-objdump -d foo.o

Disassembly of section .text:

0000000000000000 <.foo>:
   0:   000a        c.slli  zero,0x2
   2:   0000        unimp
   4:   0014        0x14
        ...
```

With mapping symbol:
```
Disassembly of section .text:

00000000 <.foo>: # $d inserted here.
   0:   0000000a    .word   0x0000000a
   4:   00000014    .word   0x00000014
```

- Instruction Mapping Symbol

An instruction mapping symbol with extra ISA information can also be used
for ifunc. For example, a library is built with `rv64gc`, but a few functions
such as memcpy provide two versions, one built with `rv64gc` and one built
with `rv64gcv`, and the ifunc mechanism selects between them at run time.
However, the arch attribute records the minimal execution environment
requirement, so the ISA information from the arch attribute is not enough
for a disassembler to disassemble the `rv64gcv` version correctly.
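
- Illustration (not part of the specification)

As a rough illustration of how a disassembler might consume these symbols,
here is a minimal Python sketch. It is not taken from any toolchain: the
function name `classify_regions`, its parameters, and the example ISA
strings are all hypothetical. It assumes the ISA string is appended directly
after `$x`, and it ignores the optional length information on data mapping
symbols.

```python
def classify_regions(symbols, section_end, default_isa):
    """Split [first mapping symbol, section_end) into data/code regions.

    symbols: iterable of (address, name) pairs from one section.
    Returns a list of (start, end, kind, isa) tuples.
    """
    # Keep only mapping symbols; ordinary symbols such as function names
    # do not change how bytes are decoded.
    marks = sorted(
        (addr, name) for addr, name in symbols
        if name == "$d" or name.startswith("$x")
    )
    regions = []
    for i, (addr, name) in enumerate(marks):
        # A region runs until the next mapping symbol or the section end.
        end = marks[i + 1][0] if i + 1 < len(marks) else section_end
        if end <= addr:
            continue  # zero-length region, nothing to decode
        if name == "$d":
            regions.append((addr, end, "data", None))
        else:
            # Assumption: the ISA string directly follows "$x"; a bare "$x"
            # falls back to the ISA from the ELF arch attribute.
            regions.append((addr, end, "code", name[2:] or default_isa))
    return regions


BASE = "rv64i2p0_m2p0_a2p0_f2p0_d2p0_c2p0"  # illustrative arch attribute
VEC = BASE + "_v1p0"                         # illustrative ISA string with V

symbols = [
    (0x00, "memcpy"), (0x00, "$x"), (0x20, "$d"),
    (0x28, "$x" + VEC), (0x60, "$x"),
]
for region in classify_regions(symbols, 0x80, BASE):
    print(region)
```

In this sketch the region starting at 0x28 is decoded with the vector ISA
string, and the bare `$x` at 0x60 switches back to the ISA from the arch
attribute, which mirrors the ifunc use case described above.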
---
 riscv-elf.adoc | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/riscv-elf.adoc b/riscv-elf.adoc
index e99176e1..1f7d966c 100644
--- a/riscv-elf.adoc
+++ b/riscv-elf.adoc
@@ -760,6 +760,43 @@ Tag_RISCV_priv_spec contains the major/minor/revision version information of
 the privileged specification. It will report errors if object files of
 different privileged specification versions are merged.
 
+=== Mapping Symbol
+
+Mapping symbols are a special symbol class that helps a disassembler
+understand a binary better; they can be used to distinguish code regions
+from data regions.
+
+[%autowidth]
+|===
+| Symbol Name | Meaning
+| $d | Start of a sequence of data.
+| $x | Start of a sequence of instructions.
+| $x<ISA> | Start of a sequence of instructions with <ISA> extension.
+|===
+
+A data mapping symbol (`$d`) means the following region is data.
+
+An instruction mapping symbol (`$x`) means the following region is instructions.
+It has an optional ISA string, which means the following code region uses an
+ISA different from the one recorded in the arch attribute; that ISA information
+is used until the next instruction mapping symbol. An instruction mapping symbol
+without an ISA string means the ISA configuration from the ELF attribute is used.
+
+The format and rules of the optional ISA string are the same as for Tag_RISCV_arch,
+and it must have explicit versions; for more detailed rules, please refer to <>.
+
+NOTE: The use case for an instruction mapping symbol (`$x`) with ISA information
+is ifunc, e.g. a library is built with `rv64gc`, but a few functions
+like memcpy provide two versions, one built with `rv64gc` and one built with
+`rv64gcv`, selected by the ifunc mechanism at run time; however, the arch
+attribute records the minimal execution environment requirement, so the ISA
+information from the arch attribute isn't enough for a disassembler to disassemble the
+`rv64gcv` version correctly.
+
+NOTE: For toolchain implementations, the linker is permitted to merge adjacent
+mapping symbols if they are of exactly the same type, and strip tools are permitted to
+strip mapping symbols.
+
 == Code relaxation
 
 At link time, when all the memory objects have been resolved, the code sequence