Add mapping symbol
- What is a Mapping Symbol?

This proposal adds a new special symbol class, called mapping symbols.
These symbols help a disassembler understand a binary better; in particular,
they can be used to distinguish code regions from data regions.

There are two kinds of mapping symbol: data and instruction. Both can
carry an optional extra part in the symbol name to convey additional information.

Symbol Name | Meaning
:---------- | :-----------------------------------------------------------
$d          | Start of a sequence of data.
$x          | Start of a sequence of instructions.
$x<ISA>     | Start of a sequence of instructions with <ISA> extension.

Mapping symbols are also used by other ISAs for the same purpose, such as ARM,
AArch64, C-sky and nds32.

- Data Mapping Symbol

A data mapping symbol can carry extra length information to preserve
the original data layout.

e.g.
```
.foo:
        .word 10
        .word 20
```

Without mapping symbol:
```
$ riscv64-unknown-elf-gcc foo.s -c
$ riscv64-unknown-elf-objdump -d foo.o

Disassembly of section .text:

0000000000000000 <.foo>:
   0:   000a                    c.slli  zero,0x2
   2:   0000                    unimp
   4:   0014                    0x14
        ...
```

With mapping symbol:
```
Disassembly of section .text:

00000000 <.foo>:                                      # $d inserted here.
   0:   0000000a        .word   0x0000000a
   4:   00000014        .word   0x00000014
```
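
The difference between the two listings can be reproduced with a small sketch. It assumes the standard RISC-V length encoding (a 16-bit parcel whose two low bits are not `11` is a compressed instruction) and ignores encodings longer than 32 bits; `split_as_code` and `split_as_data` are hypothetical helper names, not part of any tool.

```python
import struct

# The bytes emitted by `.word 10` and `.word 20` (little-endian).
data = struct.pack("<2I", 10, 20)


def split_as_code(buf):
    """Walk the buffer as an instruction stream, returning (offset, length)."""
    parcels = []
    offset = 0
    while offset + 2 <= len(buf):
        (low,) = struct.unpack_from("<H", buf, offset)
        # Low two bits == 0b11 means a 32-bit instruction, otherwise 16-bit.
        length = 4 if (low & 0b11) == 0b11 else 2
        if offset + length > len(buf):
            break
        parcels.append((offset, length))
        offset += length
    return parcels


def split_as_data(buf):
    """Walk the buffer as 32-bit words, which is what `$d` tells the tool to do."""
    return [(off, struct.unpack_from("<I", buf, off)[0])
            for off in range(0, len(buf), 4)]


print(split_as_code(data))
print(split_as_data(data))
```

Treated as code, the eight bytes split into 16-bit parcels at offsets 0, 2, 4 and 6, matching the first listing; treated as data, they split into the two original words.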

- Instruction Mapping Symbol

An instruction mapping symbol with extra ISA information can also be used with
ifunc. For example, a library is built with `rv64gc`, but a few functions
such as memcpy provide two versions, one built with `rv64gc` and one built with
`rv64gcv`, and the ifunc mechanism selects between them at run time. However,
the arch attribute records the minimal execution environment requirement, so
the ISA information from the arch attribute is not enough for a disassembler to
disassemble the `rv64gcv` version correctly.
kito-cheng committed Mar 14, 2023
1 parent 3d9938a commit b1adeb0
Showing 1 changed file with 38 additions and 0 deletions.
38 changes: 38 additions & 0 deletions riscv-elf.adoc
@@ -1164,6 +1164,44 @@ Merge policy:::
The linker should report errors if object files of different privileged
specification versions are merged.


=== Mapping Symbol

A section can contain a mixture of code and data, or code using different ISAs.
A number of symbols, named mapping symbols, describe the boundaries.

[%autowidth]
|===
| Symbol Name | Meaning
| $d | Start of a sequence of data.
| $x | Start of a sequence of instructions.
| $x<ISA> | Start of a sequence of instructions with <ISA> extension.
|===

A mapping symbol should have its type set to `STT_NOTYPE`, its binding set to
`STB_LOCAL`, and its size set to zero.
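
The constraints above can be checked mechanically. The following is a minimal sketch, not taken from any toolchain: the `Symbol` record and `is_mapping_symbol` are hypothetical names, and only the constant values (`STT_NOTYPE` and `STB_LOCAL` are both 0 in the generic ELF specification) come from a standard.

```python
from dataclasses import dataclass

# Values defined by the generic ELF specification.
STT_NOTYPE = 0
STB_LOCAL = 0
STT_FUNC = 2
STB_GLOBAL = 1


@dataclass(frozen=True)
class Symbol:
    """Hypothetical, simplified view of an ELF symbol table entry."""
    name: str
    value: int
    size: int
    st_type: int
    st_bind: int


def is_mapping_symbol(sym):
    """Return True if `sym` has a mapping symbol name and the required fields."""
    # `$d` marks data; `$x` optionally followed by an ISA string marks code.
    has_mapping_name = sym.name == "$d" or sym.name.startswith("$x")
    return (has_mapping_name
            and sym.st_type == STT_NOTYPE
            and sym.st_bind == STB_LOCAL
            and sym.size == 0)


print(is_mapping_symbol(Symbol("$d", 0, 0, STT_NOTYPE, STB_LOCAL)))
print(is_mapping_symbol(Symbol("memcpy", 0, 64, STT_FUNC, STB_GLOBAL)))
```

The sketch does not validate the ISA string that may follow `$x`; it only applies the name prefix and the three field constraints.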

The mapping symbol for data (`$d`) indicates the start of a sequence of data bytes.

The mapping symbol for instructions (`$x`) indicates the start of a sequence of
instructions.
It has an optional ISA string, which means that the following code region uses
an ISA different from the one recorded in the arch attribute.
The ISA information applies until the next instruction mapping symbol.
An instruction mapping symbol without an ISA string means that the code uses
the ISA configuration from the ELF attribute.

The format and rules of the optional ISA string are the same as for
`Tag_RISCV_arch`, and it must include explicit versions; for more detailed
rules, please refer to <<Attributes>>.
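
As an illustration of how a disassembler might consume these symbols, the sketch below turns a list of mapping symbols into regions and resolves the ISA for each code region. `classify_regions` is a hypothetical helper, the `(offset, name)` input shape is an assumption, and the ISA strings are illustrative only rather than complete arch strings.

```python
def classify_regions(mapping_symbols, section_size, default_isa):
    """Split a section into (start, end, kind, isa) regions.

    mapping_symbols: iterable of (offset, name) pairs within one section.
    default_isa: the ISA string recorded in the arch attribute.
    """
    ordered = sorted(mapping_symbols)
    regions = []
    for i, (start, name) in enumerate(ordered):
        # A region runs until the next mapping symbol or the end of the section.
        end = ordered[i + 1][0] if i + 1 < len(ordered) else section_size
        if start == end:
            continue  # Empty region, nothing to decode.
        if name == "$d":
            regions.append((start, end, "data", None))
        elif name.startswith("$x"):
            # A bare `$x` falls back to the ISA from the ELF attribute.
            isa = name[2:] or default_isa
            regions.append((start, end, "code", isa))
    return regions


symbols = [(0, "$x"), (8, "$d"), (16, "$xrv64i2p1_v1p0")]
for region in classify_regions(symbols, 24, "rv64i2p1"):
    print(region)
```

Each symbol is treated here as opening a region that ends at the next mapping symbol, which is a simplification; a real tool would also parse and validate the ISA string according to the `Tag_RISCV_arch` rules.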

NOTE: The use case for the instruction mapping symbol (`$x`) with ISA
information is ifunc. For example, libraries are built with `rv64gc`, but a few
functions such as memcpy provide two versions, one built with `rv64gc` and one
built with `rv64gcv`, and the ifunc mechanism selects between them at run time.
However, the arch attribute records the minimal execution environment
requirements, so the ISA information from the arch attribute is not enough for
the disassembler to disassemble the `rv64gcv` version correctly.

== Linker Relaxation

At link time, when all the memory objects have been resolved, the code sequence