Proposal: Add mapping symbol #196

Merged
merged 1 commit into from
Mar 14, 2023

Conversation

kito-cheng
Collaborator

@kito-cheng kito-cheng commented Jul 8, 2021

Revision history:

Date        Comment
2021/07/22  Remove $d<N>; add rule for $x<ISA>.
2021/07/09  Add an implementation section covering implementation-related details.

What is Mapping Symbol

This proposal adds a new special symbol class, called mapping symbols.
These symbols help a disassembler understand the layout of a binary;
in particular, they can be used to distinguish code regions from data regions.

There are two kinds of mapping symbol, data and instruction, and both
may have an optional extra part in the symbol name that carries extra information.

Symbol Name  Meaning
$d           Start of a sequence of data.
$x           Start of a sequence of instructions.
$x<ISA>      Start of a sequence of instructions with <ISA> extension.

Mapping symbols are also used by other ISAs for the same purpose, such as ARM, AArch64,
C-sky and nds32.
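
As a rough illustration of the table above, a tool could classify symbol names as in the following Python sketch. The function name `classify_mapping_symbol` and the returned tuple layout are hypothetical, and the sketch assumes the `<ISA>` text follows `$x` directly with no separator; it is not taken from any existing tool.

```python
from typing import Optional, Tuple


def classify_mapping_symbol(name: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (kind, isa) for a mapping symbol name, or None otherwise.

    kind is "data" or "code"; isa is the text after "$x", if any.
    Assumption: the ISA string follows "$x" directly (e.g. "$xrv64gcv").
    """
    if name == "$d":
        return ("data", None)
    if name == "$x":
        return ("code", None)
    if name.startswith("$x"):
        # Everything after the "$x" prefix is treated as the ISA string.
        return ("code", name[2:])
    return None  # An ordinary symbol, not a mapping symbol.


if __name__ == "__main__":
    for sym in ["$d", "$x", "$xrv64gcv", "memcpy"]:
        print(sym, classify_mapping_symbol(sym))
```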

Data Mapping Symbol

A data mapping symbol can carry extra length information to represent
the original data layout.

For example:

.foo:
	.word 10
	.byte 1
	.word 20
	.byte 2

Without mapping symbol:

$ riscv64-unknown-elf-gcc foo.s -c
$ riscv64-unknown-elf-objdump -d foo.o

Disassembly of section .text:

0000000000000000 <.foo>:
   0:   000a                    c.slli  zero,0x2
   2:   0000                    unimp
   4:   1401                    addi    s0,s0,-32
   6:   0000                    unimp
   8:   0200                    addi    s0,sp,256

With mapping symbol:

Disassembly of section .text:

00000000 <.foo>:                                      # $d insert here.
   0:   0000000a        .word   0x0000000a
   4:   00001401        .word   0x00001401
   8:   Address 0x0000000000000008 is out of bounds.

Instruction Mapping Symbol

An instruction mapping symbol with extra ISA information is also useful for
ifunc. For example, a library is built with rv64gc, but a few functions
such as memcpy provide two versions, one built with rv64gc and one built with
rv64gcv, and the ifunc mechanism selects between them at run time. However, the arch
attribute records only the minimal execution environment requirement, so the ISA
information from the arch attribute is not enough for a disassembler to disassemble the
rv64gcv version correctly.

Implementation

Implementation notes for each component of the toolchain:

  • Mapping symbols are emitted automatically by the assembler.
  • The disassembler takes mapping symbols as hints to gain better knowledge of the original layout.
  • The compiler does not need to be aware of the existence of mapping symbols.
  • The linker needs no extra mechanism for handling mapping symbols, but it is permitted to merge adjacent mapping symbols if they are exactly the same type.
  • Strip tools are permitted to strip mapping symbols.
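
The merge that the linker is permitted to perform could be sketched as follows in Python. This is only an illustration under stated assumptions: mapping symbols are modelled as hypothetical `(address, name)` pairs within one section, and "exactly the same type" is interpreted as having the same name.

```python
from typing import List, Tuple

MappingSymbol = Tuple[int, str]  # (address, name); a hypothetical model


def merge_adjacent(symbols: List[MappingSymbol]) -> List[MappingSymbol]:
    """Drop a mapping symbol when the previous kept one has the same name.

    A repeated "$x" after "$x" (or "$d" after "$d") does not change the
    state seen by a disassembler, so it carries no extra information.
    """
    merged: List[MappingSymbol] = []
    for address, name in sorted(symbols):
        if merged and merged[-1][1] == name:
            continue  # Same type as the previous symbol: redundant.
        merged.append((address, name))
    return merged


if __name__ == "__main__":
    section = [(0, "$x"), (4, "$x"), (8, "$d"), (12, "$d"), (16, "$x")]
    print(merge_adjacent(section))
```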

@Nelson1225
Collaborator

Nelson1225 commented Jul 9, 2021

Hi Guys,

This PR LGTM, thanks for Kito's help.

The v2 proposed patch in binutils is as follows,
https://sourceware.org/pipermail/binutils/2021-July/117316.html

However, compared to the current implementation, there are some issues that might be worth discussing.

  • $d<N>, where N is the data length.

The length is attached only when the data is emitted by cons_worker. The possible directives are:
.byte, .hword, .int, .long, .octa, .quad, .short, .word, .2byte, .4byte, .8byte, .dc, .dc.a, .dc.b, .dc.l, .dc.w, .rva.

For the .fill data directive, or rs_align, I still use a $d without a data length to mark the region, since it is hard to estimate the length when adding the mapping symbols. Besides, the initial purpose is to display the code as written by users more faithfully; .fill is itself a directive that represents a data region, so adding the data length to $d seems redundant.

Therefore, it would be great if we could add $d<N>, where N is 1, 2, 4, or 8.

  • $a1 for odd byte code alignment, $a for a region of code alignment.

Consider the following case, which is compiled by the old toolchain without the mapping symbols.
$ cat tmp.s
.text
.option norvc
.option norelax
.byte 1
.align 3
nop
$ riscv64-unknown-elf-as tmp.s -o tmp.o
$ riscv64-unknown-elf-objdump -d tmp.o
0000000000000000 <.text>:
0: 0001 nop
2: 0001 nop
4: 00000013 nop
8: 00000013 nop
c: 00000013 nop

There are two problems here:

  1. The .byte plus part of the .align 3 padding is decoded as a c.nop instruction.
  2. c.nop is illegal since .option norvc is used.

But if we add mapping symbols for this case, we get a friendlier result:
0000000000000000 <.text>:
0: 01 .byte 0x01
1: 00 align.byte
2: 0001 align.nop
4: 00000013 align.nop
8: 00000013 nop
c: 00000013 align.nop

$a1 is added at address 0x1, which is the start of the .align 3 padding, and $a is then added at 0x1 + 0x1 = 0x2. If we don't add $a1, we get the following result:
0000000000000000 <.text>:
0: 01 .byte 0x01
1: 0100 align.addi s0,sp,128
3: 1300 align.addi s0,sp,416
5: 0000 align.unimp
7: 1300 align.addi s0,sp,416
9: 0000 unimp
b: 1300 addi s0,sp,416
d: 0000 align.unimp

It seems that objdump decodes the alignment byte as part of an addi instruction. However, we could also improve objdump so that it dumps the correct result without the $a1 mapping symbol. I believe this is feasible, but using $a1 is quite easy and does no harm, at least to me.

  • objdump -D makes mapping symbols useless for ARM/AArch64.

The documentation for -D says that on ARM platforms -D should disassemble instructions. Therefore, they dump data by searching mapping symbols only when DISASSEMBLE_DATA is set. As a result, the mapping symbols are useless for -D. I am not sure whether any RISC-V document mentions this, so I prefer to keep them useful, but only for text sections, and to dump other sections as data.

@jrtc27
Collaborator

jrtc27 commented Jul 9, 2021

align.byte and align.nop are not valid mnemonics/directives. There is no need for special markers for alignment bytes, just fix the disassembler to correctly handle:

$d:
    .byte 1
$x:
    .byte half_of_a_c_nop(?)
    c.nop

i.e. don't try to disassemble instructions that are at odd addresses after data regions (or anything other than 0 mod 4 if RVC isn't present).
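
The alignment rule suggested above can be written down as a tiny check. The Python sketch below is only an illustration; the function name `may_decode_instruction` and the `has_rvc` parameter are hypothetical, and the rule encoded is the one stated in the comment: instructions start at 2-byte boundaries when RVC is present and at 4-byte boundaries otherwise.

```python
def may_decode_instruction(address: int, has_rvc: bool) -> bool:
    """Return True if an instruction may start at this address.

    With RVC, instructions are 2-byte aligned; without it they are
    4-byte aligned. Bytes at other addresses after a data region are
    padding and should not be decoded as instructions.
    """
    alignment = 2 if has_rvc else 4
    return address % alignment == 0


if __name__ == "__main__":
    # Address 1 follows a single .byte: never a valid instruction start.
    print(may_decode_instruction(1, has_rvc=True))
    print(may_decode_instruction(2, has_rvc=True))
    print(may_decode_instruction(2, has_rvc=False))
```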

riscv-elf.md Outdated
the original data layout.

Mapping symbol for instruction(`$x`) means following region are instructions,
and it has an optional ISA string, which means following code region are using
Collaborator

You need to state what the rules are for the ISA string; is it anything valid for -march= (ie default versions, implied extensions, etc all supported) or is it like the .RISCV.attributes section where it must be the full exploded string?

Collaborator

Below it says that the string must be same as Tag_RISCV_arch.

@kito-cheng
Collaborator Author

kito-cheng commented Jul 11, 2021

objdump -D makes mapping symbols useless for ARM/AArch64.
The documentation for -D says that on ARM platforms -D should disassemble instructions. Therefore, they dump data by searching mapping symbols only when DISASSEMBLE_DATA is set. As a result, the mapping symbols are useless for -D. I am not sure whether any RISC-V document mentions this, so I prefer to keep them useful, but only for text sections, and to dump other sections as data.

I would prefer not to write down implementation behavior; a mapping symbol is a hint, and how to interpret it depends on the implementation.

@MaskRay
Collaborator

MaskRay commented Jul 11, 2021

I do not see a use case for $d<N> yet.

I have some experience with the llvm-objdump and llvm-symbolizer. llvm-symbolizer just excludes mapping symbols. llvm-objdump adds mapping symbols of the current section into a sorted vector and performs a binary search. $d<N> will be ignored anyway. If you don't ignore $d<N>, you will need additional verification complexity I don't know will be justified.

A larger problem with $d<N> is that you need to know the data length beforehand. This is often cumbersome for a streamer style assembler (both LLVM MC and GNU as).


I cannot find $d.<suffix> $x.<suffix> $t.<suffix> usage in a hand-written assembly file other than lld/gas testsuites. I think it is fine not to define it for now.
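
The llvm-objdump lookup described above (mapping symbols of the current section kept in a sorted vector and queried by binary search) might look roughly like the following Python sketch. The function name `state_at`, the `(address, name)` tuple layout, and the default state for addresses before the first symbol are assumptions for illustration, not LLVM's actual code.

```python
import bisect
from typing import List, Tuple


def state_at(mapping: List[Tuple[int, str]], address: int,
             default: str = "code") -> str:
    """Return "data" or "code" for an address within one section.

    mapping must be sorted by address. The governing symbol is the last
    one at or before the queried address. Any suffix after "$d" is
    ignored, mirroring the remark that $d<N> would be ignored anyway.
    """
    addresses = [a for a, _ in mapping]
    index = bisect.bisect_right(addresses, address) - 1
    if index < 0:
        return default  # No mapping symbol precedes this address.
    return "data" if mapping[index][1].startswith("$d") else "code"


if __name__ == "__main__":
    symbols = [(0, "$d"), (8, "$x"), (16, "$d")]
    for addr in (0, 4, 8, 12, 20):
        print(hex(addr), state_at(symbols, addr))
```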

@Nelson1225
Collaborator

Thanks for your suggestions, @jrtc27 @MaskRay and @kito-cheng, I have sent v3 series of patches as follows,
https://sourceware.org/pipermail/binutils/2021-July/117348.html

Compared to v2,

  • Removed the data mapping symbols with data size.

  • Removed the alignment mapping symbols, $a and $a1.

  • If the alignment has an odd number of bytes of space and
    relaxation is disabled, we usually leave an odd byte 0x00 at
    the start of the alignment, and then fill the remaining
    space with nops. Therefore, add a $d mapping symbol for the odd
    byte, and then add a $x at $d + 0x1. This behavior is
    the same as on Arm and AArch64.

Thanks
Nelson
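
The odd-byte rule in the last bullet above can be sketched in Python as follows. This is a hypothetical model rather than the binutils code: the function name `alignment_mapping_symbols` and its arguments are made up for illustration, and the handling of even-sized padding (a single $x at the start) is an assumption that the comment does not spell out.

```python
from typing import List, Tuple


def alignment_mapping_symbols(start: int, pad_bytes: int) -> List[Tuple[int, str]]:
    """Return (address, name) mapping symbols for an alignment region.

    If the padding has an odd size, the first byte (0x00) is marked as
    data with $d and the nop fill that follows is marked with $x at
    $d + 0x1. Even-sized padding is assumed to need only a $x.
    """
    if pad_bytes <= 0:
        return []
    if pad_bytes % 2 == 1:
        return [(start, "$d"), (start + 1, "$x")]
    return [(start, "$x")]  # Assumption: not stated in the comment.


if __name__ == "__main__":
    # .byte 1 at address 0, then .align 3: padding covers 0x1..0x7.
    print(alignment_mapping_symbols(start=1, pad_bytes=7))
```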

@kito-cheng
Collaborator Author

Updates:

  • Remove $d<N>
  • Add rule for $x<ISA>

Collaborator

@jim-wilson jim-wilson left a comment

Looks OK to me. Could use a little English improvement, but that can wait until enough people are OK with it. No need to rewrite the English multiple times.

@kito-cheng
Collaborator Author

Changes:

  • Mark use case part as NOTE.
  • Add one more note about toolchain implementation, allow linker to merge mapping symbol and allow strip tool to strip mapping symbol.

@kito-cheng
Collaborator Author

Changes:

  • Rebase to master

@kito-cheng
Collaborator Author

Changes:

  • Rebase to master

@kito-cheng
Collaborator Author

@jrtc27 @MaskRay could you take a look? :)

@kito-cheng
Collaborator Author

Changes:

  • Rebase
  • Allow an optional suffix uniquifier string for mapping symbol name.
  • Revise wording per @MaskRay's suggestion

@kito-cheng
Collaborator Author

Changes:

  • Rebase
  • Revise description of mapping symbol

a4lg added a commit to a4lg/binutils-gdb that referenced this pull request Aug 11, 2022
The mapping symbols with ISA string are proposed to deal with the so-called
"ifunc issue".  They enable disassembling a certain range of the code with
a different architecture than the rest, even if conflicting.  This is useful
when an "optimized" implementation is available but is dynamically
selected only if a certain extension is available.

This commit implements the assembler support to emit mapping symbols with
ISA string (and partial disassembler support only to pass tests).

[1] Proposal: Extend .option directive for control enabled extensions on
specific code region,
riscv-non-isa/riscv-asm-manual#67

[2] Proposal: Add mapping symbol,
riscv-non-isa/riscv-elf-psabi-doc#196

This commit is based on Nelson Chu's proposal "RISC-V: Output mapping
symbols with ISA string once .option arch is used." but heavily modified to
reflect the intent of Kito's original proposal.  It is also made smarter so
that it no longer requires MAP_INSN_ARCH.

gas/ChangeLog:

	* config/tc-riscv.c (struct riscv_set_options): Add new field
	`arch_is_default' to keep track of whether the architecture is
	holding the default value.
	(updated_riscv_subsets): New variable to keep track of whether the
	architecture is possibly changed and inspected to emit proper
	mapping symbols.
	(make_mapping_symbol): Make mapping symbols with ISA string if
	necessary.  Don't emit the mapping symbol if the previous one in the
	same section has the same name.
	(riscv_elf_section_change_hook): New.  Try to emit a new mapping
	symbol if the section is changed.
	(riscv_mapping_state): Don't skip if the architecture is possibly
	changed and the new state is "code".
	(s_riscv_option): Keep track of `updated_riscv_subsets' and
	`riscv_opts.arch_is_default'.
	* config/tc-riscv.h (md_elf_section_change_hook): Define as
	`riscv_elf_section_change_hook'.
	(riscv_elf_section_change_hook): Declare.
	* testsuite/gas/riscv/mapping-01a.d: Reflect mapping symbols with
	ISA string.
	* testsuite/gas/riscv/mapping-02a.d: Likewise.
	* testsuite/gas/riscv/mapping-03a.d: Likewise.
	* testsuite/gas/riscv/mapping-04a.d: Likewise.
	* testsuite/gas/riscv/mapping-norelax-03a.d: Likewise.
	* testsuite/gas/riscv/mapping-norelax-04a.d: Likewise.

opcodes/ChangeLog:

	* riscv-dis.c (riscv_get_map_state): Minimum support of mapping
	symbols with ISA string without actually parsing the ISA string.
	The only purpose of this change is to pass the tests.
a4lg added a commit to a4lg/binutils-gdb that referenced this pull request Aug 11, 2022
The mapping symbols with ISA string are proposed to deal with the so-called
"ifunc issue".  They enable disassembling a certain range of the code with
a different architecture than the rest, even if conflicting.  This is useful
when an "optimized" implementation is available but is dynamically
selected only if a certain extension is available.

This commit implements the disassembler support to parse mapping symbols
with ISA string.

[1] Proposal: Extend .option directive for control enabled extensions on
specific code region,
riscv-non-isa/riscv-asm-manual#67

[2] Proposal: Add mapping symbol,
riscv-non-isa/riscv-elf-psabi-doc#196

This commit is based on Nelson Chu's proposal "RISC-V: Output mapping
symbols with ISA string once .option arch is used." but heavily modified to
reflect the intent of Kito's original proposal.  It is also made smarter so
that it no longer requires MAP_INSN_ARCH.

gas/ChangeLog:

	* testsuite/gas/riscv/option-arch-01a.d: Reflect the disassembler
	support of mapping symbols with ISA string.

opcodes/ChangeLog:

	* riscv-dis.c (initial_default_arch): Default architecture string if
	no ELF attributes are available.
	(default_arch): A copy of the default architecture string.
	(is_arch_mapping): New variable to keep track of whether the current
	architecture is derived from a mapping symbol.
	(riscv_disassemble_insn): Update FPR names when a mapping symbol
	with ISA string is encountered.
	(riscv_get_map_state): Support mapping symbols with ISA string.
	Use `is_arch_mapping' to stop repeatedly parsing the default
	architecture.
	(riscv_get_disassembler): Safer architecture string handling.
	Copy the string to switch to the default while disassembling.
a4lg added a commit to a4lg/binutils-gdb that referenced this pull request Aug 11, 2022
The mapping symbols with ISA string is proposed to deal with so called
"ifunc issue".  It enables disassembling a certain range of the code with
a different architecture than the rest, even if conflicting.  This is useful
when there's "optimized" implementation is available but dynamically
switched only if a certain extension is available.

This commit implements the assembler support to emit mapping symbols with
ISA string (and partial disassembler support only to pass tests).

[1] Proposal: Extend .option directive for control enabled extensions on
specific code region,
riscv-non-isa/riscv-asm-manual#67

[2] Proposal: Add mapping symbol,
riscv-non-isa/riscv-elf-psabi-doc#196

This commit is based on Nelson Chu's proposal "RISC-V: Output mapping
symbols with ISA string once .option arch is used." but heavily modified to
reflect the intent of Kito's original proposal.  It is also made smarter so
that it no longer requires MAP_INSN_ARCH.

gas/ChangeLog:

	* config/tc-riscv.c (struct riscv_set_options): Add new field
	`arch_is_default' to keep track of whether the architecture is
	holding the default value.
	(updated_riscv_subsets) New variable to keep track of whether the
	architecture is possibly changed and inspected to emit proper
	mapping symbols.
	(make_mapping_symbol): Make mapping symbols with ISA string if
	necessary.  Don't emit the mapping symbol if the previous one in the
	same section has the same name.
	(riscv_elf_section_change_hook): New.  Try to emit a new mapping
	symbol if the section is changed.
	(riscv_mapping_state): Don't skip if the architecture is possibly
	changed and the new state is "code".
	(s_riscv_option): Keep track of `updated_riscv_subsets' and
	`riscv_opts.arch_is_default'.
	* config/tc-riscv.h (md_elf_section_change_hook): Define as
	`riscv_elf_section_change_hook'.
	(riscv_elf_section_change_hook): Declare.
	* testsuite/gas/riscv/mapping-01a.d: Reflect mapping symbols with
	ISA string.
	* testsuite/gas/riscv/mapping-02a.d: Likewise.
	* testsuite/gas/riscv/mapping-03a.d: Likewise.
	* testsuite/gas/riscv/mapping-04a.d: Likewise.
	* testsuite/gas/riscv/mapping-norelax-03a.d: Likewise.
	* testsuite/gas/riscv/mapping-norelax-04a.d: Likewise.

opcodes/ChangeLog:

	* riscv-dis.c (riscv_get_map_state): Minimum support of mapping
	symbols with ISA string without actually parsing the ISA string.
	The only purpose of this change is to pass the tests.
a4lg added a commit to a4lg/binutils-gdb that referenced this pull request Aug 11, 2022
The mapping symbols with ISA string is proposed to deal with so called
"ifunc issue".  It enables disassembling a certain range of the code with
a different architecture than the rest, even if conflicting.  This is useful
when there's "optimized" implementation is available but dynamically
switched only if a certain extension is available.

This commit implements the disassembler support to parse mapping symbols
with ISA string.

[1] Proposal: Extend .option directive for control enabled extensions on
specific code region,
riscv-non-isa/riscv-asm-manual#67

[2] Proposal: Add mapping symbol,
riscv-non-isa/riscv-elf-psabi-doc#196

This commit is based on Nelson Chu's proposal "RISC-V: Output mapping
symbols with ISA string once .option arch is used." but heavily modified to
reflect the intent of Kito's original proposal.  It is also made smarter so
that it no longer requires MAP_INSN_ARCH.

gas/ChangeLog:

	* testsuite/gas/riscv/option-arch-01a.d: Reflect the disassembler
	support of mapping symbols with ISA string.

opcodes/ChangeLog:

	* riscv-dis.c (initial_default_arch) Default architecture string if
	no ELF attributes are available.
	(default_arch): A copy of the default architecture string.
	(is_arch_mapping): New variable to keep track of whether the current
	architecture is deviced from a mapping symbol.
	(riscv_disassemble_insn): Update FPR names when a mapping symbol
	with ISA string is encountered.
	(riscv_get_map_state): Support mapping symbols with ISA string.
	Use `is_arch_mapping' to stop repeatedly parsing the default
	architecture.
	(riscv_get_disassembler): Safer architecture string handling.
	Copy the string to switch to the default while disassembling.
a4lg added a commit to a4lg/binutils-gdb that referenced this pull request Aug 11, 2022
The mapping symbols with ISA string is proposed to deal with so called
"ifunc issue".  It enables disassembling a certain range of the code with
a different architecture than the rest, even if conflicting.  This is useful
when there's "optimized" implementation is available but dynamically
switched only if a certain extension is available.

This commit implements the assembler support to emit mapping symbols with
ISA string (and partial disassembler support only to pass tests).

[1] Proposal: Extend .option directive for control enabled extensions on
specific code region,
riscv-non-isa/riscv-asm-manual#67

[2] Proposal: Add mapping symbol,
riscv-non-isa/riscv-elf-psabi-doc#196

This commit is based on Nelson Chu's proposal "RISC-V: Output mapping
symbols with ISA string once .option arch is used." but heavily modified to
reflect the intent of Kito's original proposal.  It is also made smarter so
that it no longer requires MAP_INSN_ARCH.

gas/ChangeLog:

	* config/tc-riscv.c (struct riscv_set_options): Add new field
	`arch_is_default' to keep track of whether the architecture is
	holding the default value.
	(updated_riscv_subsets) New variable to keep track of whether the
	architecture is possibly changed and inspected to emit proper
	mapping symbols.
	(make_mapping_symbol): Make mapping symbols with ISA string if
	necessary.  Don't emit the mapping symbol if the previous one in the
	same section has the same name.
	(riscv_elf_section_change_hook): New.  Try to emit a new mapping
	symbol if the section is changed.
	(riscv_mapping_state): Don't skip if the architecture is possibly
	changed and the new state is "code".
	(s_riscv_option): Keep track of `updated_riscv_subsets' and
	`riscv_opts.arch_is_default'.
	* config/tc-riscv.h (md_elf_section_change_hook): Define as
	`riscv_elf_section_change_hook'.
	(riscv_elf_section_change_hook): Declare.
	* testsuite/gas/riscv/mapping-01a.d: Reflect mapping symbols with
	ISA string.
	* testsuite/gas/riscv/mapping-02a.d: Likewise.
	* testsuite/gas/riscv/mapping-03a.d: Likewise.
	* testsuite/gas/riscv/mapping-04a.d: Likewise.
	* testsuite/gas/riscv/mapping-norelax-03a.d: Likewise.
	* testsuite/gas/riscv/mapping-norelax-04a.d: Likewise.

opcodes/ChangeLog:

	* riscv-dis.c (riscv_get_map_state): Minimum support of mapping
	symbols with ISA string without actually parsing the ISA string.
	The only purpose of this change is to pass the tests.
a4lg added a commit to a4lg/binutils-gdb that referenced this pull request Aug 11, 2022
The mapping symbols with ISA string is proposed to deal with so called
"ifunc issue".  It enables disassembling a certain range of the code with
a different architecture than the rest, even if conflicting.  This is useful
when there's "optimized" implementation is available but dynamically
switched only if a certain extension is available.

This commit implements the disassembler support to parse mapping symbols
with ISA string.

[1] Proposal: Extend .option directive for control enabled extensions on
specific code region,
riscv-non-isa/riscv-asm-manual#67

[2] Proposal: Add mapping symbol,
riscv-non-isa/riscv-elf-psabi-doc#196

This commit is based on Nelson Chu's proposal "RISC-V: Output mapping
symbols with ISA string once .option arch is used." but heavily modified to
reflect the intent of Kito's original proposal.  It is also made smarter so
that it no longer requires MAP_INSN_ARCH.

gas/ChangeLog:

	* testsuite/gas/riscv/option-arch-01a.d: Reflect the disassembler
	support of mapping symbols with ISA string.

opcodes/ChangeLog:

	* riscv-dis.c (initial_default_arch) Default architecture string if
	no ELF attributes are available.
	(default_arch): A copy of the default architecture string.
	(is_arch_mapping): New variable to keep track of whether the current
	architecture is deviced from a mapping symbol.
	(riscv_disassemble_insn): Update FPR names when a mapping symbol
	with ISA string is encountered.
	(riscv_get_map_state): Support mapping symbols with ISA string.
	Use `is_arch_mapping' to stop repeatedly parsing the default
	architecture.
	(riscv_get_disassembler): Safer architecture string handling.
	Copy the string to switch to the default while disassembling.
a4lg added a commit to a4lg/binutils-gdb that referenced this pull request Aug 11, 2022
The mapping symbols with ISA string is proposed to deal with so called
"ifunc issue".  It enables disassembling a certain range of the code with
a different architecture than the rest, even if conflicting.  This is useful
when there's "optimized" implementation is available but dynamically
switched only if a certain extension is available.

This commit implements the assembler support to emit mapping symbols with
ISA string (and partial disassembler support only to pass tests).

[1] Proposal: Extend .option directive for control enabled extensions on
specific code region,
riscv-non-isa/riscv-asm-manual#67

[2] Proposal: Add mapping symbol,
riscv-non-isa/riscv-elf-psabi-doc#196

This commit is based on Nelson Chu's proposal "RISC-V: Output mapping
symbols with ISA string once .option arch is used." but heavily modified to
reflect the intent of Kito's original proposal.  It is also made smarter so
that it no longer requires MAP_INSN_ARCH.

gas/ChangeLog:

	* config/tc-riscv.c (struct riscv_set_options): Add new field
	`arch_is_default' to keep track of whether the architecture is
	holding the default value.
	(updated_riscv_subsets) New variable to keep track of whether the
	architecture is possibly changed and inspected to emit proper
	mapping symbols.
	(make_mapping_symbol): Make mapping symbols with ISA string if
	necessary.  Don't emit the mapping symbol if the previous one in the
	same section has the same name.
	(riscv_elf_section_change_hook): New.  Try to emit a new mapping
	symbol if the section is changed.
	(riscv_mapping_state): Don't skip if the architecture is possibly
	changed and the new state is "code".
	(s_riscv_option): Keep track of `updated_riscv_subsets' and
	`riscv_opts.arch_is_default'.
	* config/tc-riscv.h (md_elf_section_change_hook): Define as
	`riscv_elf_section_change_hook'.
	(riscv_elf_section_change_hook): Declare.
	* testsuite/gas/riscv/mapping-01a.d: Reflect mapping symbols with
	ISA string.
	* testsuite/gas/riscv/mapping-02a.d: Likewise.
	* testsuite/gas/riscv/mapping-03a.d: Likewise.
	* testsuite/gas/riscv/mapping-04a.d: Likewise.
	* testsuite/gas/riscv/mapping-norelax-03a.d: Likewise.
	* testsuite/gas/riscv/mapping-norelax-04a.d: Likewise.

opcodes/ChangeLog:

	* riscv-dis.c (riscv_get_map_state): Minimum support of mapping
	symbols with ISA string without actually parsing the ISA string.
	The only purpose of this change is to pass the tests.
a4lg added a commit to a4lg/binutils-gdb that referenced this pull request Aug 11, 2022
The mapping symbols with ISA string is proposed to deal with so called
"ifunc issue".  It enables disassembling a certain range of the code with
a different architecture than the rest, even if conflicting.  This is useful
when there's "optimized" implementation is available but dynamically
switched only if a certain extension is available.

This commit implements the disassembler support to parse mapping symbols
with ISA string.

[1] Proposal: Extend .option directive for control enabled extensions on
specific code region,
riscv-non-isa/riscv-asm-manual#67

[2] Proposal: Add mapping symbol,
riscv-non-isa/riscv-elf-psabi-doc#196

This commit is based on Nelson Chu's proposal "RISC-V: Output mapping
symbols with ISA string once .option arch is used." but heavily modified to
reflect the intent of Kito's original proposal.  It is also made smarter so
that it no longer requires MAP_INSN_ARCH.

gas/ChangeLog:

	* testsuite/gas/riscv/option-arch-01a.d: Reflect the disassembler
	support of mapping symbols with ISA string.

opcodes/ChangeLog:

	* riscv-dis.c (initial_default_arch): Default architecture string if
	no ELF attributes are available.
	(default_arch): A copy of the default architecture string.
	(is_arch_mapping): New variable to keep track of whether the current
	architecture is derived from a mapping symbol.
	(riscv_disassemble_insn): Update FPR names when a mapping symbol
	with ISA string is encountered.
	(riscv_get_map_state): Support mapping symbols with ISA string.
	Use `is_arch_mapping' to stop repeatedly parsing the default
	architecture.
	(riscv_get_disassembler): Safer architecture string handling.
	Copy the string to switch to the default while disassembling.
a4lg added a commit to a4lg/binutils-gdb that referenced this pull request Aug 11, 2022
Mapping symbols with an ISA string are proposed to deal with the so-called
"ifunc issue".  They enable disassembling a certain range of the code with
a different architecture than the rest, even if conflicting.  This is useful
when an "optimized" implementation is available but is dynamically
selected only if a certain extension is available.

This commit implements the assembler support to emit mapping symbols with
ISA string (and partial disassembler support only to pass tests).

[1] Proposal: Extend .option directive for control enabled extensions on
specific code region,
riscv-non-isa/riscv-asm-manual#67

[2] Proposal: Add mapping symbol,
riscv-non-isa/riscv-elf-psabi-doc#196

This commit is based on Nelson Chu's proposal "RISC-V: Output mapping
symbols with ISA string once .option arch is used." but heavily modified to
reflect the intent of Kito's original proposal.  It is also made smarter so
that it no longer requires MAP_INSN_ARCH.

gas/ChangeLog:

	* config/tc-riscv.c (struct riscv_set_options): Add new field
	`arch_is_default' to keep track of whether the architecture is
	holding the default value.
	(updated_riscv_subsets): New variable to keep track of whether the
	architecture is possibly changed and inspected to emit proper
	mapping symbols.
	(make_mapping_symbol): Make mapping symbols with ISA string if
	necessary.  Don't emit the mapping symbol if the previous one in the
	same section has the same name.
	(riscv_elf_section_change_hook): New.  Try to emit a new mapping
	symbol if the section is changed.
	(riscv_mapping_state): Don't skip if the architecture is possibly
	changed and the new state is "code".
	(s_riscv_option): Keep track of `updated_riscv_subsets' and
	`riscv_opts.arch_is_default'.
	* config/tc-riscv.h (md_elf_section_change_hook): Define as
	`riscv_elf_section_change_hook'.
	(riscv_elf_section_change_hook): Declare.
	* testsuite/gas/riscv/mapping-01a.d: Reflect mapping symbols with
	ISA string.
	* testsuite/gas/riscv/mapping-02a.d: Likewise.
	* testsuite/gas/riscv/mapping-03a.d: Likewise.
	* testsuite/gas/riscv/mapping-04a.d: Likewise.
	* testsuite/gas/riscv/mapping-norelax-03a.d: Likewise.
	* testsuite/gas/riscv/mapping-norelax-04a.d: Likewise.

opcodes/ChangeLog:

	* riscv-dis.c (riscv_get_map_state): Minimum support of mapping
	symbols with ISA string without actually parsing the ISA string.
	The only purpose of this change is to pass the tests.
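
The `make_mapping_symbol` entry above describes a suppression rule: a mapping symbol is not emitted when the previous one in the same section has the same name. A minimal sketch of that rule follows, in Python rather than the C used by binutils. The class `MappingEmitter`, its fields, and the way the ISA string is appended to `$x` are illustrative assumptions, not the actual gas implementation.

```python
class MappingEmitter:
    """Hypothetical model of per-section mapping symbol emission."""

    def __init__(self):
        self.last_name = {}   # section name -> last mapping symbol emitted there
        self.emitted = []     # (section, offset, symbol name) in emission order

    def emit(self, section, offset, kind, isa=None):
        # Assumption: the ISA string, when present, is appended directly to "$x".
        name = "$d" if kind == "data" else "$x" + (isa or "")
        if self.last_name.get(section) == name:
            return False      # same name as the previous symbol: skip it
        self.last_name[section] = name
        self.emitted.append((section, offset, name))
        return True
```

Because the last name is tracked per section, switching to another section and back does not lose the state of the first section in this sketch.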
wangliu-iscas pushed a commit to wangliu-iscas/binutils-gdb that referenced this pull request Sep 30, 2022
RISC-V Psabi pr196,
riscv-non-isa/riscv-elf-psabi-doc#196

bfd/
    * elfxx-riscv.c (riscv_release_subset_list): Free arch_str if needed.
    (riscv_copy_subset_list): Copy arch_str as well.
    * elfxx-riscv.h (riscv_subset_list_t): Store arch_str for each subset list.
gas/
    * config/tc-riscv.c (riscv_reset_subsets_list_arch_str): Update the
    architecture string in the subset_list.
    (riscv_set_arch): Call riscv_reset_subsets_list_arch_str after parsing new
    architecture string.
    (s_riscv_option): Likewise.
    (need_arch_map_symbol): New boolean, used to indicate if .option
    directives do affect instructions.
    (make_mapping_symbol): New boolean parameter reset_seg_arch_str.  Need to
    generate $x+arch for MAP_INSN, and then store it into tc_segment_info_data
    if reset_seg_arch_str is true.
    (riscv_mapping_state): Decide if we need to add $x+arch for MAP_INSN.  For
    now, only add $x+arch if the architecture strings in subset list and segment
    are different.  Besides, always add $x+arch at the start of section, and do
    not add $x+arch for code alignment, since rvc for alignment can be judged
    from addend of R_RISCV_ALIGN.
    (riscv_remove_mapping_symbol): If current and previous mapping symbol have
    same value, then remove the current $x only if the previous is $x+arch;
    Otherwise, always remove previous.
    (riscv_add_odd_padding_symbol): Updated.
    (riscv_check_mapping_symbols): Don't need to add any $x+arch if
    need_arch_map_symbol is false, so changed them to $x.
    (riscv_frag_align_code): Updated since riscv_mapping_state is changed.
    (riscv_init_frag): Likewise.
    (s_riscv_insn): Likewise.
    (riscv_elf_final_processing): Call riscv_release_subset_list to release
    riscv_subsets, rather than only release arch_str in the riscv_write_out_attrs.
    (riscv_write_out_attrs): No need to call riscv_arch_str, just get arch_str
    from riscv_subsets.
    * config/tc-riscv.h (riscv_segment_info_type): Record current $x+arch mapping
    symbol of each segment.

    * testsuite/gas/riscv/mapping-0*: Merged and replaced by mapping.s.
    * testsuite/gas/riscv/mapping.s: New testcase, to test most of the cases in
    one file.
    * testsuite/gas/riscv/mapping-symbols.d: Likewise.
    * testsuite/gas/riscv/mapping-dis.d: Likewise.
    * testsuite/gas/riscv/mapping-non-arch.s: New testcase for the case that
    does not need any $x+arch.
    * testsuite/gas/riscv/mapping-non-arch.d: Likewise.
    * testsuite/gas/riscv/option-arch-01a.d: Updated.
opcodes/
    * riscv-dis.c (riscv_disassemble_insn): Set riscv_fpr_names back to
    riscv_fpr_names_abi or riscv_fpr_names_numeric when zfinx is disabled
    for some specific code region.
    (riscv_get_map_state): Recognize mapping symbols $x+arch, and then reset
    the architecture string once the ISA is different.
saagarjha pushed a commit to ahjragaas/binutils-gdb that referenced this pull request Oct 28, 2022
RISC-V Psabi pr196,
riscv-non-isa/riscv-elf-psabi-doc#196

bfd/
    * elfxx-riscv.c (riscv_release_subset_list): Free arch_str if needed.
    (riscv_copy_subset_list): Copy arch_str as well.
    * elfxx-riscv.h (riscv_subset_list_t): Store arch_str for each subset list.
gas/
    * config/tc-riscv.c (riscv_reset_subsets_list_arch_str): Update the
    architecture string in the subset_list.
    (riscv_set_arch): Call riscv_reset_subsets_list_arch_str after parsing new
    architecture string.
    (s_riscv_option): Likewise.
    (need_arch_map_symbol): New boolean, used to indicate if .option
    directives do affect instructions.
    (make_mapping_symbol): New boolean parameter reset_seg_arch_str.  Need to
    generate $x+arch for MAP_INSN, and then store it into tc_segment_info_data
    if reset_seg_arch_str is true.
    (riscv_mapping_state): Decide if we need to add $x+arch for MAP_INSN.  For
    now, only add $x+arch if the architecture strings in subset list and segment
    are different.  Besides, always add $x+arch at the start of section, and do
    not add $x+arch for code alignment, since rvc for alignment can be judged
    from addend of R_RISCV_ALIGN.
    (riscv_remove_mapping_symbol): If current and previous mapping symbol have
    same value, then remove the current $x only if the previous is $x+arch;
    Otherwise, always remove previous.
    (riscv_add_odd_padding_symbol): Updated.
    (riscv_check_mapping_symbols): Don't need to add any $x+arch if
    need_arch_map_symbol is false, so changed them to $x.
    (riscv_frag_align_code): Updated since riscv_mapping_state is changed.
    (riscv_init_frag): Likewise.
    (s_riscv_insn): Likewise.
    (riscv_elf_final_processing): Call riscv_release_subset_list to release
    subset_list of riscv_rps_as, rather than only release arch_str in the
    riscv_write_out_attrs.
    (riscv_write_out_attrs): No need to call riscv_arch_str, just get arch_str
    from subset_list of riscv_rps_as.
    * config/tc-riscv.h (riscv_segment_info_type): Record current $x+arch mapping
    symbol of each segment.
    * testsuite/gas/riscv/mapping-0*: Merged and replaced by mapping.s.
    * testsuite/gas/riscv/mapping.s: New testcase, to test most of the cases in
    one file.
    * testsuite/gas/riscv/mapping-symbols.d: Likewise.
    * testsuite/gas/riscv/mapping-dis.d: Likewise.
    * testsuite/gas/riscv/mapping-non-arch.s: New testcase for the case that
    does not need any $x+arch.
    * testsuite/gas/riscv/mapping-non-arch.d: Likewise.
    * testsuite/gas/riscv/option-arch-01a.d: Updated.
opcodes/
    * riscv-dis.c (riscv_disassemble_insn): Set riscv_fpr_names back to
    riscv_fpr_names_abi or riscv_fpr_names_numeric when zfinx is disabled
    for some specific code region.
    (riscv_get_map_state): Recognize mapping symbols $x+arch, and then reset
    the architecture string once the ISA is different.
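
The `riscv_remove_mapping_symbol` entry above states a tie-break rule for two mapping symbols with the same value (address): the current `$x` is removed only if the previous symbol is `$x+arch`, otherwise the previous one is removed. One reading of that rule, as a Python sketch: the helper name `resolve_same_address` is hypothetical, and treating "`$x` followed by more characters" as `$x+arch` is an assumption made for illustration.

```python
def resolve_same_address(prev_name, cur_name):
    """Return the mapping symbol name kept when two share one address."""
    # Assumption: "$x" followed by anything is a "$x+arch" symbol.
    prev_has_arch = prev_name.startswith("$x") and len(prev_name) > 2
    if cur_name == "$x" and prev_has_arch:
        return prev_name   # drop the current bare $x, keep the ISA information
    return cur_name        # in every other case the previous symbol is dropped
```

The effect is that a bare `$x` never overwrites ISA information already recorded at the same address.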
@kito-cheng
Collaborator Author

Changes:

  • Rebase
  • Wording and grammar tweaks.

@kito-cheng
Collaborator Author

Mapping symbol support has landed on the binutils side, and a patch is under review on the LLVM side (https://reviews.llvm.org/D137417).

@MaskRay Would you mind giving your blessing to this PR? :)

@kito-cheng
Collaborator Author

Changes:

  • Rebase.
  • Mention that mapping symbols have type STT_NOTYPE, binding STB_LOCAL, and size zero.
  • Wording tweaks.
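
The symbol attributes listed in the changes above (type STT_NOTYPE, binding STB_LOCAL, size zero) can be checked against an ELF symbol table entry as sketched below. The function name `looks_like_mapping_symbol` is hypothetical, and the name test is a simplification that accepts anything starting with `$x`. The constants and the `st_info` layout (binding in the high four bits, type in the low four) follow the generic ELF specification.

```python
STB_LOCAL = 0    # ELF symbol binding: local
STT_NOTYPE = 0   # ELF symbol type: unspecified

def looks_like_mapping_symbol(name, st_info, st_size):
    """Check the name and ELF attributes expected of a mapping symbol."""
    bind = st_info >> 4      # high nibble of st_info holds the binding
    typ = st_info & 0xF      # low nibble of st_info holds the type
    name_ok = name == "$d" or name.startswith("$x")
    return name_ok and bind == STB_LOCAL and typ == STT_NOTYPE and st_size == 0
```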

@kito-cheng
Collaborator Author

Got approval from the LLD and binutils maintainers, so I'm going to merge this :)

- What is Mapping Symbol?

This proposal adds a new special symbol class, called mapping symbols.
These symbols help a disassembler understand the layout of a binary by
distinguishing code regions from data regions.

There are two kinds of mapping symbol: data and instruction. Both may
carry an optional extra part in the symbol name that conveys extra information.

Symbol Name | Meaning
:---------- | :-----------------------------------------------------------
$d          | Start of a sequence of data.
$x          | Start of a sequence of instructions.
$x<ISA>     | Start of a sequence of instructions with <ISA> extension.

Mapping symbols are also used by other ISAs for the same purpose, such as ARM,
AArch64, C-sky and nds32.
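
As a rough illustration of the table above, the following sketch classifies a symbol name as one of the three mapping symbol forms. The names `MappingSymbol` and `parse_mapping_symbol` are hypothetical, and treating everything after `$x` as an unvalidated ISA string is an assumption. A real tool would also parse and check that string.

```python
from typing import NamedTuple, Optional

class MappingSymbol(NamedTuple):
    kind: str             # "data" or "insn"
    isa: Optional[str]    # ISA string carried by $x<ISA>, otherwise None

def parse_mapping_symbol(name: str) -> Optional[MappingSymbol]:
    """Classify a symbol name; return None if it is not a mapping symbol."""
    if name == "$d":
        return MappingSymbol("data", None)
    if name == "$x":
        return MappingSymbol("insn", None)
    if name.startswith("$x"):
        # Assumption: the remainder of the name is the ISA string, unvalidated.
        return MappingSymbol("insn", name[2:])
    return None
```

For example, `parse_mapping_symbol("$xrv64gcv")` yields an instruction mapping symbol whose `isa` field is `"rv64gcv"`, while an ordinary name such as `memcpy` yields `None`.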

- Data Mapping Symbol

A data mapping symbol could carry extra length information to represent
the original data layout.

e.g.
```
.foo:
        .word 10
        .word 20
```

Without mapping symbol:
```
$ riscv64-unknown-elf-gcc foo.s -c
$ riscv64-unknown-elf-objdump -d foo.o

Disassembly of section .text:

0000000000000000 <.foo>:
   0:   000a                    c.slli  zero,0x2
   2:   0000                    unimp
   4:   0014                    0x14
        ...
```

With mapping symbol:
```
Disassembly of section .text:

00000000 <.foo>:                                      # $d inserted here.
   0:   0000000a        .word   0x0000000a
   4:   00000014        .word   0x00000014
```
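
The difference between the two listings comes down to a lookup: for each address, the disassembler finds the closest preceding mapping symbol and decodes the bytes as data or as instructions accordingly. A minimal sketch of that lookup follows. The function `region_kind` and its list-of-tuples input are hypothetical, and defaulting to instructions before the first mapping symbol is an assumption.

```python
import bisect

def region_kind(mapping, addr, default="insn"):
    """Return "data" or "insn" for addr.

    mapping is a list of (address, name) pairs sorted by address,
    where each name is "$d" or starts with "$x".
    """
    addrs = [a for a, _ in mapping]
    i = bisect.bisect_right(addrs, addr) - 1   # last symbol at or before addr
    if i < 0:
        return default                          # no mapping symbol seen yet
    return "data" if mapping[i][1] == "$d" else "insn"
```

With a `$d` at offset 0 of `.foo`, both words of the example fall in a data region and are printed as `.word` values instead of being decoded as compressed instructions.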

- Instruction Mapping Symbol

An instruction mapping symbol with extra ISA information is also useful for
ifunc. For example, a library may be built with `rv64gc`, while a few functions
such as memcpy provide two versions, one built with `rv64gc` and one built with
`rv64gcv`, with the ifunc mechanism selecting between them at run time. However,
the arch attribute records the minimal execution environment requirement, so the
ISA information from the arch attribute is not enough for a disassembler to
disassemble the `rv64gcv` version correctly.
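
To make the ifunc case concrete, the sketch below tracks which ISA string is in effect at a given address. It starts from the arch attribute and switches whenever a `$x<ISA>` symbol is passed. The function `isa_at` is hypothetical, and the choice that a bare `$x` falls back to the arch attribute is an assumption of this sketch.

```python
def isa_at(mapping, addr, default_isa):
    """Return the ISA string in effect at addr.

    mapping is a list of (address, name) pairs sorted by address;
    default_isa stands in for the ISA recorded in the arch attribute.
    """
    isa = default_isa
    for sym_addr, name in mapping:
        if sym_addr > addr:
            break
        if name.startswith("$x"):
            # Assumption: a bare "$x" switches back to the arch attribute.
            isa = name[2:] or default_isa
    return isa
```

For a library whose arch attribute is `rv64gc`, a `$xrv64gcv` symbol at the start of the vector memcpy variant would let the lookup return `rv64gcv` for that function only.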
@kito-cheng kito-cheng merged commit 6cda892 into master Mar 14, 2023
@kito-cheng kito-cheng deleted the mapping-symbol branch March 14, 2023 14:03