
riscv_dis_generics

Tsukasa OI edited this page Aug 30, 2022 · 3 revisions

Disassembler: Implement "generic subsets"

Requires

Issue Solved

Issue

This change fixes disassembly of the zext.h instruction on RV{32,64}_Zbb_Zbkb configurations.

On RV32, zext.h on Zbb is a specialized form of pack (Zbkb). On RV64, zext.h on Zbb is a specialized form of packw (Zbkb).

Note that if the Zbkb extension is disabled, zext.h is a standalone instruction with no generalized form.

When both the Zbb and Zbkb extensions are enabled and a zext.h instruction is disassembled with -M no-aliases, the disassembler should print the non-alias instruction (that is, either pack or packw).
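The "specialized form" relationship can be checked on the encoding itself. The following is a minimal sketch, not code from binutils: the match/mask constants are derived by hand from the RV32 instruction formats, the helper names are hypothetical, and 0x0805c533 is the RV32 encoding of zext.h a0, a1 used in the example on this page.

```python
# Illustrative match/mask pairs derived from the RV32 encodings
# (assumption: written by hand, not copied from the binutils headers).
PACK_MATCH = 0x08004033    # funct7=0000100, funct3=100, opcode=0110011
PACK_MASK = 0xFE00707F     # fixes funct7, funct3 and opcode; rs2 is free
ZEXT_H_MATCH = PACK_MATCH  # same fixed bits as pack ...
ZEXT_H_MASK = 0xFFF0707F   # ... plus rs2 (bits 24:20) fixed to zero


def matches(word: int, match: int, mask: int) -> bool:
    """Return True if the fixed bits of the pattern are present in word."""
    return (word & mask) == match


def is_specialization(spec_match: int, spec_mask: int,
                      gen_match: int, gen_mask: int) -> bool:
    """Return True if every word matching the specialized pattern
    also matches the generic pattern."""
    return ((spec_mask & gen_mask) == gen_mask
            and (spec_match & gen_mask) == gen_match)


WORD = 0x0805C533  # zext.h a0, a1 (equivalently pack a0, a1, zero)
```

With these constants, WORD matches both patterns, and the zext.h pattern is a specialization of the pack pattern but not the other way around.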

# Input source file
_start:
    zext.h a0, a1   # Specialized form on Zbb
# Expected (RV32_Zbb_Zbkb, objdump -d -M no-aliases)
80000028 <_start>:
80000028:	0805c533          	pack	a0,a1,zero

However, the disassembler currently prints the alias.

# Actual (RV32_Zbb_Zbkb, objdump -d -M no-aliases)
80000028 <_start>:
80000028:	0805c533          	zext.h	a0,a1

Note that, depending on -march, zext.h may not be an alias at all (as I noted earlier): with Zbb alone, it is the only instruction that matches.

# Expected/Actual (RV32_Zbb, objdump -d -M no-aliases)
80000028 <_start>:
80000028:	0805c533          	zext.h	a0,a1

In general, this kind of issue occurs when:

  1. There are two instructions, one of which is a specialized form of the other.
  2. Their requirements are different (separate) but can coexist.

Because of 2., neither instruction can be a simple alias of the other (INSN_ALIAS cannot be used). However, for non-alias instructions, riscv_disassemble_insn treats the first match as final and quits too early.

Because zext.h is declared before pack and packw, the generalized forms are never matched for a zext.h encoding.
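The early quit can be illustrated with a small model. This is a sketch under stated assumptions, not the actual riscv_disassemble_insn logic: the table, the Opcode type, and first_match are hypothetical, and only the RV32 pair is modeled.

```python
from typing import NamedTuple, Optional, Set


class Opcode(NamedTuple):
    name: str
    ext: str    # extension required by this entry
    match: int
    mask: int


# Hypothetical, simplified table; zext.h is listed before pack.
TABLE = [
    Opcode("zext.h", "zbb", 0x08004033, 0xFFF0707F),
    Opcode("pack", "zbkb", 0x08004033, 0xFE00707F),
]


def first_match(word: int, enabled: Set[str]) -> Optional[str]:
    """Return the name of the first usable entry that matches word."""
    for op in TABLE:
        if op.ext in enabled and (word & op.mask) == op.match:
            # Early quit: later (more generic) entries are never examined.
            return op.name
    return None
```

In this model, first_match(0x0805C533, {"zbb", "zbkb"}) returns "zext.h" regardless of any no-aliases setting, because pack is never reached; with only "zbkb" enabled it returns "pack".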

Resolution

As a solution, I propose a new concept: generic subsets.

For instructions flagged with INSN_GENERICS, opcode matching does not quit at the first match. Instead, it searches for the matching instruction with the:

  • Longest mask (by default; when -M no-aliases is not specified)
  • Shortest mask (when -M no-aliases is specified)

The length of a mask equals its population count (the number of one bits). The more one bits a mask has, the more specialized the instruction is.

The generic subsets approach fixes the disassembler for the following instructions and configurations:
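The selection rule can be sketched as follows. This is a simplified model rather than the patch itself: the table, the Opcode type, popcount, and select are hypothetical names, every entry in the table is assumed to carry INSN_GENERICS, and only the RV32 pair is modeled.

```python
from typing import NamedTuple, Optional, Set


class Opcode(NamedTuple):
    name: str
    ext: str    # extension required by this entry
    match: int
    mask: int


# Hypothetical table; all entries are assumed to be flagged INSN_GENERICS.
TABLE = [
    Opcode("zext.h", "zbb", 0x08004033, 0xFFF0707F),
    Opcode("pack", "zbkb", 0x08004033, 0xFE00707F),
]


def popcount(value: int) -> int:
    """Number of one bits in value (the 'length' of a mask)."""
    return bin(value).count("1")


def select(word: int, enabled: Set[str], no_aliases: bool) -> Optional[str]:
    """Scan all candidates instead of stopping at the first match."""
    best = None
    for op in TABLE:
        if op.ext not in enabled or (word & op.mask) != op.match:
            continue
        if best is None:
            best = op
        elif no_aliases and popcount(op.mask) < popcount(best.mask):
            best = op  # shortest mask: most generic form
        elif not no_aliases and popcount(op.mask) > popcount(best.mask):
            best = op  # longest mask: most specialized form
    return best.name if best is not None else None
```

For the word 0x0805C533 with both extensions enabled, select returns "zext.h" by default and "pack" when no_aliases is true; with only "zbb" enabled it returns "zext.h" in both modes, since there is no other candidate.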

  • zext.h (Zbb) ⇔ pack (Zbkb; RV32)
  • zext.h (Zbb) ⇔ packw (Zbkb; RV64)

Note that INSN_GENERICS (a new flag introduced in PATCH 1) must be set on both the specialized and the generic forms. In the example above, the following instructions require INSN_GENERICS:

  • zext.h (two XLEN-specific forms)
  • pack
  • packw

This concept can also be applied to the following instruction pairs, where the same issue can occur once the currently non-ratified instructions are ratified.

Implemented        Proposed (not ratified yet)
-----------------  ---------------------------
orc.b (Zbb)        gorci
brev8 (Zbkb)       grevi
zip (Zbkb)         shfli
unzip (Zbkb)       unshfli
rev8 (Zbb/Zbkb)    grevi