Skip to content

Latest commit

 

History

History
163 lines (127 loc) · 7.15 KB

element_groups.adoc

File metadata and controls

163 lines (127 loc) · 7.15 KB

RISC-V Vector Element Groups

Concept - not part of any standard

Some vector instructions treat operands as a vector of one or more element groups, where each element group is a fixed number of elements. For example, complex numbers can be viewed as a two-element group (one real element and one imaginary element). As another example, the SHA-256 cryptographic instructions operate on 128-bit values represented as a 4-element group of 32-bit elements.

This section describes recommendations and terminology for generic instruction set design for vector instructions that operate on element groups.

Note
Element groups effectively replace the EDIV concept that was discussed before ratification of the vector spec, but removed from the ratified spec. Element groups are more flexible, supporting a wider range of group sizes, and avoid having to increase supported SEW for larger groups.

Element Group Size

The element group size (EGS) is the number of elements in one group, and must be a power-of-two (POT).

Note
Support for non-POT EGS was considered but causes many practical complications and so has been dropped. Error checking for vl is a little more difficult. For LMUL>1, non-POT EGSs will result in groups straddling the individual vector registers in a vector register group. Non-POT EGS can also cause large increases in the lowest-common-multiple of element group sizes, which adds constraints to vl setting in order to avoid splitting an element group across stripmine iterations in vector-length-agnostic code.

The element group size is statically encoded in the instruction, often implicitly as part of the opcode.

Note
The vector instructions in the base V vector ISA can be viewed as all having an element group size of 1 for all operands statically encoded in the instruction.
Note
Many operations only make sense with a certain number of elements per group (e.g., complex operations require a element group size of 2 and SHA-256 requires an element group size of 4).
Note
Possible future vector instructions that might operate on variable-sized element groups are not considered here.

Setting vl

Each source and destination operand to a vector instruction might be defined as either a single element group or a vector of element groups. When an operand is a vector of element groups, the vl setting must correspond to an integer multiple of the element group size, with other values of vl raising an illegal instruction exception.

Note
More complex vector instructions could potentially have different non-unit vector lengths for different operands, but these are not considered here.
Note
For example, a SHA-256 instruction would require that vl is a multiple of 4.

When element group instructions are present, an additional constraint is placed on the setting of vl based on an AVL value (Section 6.3 of vector spec 1.0). EGSMAX is the largest EGS supported by the implementation. When AVL > VLMAX, the value of vl must be set to either VLMAX or a positive integer multiple of EGSMAX.

Note
As the base vector extension only has element group size of 1, this constraint is backwards-compatible.
Note
This constraint prevents element groups being broken across stripmining iterations in vector-length-agnostic code when a VLMAX-size vector would otherwise be able to accomodate a whole number of element groups.
Note
When the SEW and LMUL settings cause VLMAX to be smaller than EGSMAX on a certain machine, then vl is set to VLMAX. If an attempt is made to execute an element-group instruction with EGS > vl=VLMAX, an illegal instruction exception will be raised on that instruction. Other element-group instructions with EGS ≤ vl=VLMAX will proceed.
Note
If EEW is encoded statically in the instruction, or if an instruction has multiple operands containing vectors of element groups with different EEW an appropriate SEW must be chosen for vsetvl instructions.
Note
Additional constraints may be required for some element group instructions to ensure legal length values for all operands.

Determining EEW

The vtype SEW can be used to indicate or calculate the effective element size (EEW) of one or more operands of an element group instruction. Where the operand is an element group, SEW and EEW refer to the number of bits in each individual element within a group not the number of bits in the group as a whole.

Alternatively, the opcode might encode EEW of all operands statically and ignore the value of SEW when the operation only makes sense for a single size on each operand.

Note
Many operations are only defined for one EEW, e.g., SHA-256 requires EEW=32. Encoding EEWs statically in the instruction removes a dynamic dependency on the SEW value and the need to check for errors in SEW values. However, ignoring SEW also prevents reuse of the static opcode with a different dynamic SEW, and in many cases, the SEW setting will be needed for regular vector instructions used to process the individual elements in the vector.

Determining EMUL

The vtype LMUL setting can be used to indicate or calculate the effective length multiplier (EMUL) for one or more operands. Element group instructions tend to exhibit a much wider range of relationships between various operand EEW/EMUL values. For example, an instruction might take a vector of length N of 4-element groups with EEW=8b and reduce each group to produce a vector length N of 1-element groups with EEW=32b. In this case, the input and output EMUL values are equal even though the EEW settings differ by a factor of 4.

Each source and destination operand to a vector instruction may have a different element group size, different EMUL, and/or different EEW.

Element Group Width

The element group width (EGW) is the number of bits in the element group as a whole. For example, the proposed SHA-256 instructions operate on an EGW of 128, with EGS=4 and EEW=32. It is possible to use LMUL to concatenate multiple vector registers together to support larger EGW>VLEN.

Note
Implementations may choose not to support instructions with EGW>VLEN due to implementation complexity.
Note
If software using large EGW instructions wants to be portable across a range of implementations, some of which may have VLEN<EGW and hence require LMUL>1, then software can only use a subset of the architectural registers. Profiles can set minimum VLEN requirements to inform authors of portable software.
Note
Element group operations by their nature will gather data from across a wider portion of a vector datapath than regular vector instructions. Some element group instructions might allow temporal execution of individual element operations in a larger group, while others will require all EGW bits of a group to be presented to a functional unit at the same time.

Masking

Some element-group instructions might not support masking. Some element-group instructions might support element-level masking. Other instructions might define mask behavior in terms of a mask element group (e.g., update destination element group if any or all mask bits in mask element group are set, and/or update one or more destination mask values in a mask element group on basis of element group predicate).