Are shuffle's lane indices dynamic? #33
Comments
Duh... What I was missing is that Honestly, I think the spec should be a bit more idiot proof here, by:
|
This makes it clear everywhere that `ImmLaneIdx` is an immediate operand and also makes the nomenclature for immediate operands consistent (both `ImmByte` and `ImmLaneIdx` start with the prefix `Imm`). See WebAssembly#33.
So it appears that So I am re-opening this for clarification. |
@billbudge how does V8 implement these? |
Duh again. So yeah, the lane indices are dynamic, but that's ok because the |
So... no... the https://github.com/WebAssembly/simd/blob/master/proposals/simd/BinarySIMD.md document says that these operands are immediate mode arguments... And that the operands for extract lanes are immediates as well... I'm leaving this open until someone clarifies and the spec is made idiot proof. To check whether I mean, is it me, or is the spec currently unnecessarily ambiguous ? If instead of all that I would just read |
They are immediate mode arguments, and so are the arguments to hardware shuffle operations (at least on some popular platforms). I don't know why the wording in the spec is the way it is, but you can definitely try submitting a PR to correct it :) |
V8's implementation treats the lane index vector as an immediate value. However, there's a strong case for adding a dynamic shuffle, where the shuffle indices are held in a vector register. See #24 |
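To make the distinction in this comment concrete, here is a minimal sketch in Python, used only as an executable model. The function names are hypothetical, and both the two-input byte shuffle semantics (indices 0 to 31 select from the concatenation of the two inputs) and the wrap-around of out-of-range dynamic indices are assumptions for illustration, not something the spec text quoted in this thread confirms.

```python
def make_immediate_shuffle(indices):
    """Model of a shuffle whose 16 byte indices are immediates.

    The indices are known when the instruction is decoded, so they can be
    validated once and the returned function is specialized to them.
    """
    indices = tuple(indices)
    if len(indices) != 16 or any(not 0 <= i < 32 for i in indices):
        raise ValueError("expected 16 byte indices in the range 0..31")

    def shuffle(a, b):
        src = a + b  # 32 source bytes: a first, then b
        return bytes(src[i] for i in indices)

    return shuffle


def dynamic_shuffle(a, b, index_vector):
    """Model of a shuffle whose indices are runtime data (a third vector).

    Nothing is known ahead of time, so every call has to cope with arbitrary
    index values. Wrapping modulo 32 is an assumption made for this sketch.
    """
    src = a + b
    return bytes(src[i % 32] for i in index_vector)


a = bytes(range(16))
b = bytes(range(16, 32))

reverse_a = make_immediate_shuffle(range(15, -1, -1))
reversed_bytes = reverse_a(a, b)
broadcast_last = dynamic_shuffle(a, b, bytes([31] * 16))
```

In the immediate form the index validation happens once, when the shuffle is created; in the dynamic form it is part of every call, which is the cost the issue is worried about.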
For what it's worth, a PR that closes this issue (#34) has been open for a long time. Who should I ping to figure out what's required for it to make progress?
A PR that addresses that (#30) has also been open for a long time without much progress. |
@aretmr @PeterJensen are the SIMD champions, perhaps they have thoughts about this. |
* Rename LaneIdx{2,4,8,16,32} to ImmLaneIdx{2,4,8,16,32}
  This makes it clear everywhere that `ImmLaneIdx` is an immediate operand and also makes the nomenclature for immediate operands consistent (both `ImmByte` and `ImmLaneIdx` start with the prefix `Imm`). See #33.
* further clarify the representation hierarchies and immediate mode operands
* memarg is an immediate mode argument
* reference scalar load/stores spec
* Fix typo: floating-point value to integer-value
* Fix memarg
* Fix language in replace value
* Update "hierarchy" of types paragraph
IIRC this was fixed by #34 , thank you all! |
Reading the spec:
it appears that the lane indices argument for the `shuffle` intrinsic is a non-immediate mode argument.
If my memory doesn't fail me:
LLVM's `shufflevector` requires compile-time indices.
Most architectures do not support shuffling vector lanes of arbitrary width with dynamic indices:
That is, if the user is shuffling a `v8x16`, we can use those instructions directly to shuffle with dynamic indices without issues.
If the user attempts to shuffle a `v16x8`, `v32x4`, etc., what's the best machine code that we can generate for dynamic indices?
If I remember correctly, generating the appropriate sequence of shuffle bytes instructions requires knowing the lane indices during machine code generation. If these aren't known, the only thing a code generator could do AFAICT is fall back to scalar code, that is, copy the vectors to memory, do a scalar shuffle there, e.g. by copying into a third vector, and copy the result of this third vector back to a vector register. This sounds like a huge performance cliff to me.
What am I missing? How can we generate efficient vector shuffles from dynamic lane indices?
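The two code generation paths described above can be sketched as follows. This is a Python model and not real codegen: the helper names are hypothetical, a vector is modelled as 16 bytes, and the byte shuffle is assumed to select from the concatenation of its two inputs. When the lane indices are known ahead of time, they can be expanded once into a byte-level mask for a single byte shuffle. When they are not, the model falls back to copying lane by lane through memory.

```python
def expand_to_byte_mask(lane_indices, lane_bytes):
    """Code-generation-time step: turn per-lane indices into per-byte indices.

    Only possible when lane_indices are known before the code is emitted.
    """
    mask = []
    for idx in lane_indices:
        mask.extend(idx * lane_bytes + offset for offset in range(lane_bytes))
    return mask


def byte_shuffle(a, b, byte_mask):
    """Model of one byte shuffle over the 32 bytes of a followed by b."""
    src = a + b
    return bytes(src[i] for i in byte_mask)


def scalar_fallback(a, b, lane_indices, lane_bytes):
    """Fallback when indices are only known at run time.

    Spill both inputs to memory, copy one lane at a time into a third
    vector, then return that vector.
    """
    memory = bytearray(a + b)
    result = bytearray(16)
    for out_lane, idx in enumerate(lane_indices):
        src_start = idx * lane_bytes
        dst_start = out_lane * lane_bytes
        result[dst_start:dst_start + lane_bytes] = memory[src_start:src_start + lane_bytes]
    return bytes(result)


a = bytes(range(16))
b = bytes(range(16, 32))
lane_indices = [0, 4, 1, 5]  # interleave the low 32-bit lanes of a and b
mask = expand_to_byte_mask(lane_indices, 4)
fast = byte_shuffle(a, b, mask)
slow = scalar_fallback(a, b, lane_indices, 4)
```

Both paths produce the same bytes. The difference is that `expand_to_byte_mask` runs once per instruction when the indices are immediates, while `scalar_fallback` performs its per-lane loop on every execution.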