Are shuffle's lane indices dynamic? #33

gnzlbg · 2018-08-07T15:12:35Z

Reading the spec:

Shuffle lanes

v8x16.shuffle(a: v128, b: v128, s: LaneIdx32[16]) -> v128

it appears that the lane indices argument for the shuffle intrinsic is a non-immediate mode argument.

If my memory doesn't fail me:

LLVM's shufflevector requires compile-time indices.
Most architectures do not support shuffling vector lanes of arbitrary width with dynamic indices:
- x86/x86_64 SSSE3's pshufb (and AVX-512 variants) support shuffling bytes with non-immediate mode indices
- ARM 32/64 vtbl/vtbx family of instructions support shuffling bytes with non-immediate mode indices

That is, if the user is shuffling a v8x16, we can use those instructions directly to shuffle with dynamic indices without issues.

If the user attempts to shuffle a v16x8, v32x4, etc. what's the best machine code that we can generate for dynamic indices?

If I remember correctly, generating the appropriate sequence of shuffle bytes instructions requires knowing the lane indices during machine code generation. If these aren't known, the only thing a code generator could do AFAICT is fall back to scalar code, that is, copy the vectors to memory, do a scalar shuffle there e.g. by copying into a third vector, and copy the result of this third vector back to a vector register. This sounds like a huge performance cliff to me.

What am I missing? How can we generate efficient vector shuffles from dynamic lane indices?

The text was updated successfully, but these errors were encountered:

gnzlbg · 2018-08-07T15:20:01Z

Duh...

What I was missing is that LaneIdx32 is an immediate mode operand.

Honestly, I think the spec should be a bit more idiot proof here, by:

renaming LaneIdx32 to ImmLaneIdx32 (for consistency with ImmByte)
renaming the s argument in v128.shuffle to sImm, imm, or similar

This makes it clear everywhere that `ImmLaneIdx` is an immediate operand and also makes the nomenclature for immediate operands consistent (both `ImmByte` and `ImmLaneIdx` start with the prefix `Imm`) . See WebAssembly#33 .

gnzlbg · 2018-08-08T10:58:27Z

So it appears that LaneIdx are not always immediate mode operands (e.g. in the extract_lanes instructions), which would mean that my original interpretation of the spec is correct, and that the vector shuffles accept dynamic run-time indices.

So I am re-opening this for clarification.

gnzlbg · 2018-08-08T11:05:13Z

@billbudge how does V8 implement these?

gnzlbg · 2018-08-08T11:14:17Z

Duh again. So yeah, the lane indices are dynamic, but that's ok because the shuffle only operates on v8x16 and one can generate efficient code for that on pretty much every architecture.

gnzlbg · 2018-08-08T11:24:34Z

So... no... the https://github.com/WebAssembly/simd/blob/master/proposals/simd/BinarySIMD.md document says that these operands are immediate mode arguments...

And that the operands for extract lanes are immediates as well...

I'm leaving this open until someone clarifies and the spec is made idiot proof. To check whether v8x16 shuffle takes immediate arguments or not, I read the definition in the spec, which doesn't mention it and doesn't call the argument imm, like other places in the spec do. Then I have to jump to the definition of the LaneIdx types, which do not start with Imm_ like ImmBytes do, and then I have to look at the binary encoding doc which does state that these are immediates. . .

I mean, is it me, or is the spec currently unnecessarily ambiguous ?

If instead of all that I would just read v8x16.shuffle(..., imm: ImmLaneIdx[16]) and had a comment next to it saying that imm is an immediate mode operand, the spec would be crystal clear.

penzn · 2018-09-25T02:21:23Z

They are immediate mode arguments, and so are the arguments to hardware shuffle operations (at least on some popular platforms). I don't know why the wording in the spec is the way it is, but you can definitely try submitting a PR to correct it :)

billbudge · 2018-12-12T21:56:34Z

V8's implementation treats the lane index vector as an immediate value. However, there's a strong case to add the dynamic (dynamic shuffle held in a vector register) shuffle. See #24

gnzlbg · 2018-12-13T10:18:55Z

For what it's worth, there has been a PR that closes this issue open for a long time (#34), who should I ping to figure out what's required for it to make progress ?

However, there's a strong case to add the dynamic (dynamic shuffle held in a vector register) shuffle.

There has also been a PR open to address that for a long time and that hasn't seen much progress: #30 .

binji · 2018-12-13T23:50:38Z

who should I ping to figure out what's required for it to make progress ?

@aretmr @PeterJensen are the SIMD champions, perhaps they have thoughts about this.

dtig · 2018-12-14T00:36:36Z

I've left comments on #34, and opened issue #58 to handle triage of the older issues/PRs.

* Rename LaneIdx{2,4,8,16,32} to ImmLaneIdx{2,4,8,16,32} This makes it clear everywhere that `ImmLaneIdx` is an immediate operand and also makes the nomenclature for immediate operands consistent (both `ImmByte` and `ImmLaneIdx` start with the prefix `Imm`) . See #33 . * further clarify the representation hierarchies and immediate mode operands * memarg is an immediate mode argument * reference scalar load/stores spec * Fix typo: floating-point value to integer-value * Fix memarg * Fix language in replace value * Update "hierarchy" of types paragraph

gnzlbg · 2019-03-02T14:15:05Z

IIRC this was fixed by #34 , thank you all!

gnzlbg changed the title ~~shuffle lanes takes dynamic lane indices~~ Are shuffle's lane indices dynamic? Aug 7, 2018

gnzlbg closed this as completed Aug 7, 2018

gnzlbg mentioned this issue Aug 7, 2018

Rename LaneIdx{2,4,8,16,32} to ImmLaneIdx{2,4,8,16,32} #34

Merged

gnzlbg reopened this Aug 8, 2018

gnzlbg closed this as completed Aug 8, 2018

gnzlbg reopened this Aug 8, 2018

gnzlbg closed this as completed Mar 2, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Are shuffle's lane indices dynamic? #33

Are shuffle's lane indices dynamic? #33

gnzlbg commented Aug 7, 2018 •

edited

Loading

gnzlbg commented Aug 7, 2018

gnzlbg commented Aug 8, 2018 •

edited

Loading

gnzlbg commented Aug 8, 2018

gnzlbg commented Aug 8, 2018

gnzlbg commented Aug 8, 2018 •

edited

Loading

penzn commented Sep 25, 2018

billbudge commented Dec 12, 2018

gnzlbg commented Dec 13, 2018

binji commented Dec 13, 2018

dtig commented Dec 14, 2018

gnzlbg commented Mar 2, 2019

Are shuffle's lane indices dynamic? #33

Are shuffle's lane indices dynamic? #33

Comments

gnzlbg commented Aug 7, 2018 • edited Loading

gnzlbg commented Aug 7, 2018

gnzlbg commented Aug 8, 2018 • edited Loading

gnzlbg commented Aug 8, 2018

gnzlbg commented Aug 8, 2018

gnzlbg commented Aug 8, 2018 • edited Loading

penzn commented Sep 25, 2018

billbudge commented Dec 12, 2018

gnzlbg commented Dec 13, 2018

binji commented Dec 13, 2018

dtig commented Dec 14, 2018

gnzlbg commented Mar 2, 2019

gnzlbg commented Aug 7, 2018 •

edited

Loading

gnzlbg commented Aug 8, 2018 •

edited

Loading

gnzlbg commented Aug 8, 2018 •

edited

Loading