Commit
Arm64: Implement support for emulated masked vector loadstores
This is required to support `vmaskmov{ps,pd}` without SVE128. It's pretty gnarly, but these instructions aren't often used, so that's fine from a compatibility perspective.

Example SVE128 implementation:

```json
"vmaskmovps ymm0, ymm1, [rax]": {
  "ExpectedInstructionCount": 9,
  "Comment": [
    "Map 2 0b01 0x2c 256-bit"
  ],
  "ExpectedArm64ASM": [
    "ldr q2, [x28, #32]",
    "mrs x20, nzcv",
    "cmplt p0.s, p6/z, z17.s, #0",
    "ld1w {z16.s}, p0/z, [x4]",
    "add x21, x4, #0x10 (16)",
    "cmplt p0.s, p6/z, z2.s, #0",
    "ld1w {z2.s}, p0/z, [x21]",
    "str q2, [x28, #16]",
    "msr nzcv, x20"
  ]
},
```

Example ASIMD implementation:

```json
"vmaskmovps ymm0, ymm1, [rax]": {
  "ExpectedInstructionCount": 41,
  "Comment": [
    "Map 2 0b01 0x2c 256-bit"
  ],
  "ExpectedArm64ASM": [
    "ldr q2, [x28, #32]",
    "mrs x20, nzcv",
    "movi v0.2d, #0x0",
    "mov x1, x4",
    "mov w0, v17.s[0]",
    "tbz w0, #31, #+0x8",
    "ld1 {v0.s}[0], [x1]",
    "add x1, x1, #0x4 (4)",
    "mov w0, v17.s[1]",
    "tbz w0, #31, #+0x8",
    "ld1 {v0.s}[1], [x1]",
    "add x1, x1, #0x4 (4)",
    "mov w0, v17.s[2]",
    "tbz w0, #31, #+0x8",
    "ld1 {v0.s}[2], [x1]",
    "add x1, x1, #0x4 (4)",
    "mov w0, v17.s[3]",
    "tbz w0, #31, #+0x8",
    "ld1 {v0.s}[3], [x1]",
    "mov v16.16b, v0.16b",
    "add x21, x4, #0x10 (16)",
    "movi v0.2d, #0x0",
    "mov x1, x21",
    "mov w0, v2.s[0]",
    "tbz w0, #31, #+0x8",
    "ld1 {v0.s}[0], [x1]",
    "add x1, x1, #0x4 (4)",
    "mov w0, v2.s[1]",
    "tbz w0, #31, #+0x8",
    "ld1 {v0.s}[1], [x1]",
    "add x1, x1, #0x4 (4)",
    "mov w0, v2.s[2]",
    "tbz w0, #31, #+0x8",
    "ld1 {v0.s}[2], [x1]",
    "add x1, x1, #0x4 (4)",
    "mov w0, v2.s[3]",
    "tbz w0, #31, #+0x8",
    "ld1 {v0.s}[3], [x1]",
    "mov v2.16b, v0.16b",
    "str q2, [x28, #16]",
    "msr nzcv, x20"
  ]
},
```

There's a small improvement still available: the ASIMD implementation doesn't need to touch nzcv, so the `mrs`/`msr` pair could be dropped there. I'll leave that for a future change.
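For readers following the ASIMD sequence, here is a minimal C++ sketch of what one 128-bit half of the emulated masked load computes. It is an illustrative reference model, not code from this commit: the function name `MaskedLoad32x4` and its signature are hypothetical. It mirrors the emitted pattern: zero the destination, test bit 31 of each mask lane, and only touch memory for lanes whose bit is set.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical reference model (not part of the commit): one 128-bit half
// of a masked load with 32-bit elements, as in the vmaskmovps example.
std::array<uint32_t, 4> MaskedLoad32x4(const std::array<uint32_t, 4>& mask,
                                       const uint8_t* addr) {
  // Destination starts zeroed, like `movi v0.2d, #0x0`.
  std::array<uint32_t, 4> result{};

  for (std::size_t lane = 0; lane < 4; ++lane) {
    // Only the sign bit of each mask element matters, like `tbz w0, #31`.
    if (mask[lane] >> 31) {
      // Memory is read only for enabled lanes, like `ld1 {v0.s}[lane], [x1]`.
      std::memcpy(&result[lane], addr + lane * sizeof(uint32_t), sizeof(uint32_t));
    }
    // The address advances by 4 bytes per lane whether or not it was loaded.
  }
  return result;
}
```

The 256-bit form in the example corresponds to running this twice: once at the base address with the low mask half, and once at base + 16 with the high mask half, matching `add x21, x4, #0x10`.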
1 parent 7f74c83, commit ad18bff
Showing 2 changed files with 172 additions and 45 deletions.