[arm64] Addressing mode for vectors #67435

Closed
EgorBo opened this issue Apr 1, 2022 · 1 comment · Fixed by #67490
Labels: area-CodeGen-coreclr, tenet-performance

EgorBo commented Apr 1, 2022

I noticed that we lose some perf in various SpanHelpers on arm64 due to missing addressing modes, which breaks pipelining. Minimal repro:

    Vector128<byte> Add(ref byte b1, ref byte b2, nuint offset) =>
        Vector128.LoadUnsafe(ref b1, offset) + 
        Vector128.LoadUnsafe(ref b2, offset);

Current codegen:

        add     x0, x1, x3
        ld1     {v16.16b}, [x0]
        add     x0, x2, x3
        ld1     {v17.16b}, [x0]
        add     v16.16b, v16.16b, v17.16b
        mov     v0.16b, v16.16b

Expected codegen:

        ldr     q16, [x1, x3]
        ldr     q17, [x2, x3]
        add     v16.16b, v16.16b, v17.16b
        mov     v0.16b, v16.16b

The same applies to [addr + imm], e.g. Vector128.LoadUnsafe(ref b2, 16).
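For anyone who wants to reproduce both cases locally, here is a minimal self-contained sketch. The names `AddressingModeRepro`, `AddAtOffset` and `AddAtConstant` are hypothetical and chosen only for illustration; the sketch assumes .NET 7 or later, where `Vector128.LoadUnsafe` and the `+` operator on `Vector128<T>` are available. For the constant case one would hope for an immediate-offset load along the lines of `ldr q16, [x1, #16]`, though that is an expectation about the JIT, not something this thread confirms.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.Intrinsics;

// Hypothetical repro harness; names are illustrative, not from the runtime repo.
static class AddressingModeRepro
{
    // [base + index] case: both loads share a register offset.
    // AggressiveOptimization skips tier-0 so the first compilation is fully optimized.
    [MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.AggressiveOptimization)]
    static Vector128<byte> AddAtOffset(ref byte b1, ref byte b2, nuint offset) =>
        Vector128.LoadUnsafe(ref b1, offset) +
        Vector128.LoadUnsafe(ref b2, offset);

    // [base + imm] case: the offset is a compile-time constant.
    [MethodImpl(MethodImplOptions.NoInlining | MethodImplOptions.AggressiveOptimization)]
    static Vector128<byte> AddAtConstant(ref byte b1, ref byte b2) =>
        Vector128.LoadUnsafe(ref b1, 16) +
        Vector128.LoadUnsafe(ref b2, 16);

    static void Main()
    {
        // 32 bytes per buffer so that a 16-byte load at offset 16 stays in bounds.
        byte[] a = new byte[32];
        byte[] b = new byte[32];
        for (int i = 0; i < a.Length; i++)
        {
            a[i] = (byte)i;
            b[i] = 1;
        }

        Console.WriteLine(AddAtOffset(ref a[0], ref b[0], 16));
        Console.WriteLine(AddAtConstant(ref a[0], ref b[0]));
    }
}
```

Running this on an arm64 machine with the `DOTNET_JitDisasm` environment variable set to one of the method names should print the generated code for that method, which can then be compared against the listings above.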

cc @tannergooding

EgorBo added the tenet-performance label on Apr 1, 2022
dotnet-issue-labeler (bot) added the area-CodeGen-coreclr and untriaged labels on Apr 1, 2022
ghost commented Apr 1, 2022

Tagging subscribers to this area: @JulieLeeMSFT
See info in area-owners.md if you want to be subscribed.

EgorBo added this to the Future milestone on Apr 1, 2022
EgorBo removed the untriaged label on Apr 1, 2022
ghost added the in-pr label on Apr 2, 2022
ghost removed the in-pr label on Apr 3, 2022
JulieLeeMSFT modified the milestones: Future, 7.0.0 on Apr 4, 2022
ghost locked as resolved and limited conversation to collaborators on May 4, 2022