Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unnecessary memcpy with mir-opt-level=2 and above #77506

Closed
MSxDOS opened this issue Oct 3, 2020 · 2 comments
Closed

Unnecessary memcpy with mir-opt-level=2 and above #77506

MSxDOS opened this issue Oct 3, 2020 · 2 comments
Labels
A-mir-opt Area: MIR optimizations C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@MSxDOS
Copy link

MSxDOS commented Oct 3, 2020

pub fn test() -> Vec<u32> {
    let mut vec = Vec::with_capacity(2);
    vec.push(1);    
    vec.push(2);
    vec
}

mir-opt-level=1:

        push    rbx
        mov     rbx, rdi
        mov     edi, 8
        mov     esi, 4
        call    qword ptr [rip + __rust_alloc@GOTPCREL]
        test    rax, rax
        je      .LBB1_1
        mov     qword ptr [rbx], rax
        movaps  xmm0, xmmword ptr [rip + .LCPI1_0]
        movups  xmmword ptr [rbx + 8], xmm0
	mov     rdi, rbx
        mov     esi, 1
        call    alloc::vec::Vec<T>::push
        mov     rdi, rbx
        mov     esi, 2
        call    alloc::vec::Vec<T>::push
        mov     rax, rbx
        pop     rbx
        ret
.LBB1_1:
        mov     edi, 8
        mov     esi, 4
        call    qword ptr [rip + alloc::alloc::handle_alloc_error@GOTPCREL]
        ud2

mir-opt-level=2
And here it looks like the whole Vec struct is getting copied to a temp stack location for push and then to the final dest:

	push    r14
        push    rbx
        sub     rsp, 24
        mov     rbx, rdi
        mov     edi, 8
        mov     esi, 4
        call    qword ptr [rip + __rust_alloc@GOTPCREL]
	test    rax, rax
        je      .LBB1_1
        mov     qword ptr [rsp], rax
        movaps  xmm0, xmmword ptr [rip + .LCPI1_0]
        movups  xmmword ptr [rsp + 8], xmm0
        mov     r14, rsp
        mov     rdi, r14
        mov     esi, 1
        call    alloc::vec::Vec<T>::push
	mov     rdi, r14
        mov     esi, 2
        call    alloc::vec::Vec<T>::push
        mov     rax, qword ptr [rsp + 16]
        mov     qword ptr [rbx + 16], rax
        movups  xmm0, xmmword ptr [rsp]
        movups  xmmword ptr [rbx], xmm0
	mov     rax, rbx
        add     rsp, 24
        pop     rbx
        pop     r14
        ret
.LBB1_1:
        mov     edi, 8
        mov     esi, 4
        call    qword ptr [rip + alloc::alloc::handle_alloc_error@GOTPCREL]
        ud2

https://rust.godbolt.org/z/fWvKW6

cc @jonas-schievink Aren't those supposed to be rather eliminated by #72632 ? That said, I've seen 2 and especially 3 producing worse code than 1 even before it landed so there may be something else interfering.

@rustbot rustbot added A-mir-opt Area: MIR optimizations C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 3, 2020
@jonas-schievink
Copy link
Contributor

Hmm, hard to tell what's going wrong here, the inliner is making the MIR somewhat hard to read. Note that you have to pass -Zunsound-mir-opts to get the destination propagation pass to run though (but it doesn't fix the issue).

@MSxDOS
Copy link
Author

MSxDOS commented Oct 6, 2020

The example is fixed since some recent nightly. Could it be #77525 ?

Closing this for now.

@MSxDOS MSxDOS closed this as completed Oct 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-mir-opt Area: MIR optimizations C-enhancement Category: An issue proposing an enhancement or a PR with one. I-slow Issue: Problems and improvements with respect to performance of generated code. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

3 participants