Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bounds checking in dot #650

Closed
wraith1995 opened this issue Aug 16, 2019 · 3 comments
Closed

Bounds checking in dot #650

wraith1995 opened this issue Aug 16, 2019 · 3 comments
Labels
performance runtime performance

Comments

@wraith1995
Copy link

wraith1995 commented Aug 16, 2019

The dot product of two static arrays has a bounds check in the inner loop as

using StaticArrays
using LinearAlgebra
a = @SVector rand(4)
b = @SVector rand(4)
@code_typed dot(a,b)

produces

CodeInfo(
1 ──       goto #18 if not true
2 ──       goto #17 if not true
3 ──       nothing::Nothing
4 ┄─ %4  = φ (#3 => 0.0, #16 => %32)::Float64
│    %5  = φ (#3 => 0, #16 => %33)::Int64
│    %6  = (Base.slt_int)(%5, 4)::Bool
└───       goto #17 if not %6
5 ── %8  = (Base.add_int)(%5, 1)::Int64
│    %9  = (Base.sub_int)(%8, 1)::Int64
│    %10 = (Base.add_int)(1, %9)::Int64
└───       goto #14 if not false
6 ── %12 = (Base.slt_int)(0, %8)::Bool
└───       goto #10 if not %12
7 ── %14 = (Base.sle_int)(%10, 4)::Bool
└───       goto #9 if not %14
8 ── %16 = (Base.sle_int)(1, %10)::Bool
└───       goto #11
9 ──       goto #11
10 ─       goto #11
11 ┄ %20 = φ (#8 => %16, #9 => false, #10 => false)::Bool
└───       goto #13 if not %20
12 ─       goto #14
13 ─       invoke Base.throw_boundserror($(QuoteNode(1:4))::UnitRange{Int64}, %8::Int64)::Union{}
└───       $(Expr(:unreachable))::Union{}
14 ┄       goto #15
15 ─       goto #16
16 ─ %27 = (Base.getfield)(a, :data)::NTuple{4,Float64}
│    %28 = (Base.getfield)(%27, %10, false)::Float64
│    %29 = (Base.getfield)(b, :data)::NTuple{4,Float64}
│    %30 = (Base.getfield)(%29, %10, false)::Float64
│    %31 = (Base.mul_float)(%28, %30)::Float64
│    %32 = (Base.add_float)(%4, %31)::Float64
│    %33 = (Base.add_int)(%5, 1)::Int64
│          $(Expr(:simdloop, false))::Any
└───       goto #4
17 ┄ %36 = φ (#4 => %4, #2 => 0.0)::Float64
18 ┄ %37 = φ (#17 => %36, #1 => 0.0)::Float64
└───       goto #19
19 ─       return %37
) => Float64

This seems surely wrong to me, but the dot function uses @inbounds correctly and as far as I can tell, all the needed @propagate_inbounds are in place! What on earth is going wrong?

For reference, I'm on julia v1.1.1 and StaticArrays v0.11.0.

@wraith1995 wraith1995 changed the title Bounds checking in dot. Bounds checking in dot Aug 16, 2019
@c42f c42f added the performance runtime performance label Aug 16, 2019
@c42f
Copy link
Member

c42f commented Aug 17, 2019

I'm not sure this is an actual problem in practice? The type inferred code looks a bit mysterious here, but the native code looks good:

julia> @code_native debuginfo=:none dot(a, b)
	.text
	vmovupd	(%rdi), %ymm0
	vmulpd	(%rsi), %ymm0, %ymm0
	vextractf128	$1, %ymm0, %xmm1
	vaddpd	%ymm1, %ymm0, %ymm0
	vhaddpd	%ymm0, %ymm0, %ymm0
	vzeroupper
	retq
	nopw	(%rax,%rax)

Note that in #603 @tkoolen changed dot so that loop unrolling is done at the LLVM level which had performance benefits as you can see there.

The origin of the throw_boundserror call in the code_typed output seems to be the indexing of the UnitRange 1:prod(S) in the loop index computation. I do think that's a bit mysterious and it seems to be due to the use of the @simd macro (perhaps interfering somehow with inference). Consider the following MWE with and without the @simd:

function foo(r)
    s = zero(eltype(r))
    @inbounds @simd for i in r
        s += i
    end
    s
end

@code_warntype foo(1:4)

@c42f
Copy link
Member

c42f commented Aug 17, 2019

@vchuravy
Copy link
Contributor

We don't do DCE elimination on typed IR. The line goto #14 if not false (L11) is the boundscheck elimination

@tkoolen tkoolen closed this as completed Aug 17, 2019
@c42f c42f mentioned this issue Oct 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance runtime performance
Projects
None yet
Development

No branches or pull requests

4 participants