Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failure to optimize constant expressions in functions defined in a let block in a for loop #15967

Closed
iamed2 opened this issue Apr 20, 2016 · 4 comments
Labels
potential benchmark Could make a good benchmark in BaseBenchmarks

Comments

@iamed2
Copy link
Contributor

iamed2 commented Apr 20, 2016

This is a simplified case from some Dates code @omus was working on.

Here is the working original code:

types = (Real, Complex)
for (i, A) in enumerate(types), (j, B) in enumerate(types)
    @eval Base.isless(::Type{$A}, ::Type{$B}) = $(i < j)
end

This is optimized to:

define i1 @julia_isless_53927(%jl_value_t*, %jl_value_t*) #0 {
top:
  ret i1 true
}

I was thinking one could rewrite the function definition without @eval using a let block, and that works in this case (0.5 only):

let i = 1, j = 2, A = Real, B = Complex
    Base.isless(::Type{A}, ::Type{B}) = i < j
end

Optimized to:

define i1 @julia_isless_53886(%jl_value_t*, %jl_value_t*) #0 {
top:
  ret i1 true
}

But if I put this back in the for loop:

types = (Real, Complex)
for (i, A) in enumerate(types), (j, B) in enumerate(types)
    let i = i, j = j, A = A, B = B
        Base.isless(::Type{A}, ::Type{B}) = i < j
    end
end

It doesn't optimize:

define %jl_value_t* @julia_isless_53933(%jl_value_t*, %jl_value_t*) #0 {
top:
  %2 = alloca [8 x %jl_value_t*], align 8
  %.sub = getelementptr inbounds [8 x %jl_value_t*], [8 x %jl_value_t*]* %2, i64 0, i64 0
  %3 = getelementptr [8 x %jl_value_t*], [8 x %jl_value_t*]* %2, i64 0, i64 5
  %4 = getelementptr [8 x %jl_value_t*], [8 x %jl_value_t*]* %2, i64 0, i64 2
  store %jl_value_t* null, %jl_value_t** %4, align 8
  %5 = getelementptr [8 x %jl_value_t*], [8 x %jl_value_t*]* %2, i64 0, i64 3
  store %jl_value_t* null, %jl_value_t** %5, align 8
  %6 = getelementptr [8 x %jl_value_t*], [8 x %jl_value_t*]* %2, i64 0, i64 4
  store %jl_value_t* null, %jl_value_t** %6, align 8
  store %jl_value_t* null, %jl_value_t** %3, align 8
  %7 = getelementptr [8 x %jl_value_t*], [8 x %jl_value_t*]* %2, i64 0, i64 6
  store %jl_value_t* null, %jl_value_t** %7, align 8
  %8 = getelementptr [8 x %jl_value_t*], [8 x %jl_value_t*]* %2, i64 0, i64 7
  store %jl_value_t* null, %jl_value_t** %8, align 8
  %9 = bitcast [8 x %jl_value_t*]* %2 to i64*
  store i64 12, i64* %9, align 8
  %10 = load i64, i64* bitcast (%jl_value_t*** @jl_tls_states to i64*), align 8
  %11 = getelementptr [8 x %jl_value_t*], [8 x %jl_value_t*]* %2, i64 0, i64 1
  %12 = bitcast %jl_value_t** %11 to i64*
  store i64 %10, i64* %12, align 8
  store %jl_value_t** %.sub, %jl_value_t*** @jl_tls_states, align 8
  %13 = load %jl_value_t*, %jl_value_t** inttoptr (i64 4657600112 to %jl_value_t**), align 16
  %14 = icmp eq %jl_value_t* %13, null
  br i1 %14, label %fail, label %pass

fail:                                             ; preds = %top
  %15 = load %jl_value_t*, %jl_value_t** @jl_undefref_exception, align 8
  call void @jl_throw(%jl_value_t* %15)
  unreachable

pass:                                             ; preds = %top
  store %jl_value_t* %13, %jl_value_t** %4, align 8
  %16 = load %jl_value_t*, %jl_value_t** inttoptr (i64 4657600144 to %jl_value_t**), align 16
  %17 = icmp eq %jl_value_t* %16, null
  br i1 %17, label %fail1, label %pass.2

fail1:                                            ; preds = %pass
  %18 = load %jl_value_t*, %jl_value_t** @jl_undefref_exception, align 8
  call void @jl_throw(%jl_value_t* %18)
  unreachable

pass.2:                                           ; preds = %pass
  store %jl_value_t* %16, %jl_value_t** %5, align 8
  store %jl_value_t* inttoptr (i64 4561445688 to %jl_value_t*), %jl_value_t** %3, align 8
  store %jl_value_t* %13, %jl_value_t** %7, align 8
  store %jl_value_t* %16, %jl_value_t** %8, align 8
  %19 = call %jl_value_t* @jl_apply_generic(%jl_value_t** %3, i32 3)
  store %jl_value_t* %19, %jl_value_t** %6, align 8
  %20 = load i64, i64* %12, align 8
  store i64 %20, i64* bitcast (%jl_value_t*** @jl_tls_states to i64*), align 8
  ret %jl_value_t* %19
}
@yuyichao
Copy link
Contributor

Not sure if this is a dup but it looks related to #15276, since the lowring changes i and j to boxes.

@iamed2
Copy link
Contributor Author

iamed2 commented Apr 21, 2016

Use case made it into a PR: #15982

@KristofferC
Copy link
Member

This seems fixed:

julia> for (i, A) in enumerate(types), (j, B) in enumerate(types)
           let i = i, j = j, A = A, B = B
             Base.isless(::Type{A}, ::Type{B}) = i < j
           end
       end

julia> @code_llvm isless(Complex, Real)

define i8 @julia_isless_65888(i8**, i8**) #0 !dbg !5 {
top:
  ret i8 0
}

julia> @code_llvm isless(Real, Complex)

define i8 @julia_isless_65892(i8**, i8**) #0 !dbg !5 {
top:
  ret i8 1
}

@tkelman tkelman added the potential benchmark Could make a good benchmark in BaseBenchmarks label Jan 22, 2017
@iamed2
Copy link
Contributor Author

iamed2 commented Jan 23, 2017

Hooray!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
potential benchmark Could make a good benchmark in BaseBenchmarks
Projects
None yet
Development

No branches or pull requests

4 participants