Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Function with four or more method definitions has unstable return type #23210

Open
wsshin opened this issue Aug 11, 2017 · 10 comments
Open

Function with four or more method definitions has unstable return type #23210

wsshin opened this issue Aug 11, 2017 · 10 comments
Labels
compiler:inference Type inference

Comments

@wsshin
Copy link
Contributor

wsshin commented Aug 11, 2017

Consider the following code:

abstract type MyAbs end
test(x::AbstractVector, m::MyAbs) = test(Vector{Int}(x), m)

struct MyType1 <: MyAbs end
test(x::Vector{Int}, ::MyType1) = 1

struct MyType2 <: MyAbs end
test(x::Vector{Int}, ::MyType2) = 2

struct MyType3 <: MyAbs end
test(x::Vector{Int}, ::MyType3) = 3

Here, the first definition of test defers execution to one of the next three definitions of test depending on the specific type of m.

Now, if we call test on Vector{Int} and MyAbs, the return type is always Int and therefore stable:

julia> VERSION
v"0.7.0-DEV.1283"

julia> code_warntype(test, (Vector{Int}, MyAbs))
Variables:
  ...
Body:
  ...
  end::Int64

Variables:
  ...
Body:
  ...
  end::Int64

Variables:
  ...
Body:
  ...
  end::Int64

Variables:
  ...
Body:
  ...
  end::Int64

However, if we define one more, 4th concrete subtype of MyAbs and corresponding test function, suddenly calling test on Vector{Int} and MyAbs returns Any and becomes unstable:

julia> struct MyType4 <: MyAbs end

julia> test(x::Vector{Int}, ::MyType4) = 4
test (generic function with 5 methods)

julia> code_warntype(test, (Vector{Int}, MyAbs))
Variables:
  ...
Body:
  ...
  end::Int64

Variables:
  ...
Body:
  ...
  end::Int64

Variables:
  ...
Body:
  ...
  end::Int64

Variables:
  ...
Body:
  ...
  end::Int64

Variables:
  ...
Body:
  ...
  end::Any

Note the last return type is Any.

This is very strange. Is 4 some kind of an intentional threshold?

This is a problem when, for example, I define an array of MyAbs containing instances of MyType1, ..., MyType4 and iterate over it by a for loop to execute test on each element of the array. The output of test is not type-stable.

The problem is reproducible in Julia 0.6 as well.

@ChrisRackauckas
Copy link
Member

Sounds like another instance of #22370 ?

@mauro3
Copy link
Contributor

mauro3 commented Aug 11, 2017

It does work correctly if you pass in a specific type though:

julia> code_warntype(test, (Vector{Int}, MyType4))                                                                                                    
Variables:                                                                                                                                            
  #temp#@_1                                                                                                                                           
  #temp#@_2
  #temp#@_3

Body:
  begin                                                                                                                                               
      return $(QuoteNode(4))                                                                                                                          
  end::Int64                                                                                                                                          

@wsshin
Copy link
Contributor Author

wsshin commented Aug 11, 2017

@mauro3, yes, but the point is that there are cases where we cannot pass specific types to test at the point of type inference, as I mentioned at the end of the original posting. Here is a specific example:

julia> function testfor(x::Vector{Int}, mvec::Vector{MyAbs})
           s = 0
           for m = mvec
               s += test(x, m)
           end
           return s
       end
testfor (generic function with 1 method)

julia> mvec = [MyType1(), MyType2(), MyType3(), MyType4()]
4-element Array{MyAbs,1}:
 MyType1()
 MyType2()
 MyType3()
 MyType4()

julia> @code_warntype testfor([0,0,0], mvec)
Variables:
  #self#::#testfor
  x::Array{Int64,1}
  mvec::Array{MyAbs,1}
  s::Any
  #temp#::Int64
  m::MyAbs

Body:
  begin
      s::Any = 0
      #= line 3 =#
      #temp#::Int64 = 1
      4:
      unless (Base.not_int)((#temp#::Int64 === (Base.add_int)((Base.arraylen)(mvec::Array{MyAbs,1})::Int64, 1)::Int64)::Bool)::Bool goto 14
      SSAValue(2) = $(Expr(:invoke, MethodInstance for getindex(::Array{MyAbs,1}, ::Int64), :(Base.getindex), :(mvec), :(#temp#)))::MyAbs
      SSAValue(3) = (Base.add_int)(#temp#::Int64, 1)::Int64
      m::MyAbs = SSAValue(2)
      #temp#::Int64 = SSAValue(3)
      #= line 4 =#
      s::Any = (s::Any + (Main.test)(x::Array{Int64,1}, m::MyAbs)::Any)::Any
      12:
      goto 4
      14:
      #= line 6 =#
      return s::Any
  end::Any

@wsshin
Copy link
Contributor Author

wsshin commented Aug 11, 2017

@ChrisRackauckas, thanks for bringing up #22370! Indeed, I tried rebuilding Julia with max_methods::Int = 5 instead 4 in function InferenceParams, and the above mentioned instability went away. In other words, after making this change:

diff --git a/base/inference.jl b/base/inference.jl
index 27945928da..b79f6ff79a 100644
--- a/base/inference.jl
+++ b/base/inference.jl
@@ -32,7 +32,7 @@ struct InferenceParams
                     inline_cost_threshold::Int = 100,
                     inline_nonleaf_penalty::Int = 1000,
                     inline_tupleret_bonus::Int = 400,
-                    max_methods::Int = 4,
+                    max_methods::Int = 5,
                     tupletype_len::Int = 15,
                     tuple_depth::Int = 4,
                     tuple_splat::Int = 16,

I got this:

julia> @code_warntype testfor([0,0,0], mvec)
Variables:
  #self#::#testfor
  x::Array{Int64,1}
  mvec::Array{MyAbs,1}
  s::Int64
  #temp#::Int64
  m::MyAbs

Body:
  begin
      s::Int64 = 0
      #= line 3 =#
      #temp#::Int64 = 1
      4:
      unless (Base.not_int)((#temp#::Int64 === (Base.add_int)((Base.arraylen)(mvec::Array{MyAbs,1})::Int64, 1)::Int64)::Bool)::Bool goto 15
      SSAValue(3) = $(Expr(:invoke, MethodInstance for getindex(::Array{MyAbs,1}, ::Int64), :(Base.getindex), :(mvec), :(#temp#)))::MyAbs
      SSAValue(4) = (Base.add_int)(#temp#::Int64, 1)::Int64
      m::MyAbs = SSAValue(3)
      #temp#::Int64 = SSAValue(4)
      #= line 4 =#
      SSAValue(2) = (Main.test)(x::Array{Int64,1}, m::MyAbs)::Int64
      s::Int64 = (Base.add_int)(s::Int64, SSAValue(2))::Int64
      13:
      goto 4
      15:
      #= line 6 =#
      return s::Int64
  end::Int64

@yuyichao
Copy link
Contributor

Do note that making this "type stable" does NOT save you from the dispatch cost. For the type stability of the result along, you can simply add type assertions.

@wsshin
Copy link
Contributor Author

wsshin commented Aug 12, 2017

Here is my workaround to this problem (and further question).

The comments in base/inference.jl reads

function abstract_call_gf_by_type(f::ANY, atype::ANY, sv::InferenceState)
    tm = _topmod(sv)
    # don't consider more than N methods. this trades off between
    # compiler performance and generated code performance.
    # typically, considering many methods means spending lots of time
    # obtaining poor type information.
    # It is important for N to be >= the number of methods in the error()
    # function, so we can still know that error() is always Bottom.
    # here I picked 4.

From this comment, I guessed that I shouldn't abuse calling a function with abstract arguments. In the original example, the function

test(x::AbstractVector, m::MyAbs) = test(Vector{Int}(x), m)

takes two abstract types as input arguments. So, I replaced this function with four functions taking concrete m. The resulting code looks:

abstract type MyAbs end

struct MyType1 <: MyAbs end
test(x::Vector{Int}, ::MyType1) = 1
test(x::AbstractVector, m::MyType1) = test(Vector{Int}(x), m)

struct MyType2 <: MyAbs end
test(x::Vector{Int}, ::MyType2) = 2
test(x::AbstractVector, m::MyType2) = test(Vector{Int}(x), m)

struct MyType3 <: MyAbs end
test(x::Vector{Int}, ::MyType3) = 3
test(x::AbstractVector, m::MyType3) = test(Vector{Int}(x), m)

struct MyType4 <: MyAbs end
test(x::Vector{Int}, ::MyType4) = 4
test(x::AbstractVector, m::MyType4) = test(Vector{Int}(x), m)

This modification removed the type instability. I also find this trick is not limited to four subtypes of MyAbs: even if I add several more subtypes of MyAbs, I don't have type instability, but I don't understand why...

Anyway, having this workaround was a good news. However, the same trick does not work in extending Base.in. I describe the phenomenon in detail in this discourse question.

@wsshin
Copy link
Contributor Author

wsshin commented Aug 12, 2017

@yuyichao, thanks for the notes on performance. Also, you are correct that type instability can be resolved by type assertion.

However, I feel it very non-intuitive to users that we need to use type assertions depending on the number of subtypes. Suppose I had type-stable code with three subtypes. Then one day, I added one more subtypes and suddenly the code becomes type-unstable and thus slow. As an average programmer, how am I supposed to know that the type instability occurred due to the number of subtypes? Until @ChrisRackauckas pointed it out, I had no idea about it, and I feel very very lucky that he pointed it out so early in this thread; otherwise I would be still pulling out my hair to figure out what I did wrong.

Also, I still don't understand why the workaround described above does not work for extending Base.in. Base functions are extended quite frequently when developers define new modules, and users of these modules wouldn't expect they would need to type-assert when using Base functions, so I feel asking users to use type assertions in this case is bad... I guess there must be some way to achieve type stability of extended Base functions. Or there really isn't any???

@yuyichao
Copy link
Contributor

depending on the number of subtypes.

Well, not subtypes, methods.

I added one more subtypes and suddenly the code becomes type-unstable and thus slow

So no this won't happen.

@wsshin
Copy link
Contributor Author

wsshin commented Aug 12, 2017

Sorry, what I meant was adding one more subtype and associated methods, because the example above defined one set of methods per subtype. (E.g., when I added MyType4 <: MyAbs, another method of the test function was added.)

@JeffBezanson
Copy link
Member

I also find this trick is not limited to four subtypes of MyAbs: even if I add several more subtypes of MyAbs, I don't have type instability, but I don't understand why...

I can't reproduce this; if more than 4 methods match we will infer Any.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:inference Type inference
Projects
None yet
Development

No branches or pull requests

5 participants