Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Julia gets stuck on recursive function #31572

Closed
simeonschaub opened this issue Apr 1, 2019 · 5 comments · Fixed by #31734
Closed

Julia gets stuck on recursive function #31572

simeonschaub opened this issue Apr 1, 2019 · 5 comments · Fixed by #31734
Labels
compiler:inference Type inference

Comments

@simeonschaub
Copy link
Member

I have already posted this on discourse and since this definitely seems to be a bug, I'm opening an issue.
I defined the following merge function for a custom Dict type:

struct MixedKeyDict{T<:Tuple} #<: AbstractDict{Any,Any}
    dicts::T
end

Base.merge(f::Function, d::MixedKeyDict, others::MixedKeyDict...) = _merge(f, (), d.dicts, (d->d.dicts).(others)...)
Base.merge(f, d::MixedKeyDict, others::MixedKeyDict...) = _merge(f, (), d.dicts, (d->d.dicts).(others)...)

function _merge(f, res, d, others...)
    ofsametype, remaining = _alloftype(Base.heads(d), ((),), others...)
    return _merge(f, (res..., merge(f, ofsametype...)), Base.tail(d), remaining...)
end

_merge(f, res, ::Tuple{}, others...) = _merge(f, res, others...)
_merge(f, res, d) = MixedKeyDict((res..., d...))
_merge(f, res, ::Tuple{}) = MixedKeyDict(res)

function _alloftype(ofdesiredtype::Tuple{Vararg{D}}, accumulated, d::Tuple{D,Vararg}, others...) where D
    return _alloftype((ofdesiredtype..., first(d)),
                      (Base.front(accumulated)..., (last(accumulated)..., Base.tail(d)...), ()),
                      others...)
end

function _alloftype(ofdesiredtype, accumulated, d, others...)
    return _alloftype(ofdesiredtype,
                      (Base.front(accumulated)..., (last(accumulated)..., first(d))),
                      Base.tail(d), others...)
end

function _alloftype(ofdesiredtype, accumulated, ::Tuple{}, others...)
    return _alloftype(ofdesiredtype,
                      (accumulated..., ()),
                      others...)
end

_alloftype(ofdesiredtype, accumulated) = ofdesiredtype, Base.front(accumulated)

For two dicts, this works perfectly well:

julia> d = MixedKeyDict((Dict(1=>3),Dict(4. => 2)))
MixedKeyDict{Tuple{Dict{Int64,Int64},Dict{Float64,Int64}}} with 2 entries:
  1   => 3
  4.0 => 2

julia> e = MixedKeyDict((Dict(1=>7),Dict(5. => 9)))
MixedKeyDict{Tuple{Dict{Int64,Int64},Dict{Float64,Int64}}} with 2 entries:
  1   => 7
  5.0 => 9

julia> merge(+, d, e)
MixedKeyDict{Tuple{Dict{Int64,Int64},Dict{Float64,Int64}}} with 3 entries:
  1   => 10
  4.0 => 2
  5.0 => 9

But as soon as I add another one, Julia gets completely stuck and I have to kill the whole process.

julia> f = MixedKeyDict((Dict(2=>7),Dict(5. => 9)))
MixedKeyDict{Tuple{Dict{Int64,Int64},Dict{Float64,Int64}}} with 2 entries:
  2   => 7
  5.0 => 9

julia> merge(+, d, e, f) #gets stuck

I thought there might be a bug in my code, so I tried debugging it with Debugger.jl, but the weird part is,that it works just fine if I just @enter the function and let it continue without any breakpoints set:

julia> using Debugger

julia> @enter merge(+, d, e, f)
In merge(f, d, others) at /home/simeon/Documents/Julia/TypeStableDicts/src/TypeStableDicts.jl:33
>33  Base.merge(f::Function, d::MixedKeyDict, others::MixedKeyDict...) = _merge(f, (), d.dicts, (d->d.dicts).(others)...)

About to run: (getproperty)(<suppressed 71 bytes of output>, :dicts)
1|debug> c
MixedKeyDict{Tuple{Dict{Int64,Int64},Dict{Float64,Int64}}} with 4 entries:
  2   => 7
  1   => 10
  4.0 => 2
  5.0 => 18

If I tell Debugger.jl to use the compiled mode, it gets stuck again. This seems to be a bug in the compiler. If I call the underlying _merge-function directly, I get the same behavior, although calling _alloftype directly runs fine. @KristofferC suggested, it might be bug in the way Julia tries to optimize this code.

My Julia version:

julia> versioninfo()
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, skylake)

I've also tried it with the latest nightly, but no luck there either.

@JeffBezanson JeffBezanson added the compiler:inference Type inference label Apr 1, 2019
@JeffBezanson
Copy link
Member

Seems to be recursion during inference.

@Jutho
Copy link
Contributor

Jutho commented Apr 8, 2019

@maartenvd and I also have examples, with a more elaborate and complicated codebase, where Julia takes several hours to run a function, which afterwards runs in less than a second. Recursion is also likely the culprit, though we don't have found a good way of debugging this behaviour yet.

The strange thing is that, if you ctrl+C the process, in such a way that Julia itself, i.e. the REPL, does not quit, you can rerun this expression, and then it also runs fine right away.

@maartenvd
Copy link

So during recursion it will widen the type when it fails, and i have added a println there. In both cases you see something kinda similar; recursion keeps widening the same types to the same types, almost as if it never really caches it's result and has to re-evaluate it... We also have recursive functions that call eachother.

@JeffBezanson
Copy link
Member

The strange thing is that, if you ctrl+C the process, in such a way that Julia itself, i.e. the REPL, does not quit, you can rerun this expression, and then it also runs fine right away.

This is because we catch exceptions that happen during type inference and keep running without it (compiling un-inferred versions of functions), in case of an exception caused by an internal bug.

@chriselrod
Copy link
Contributor

chriselrod commented Apr 13, 2019

I ran into this with GenericLinearAlgebra's GenericLinearAlgebra.cholRecursive!.
It hung. When I pressed ctrl+C, it spits out one instance of:

julia> using GenericLinearAlgebra

julia> A = randn(1000,1000); A = A'A;

julia> GenericLinearAlgebra.cholRecursive!(copy(A), Val{:L})
^C^C^C^C^CWARNING: Force throwing a SIGINT
Internal error: encountered unexpected error in runtime:
InterruptException()
jl_egal at /home/chriselrod/Documents/languages/jdev/src/builtins.c:172
compare_fields at /home/chriselrod/Documents/languages/jdev/src/builtins.c:92
is_lattice_equal at ./compiler/typelattice.jl:182
is_argtype_match at ./compiler/inferenceresult.jl:21 [inlined]
cache_lookup at ./compiler/inferenceresult.jl:153
abstract_call_method_with_const_args at ./compiler/abstractinterpretation.jl:215
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:115
abstract_call at ./compiler/abstractinterpretation.jl:808
abstract_call at ./compiler/abstractinterpretation.jl:598
abstract_eval_call at ./compiler/abstractinterpretation.jl:837
^Cabstract_eval at ./compiler/abstractinterpretation.jl:907
typeinf_local at ./compiler/abstractinterpretation.jl:1164
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1220
typeinf at ./compiler/typeinfer.jl:12

Followed by a flood of repetitions:

bstract_call_method_with_const_args at ./compiler/abstractinterpretation.jl:222
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:115
abstract_call at ./compiler/abstractinterpretation.jl:808
abstract_call at ./compiler/abstractinterpretation.jl:598
abstract_eval_call at ./compiler/abstractinterpretation.jl:837
abstract_eval at ./compiler/abstractinterpretation.jl:907
typeinf_local at ./compiler/abstractinterpretation.jl:1164
^Ctypeinf_nocycle at ./compiler/abstractinterpretation.jl:1220
typeinf at ./compiler/typeinfer.jl:12

finally concluding with:

abstract_call_method at ./compiler/abstractinterpretation.jl:369
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:92
abstract_call at ./compiler/abstractinterpretation.jl:808
abstract_call at ./compiler/abstractinterpretation.jl:598
abstract_eval_call at ./compiler/abstractinterpretation.jl:837
abstract_eval at ./compiler/abstractinterpretation.jl:907
typeinf_local at ./compiler/abstractinterpretation.jl:1164
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1220
typeinf at ./compiler/typeinfer.jl:12
typeinf_edge at ./compiler/typeinfer.jl:482
abstract_call_method at ./compiler/abstractinterpretation.jl:369
abstract_call_gf_by_type at ./compiler/abstractinterpretation.jl:92
abstract_call at ./compiler/abstractinterpretation.jl:808
abstract_call at ./compiler/abstractinterpretation.jl:598
abstract_eval_call at ./compiler/abstractinterpretation.jl:837
abstract_eval at ./compiler/abstractinterpretation.jl:907
typeinf_local at ./compiler/abstractinterpretation.jl:1164
typeinf_nocycle at ./compiler/abstractinterpretation.jl:1220
typeinf at ./compiler/typeinfer.jl:12
typeinf_ext at ./compiler/typeinfer.jl:568
typeinf_ext at ./compiler/typeinfer.jl:599
jfptr_typeinf_ext_1 at /home/chriselrod/Documents/languages/jdev/usr/lib/julia/sys.so (unknown line)
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2191
jl_apply at /home/chriselrod/Documents/languages/jdev/src/julia.h:1604 [inlined]
jl_type_infer at /home/chriselrod/Documents/languages/jdev/src/gf.c:207
jl_compile_method_internal at /home/chriselrod/Documents/languages/jdev/src/gf.c:1773
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2196
do_call at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:323
eval_value at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:411
eval_stmt_value at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:362 [inlined]
eval_body at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:758
jl_interpret_toplevel_thunk_callback at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:884
Interpreter frame (ip: 3)
Core.CodeInfo(code=Array{Any, (5,)}[
  Expr(:call, Base.getproperty, :GenericLinearAlgebra, :(:cholRecursive!)),
  Expr(:call, :copy, :A),
  Expr(:call, Core.apply_type, :Val, :(:L)),
  Expr(:call, SSAValue(1), SSAValue(2), SSAValue(3)),
  Expr(:return, SSAValue(4))], codelocs=Array{Int32, (5,)}[1, 1, 1, 1, 1], ssavaluetypes=5, ssaflags=Array{UInt8, (0,)}[], method_for_inference_limit_heuristics=nothing, linetable=Array{Any, (1,)}[Core.LineInfoNode(method=Symbol("top-level scope"), file=Symbol("REPL[8]"), line=1, inlined_at=0)], slotnames=Array{Symbol, (0,)}[], slotflags=Array{UInt8, (0,)}[], slottypes=nothing, rettype=Any, parent=nothing, min_world=1, max_world=-1, inferred=false, inlineable=false, propagate_inbounds=false, pure=false)jl_interpret_toplevel_thunk at /home/chriselrod/Documents/languages/jdev/src/interpreter.c:893
jl_toplevel_eval_flex at /home/chriselrod/Documents/languages/jdev/src/toplevel.c:797
jl_toplevel_eval_flex at /home/chriselrod/Documents/languages/jdev/src/toplevel.c:746
jl_toplevel_eval_flex at /home/chriselrod/Documents/languages/jdev/src/toplevel.c:746
jl_toplevel_eval_in at /home/chriselrod/Documents/languages/jdev/src/toplevel.c:826
eval at ./boot.jl:330
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2191
eval_user_input at /home/chriselrod/Documents/languages/jdev/usr/share/julia/stdlib/v1.3/REPL/src/REPL.jl:86
macro expansion at /home/chriselrod/Documents/languages/jdev/usr/share/julia/stdlib/v1.3/REPL/src/REPL.jl:118 [inlined]
#26 at ./task.jl:268
jl_apply_generic at /home/chriselrod/Documents/languages/jdev/src/gf.c:2191
jl_apply at /home/chriselrod/Documents/languages/jdev/src/julia.h:1604 [inlined]
start_task at /home/chriselrod/Documents/languages/jdev/src/task.c:583
unknown function (ip: 0xffffffffffffffff)

and then suddenly display the result. Rerunning showed the results immediately.

vtjnash added a commit that referenced this issue Apr 15, 2019
when we hit union-splitting, we need to ensure type limits are very aggressive
and preferably also independent of the height of the recursion chain

fix #31572
vtjnash added a commit that referenced this issue Apr 17, 2019
when we hit union-splitting, we need to ensure type limits are very aggressive
and preferably also independent of the height of the recursion chain

fix #31572
vtjnash added a commit that referenced this issue Apr 18, 2019
when we hit union-splitting, we need to ensure type limits are very aggressive
and preferably also independent of the height of the recursion chain

fix #31572
vtjnash added a commit that referenced this issue Apr 19, 2019
when we hit union-splitting, we need to ensure type limits are very aggressive
and preferably also independent of the height of the recursion chain

fix #31572
KristofferC pushed a commit that referenced this issue Apr 20, 2019
when we hit union-splitting, we need to ensure type limits are very aggressive
and preferably also independent of the height of the recursion chain

fix #31572

(cherry picked from commit 32b091b)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:inference Type inference
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants