Ctrl-C does not work when running multi-threaded code #35524

Open
ViralBShah opened this issue Apr 19, 2020 · 14 comments
Labels
multithreading Base.Threads and related functionality

Comments

@ViralBShah
Member

When Ctrl-C'ing multi-threaded code, it crashes Julia altogether.

julia> function fib(n::Int)
           if n < 2
               return n
           end
           t = Threads.@spawn fib(n - 2)
           return fib(n - 1) + fetch(t)
       end^C

julia> fib(50)
^C^C^C^C^Cfatal: error thrown and no exception handler available.
InterruptException()
sigatomic_end at ./c.jl:425 [inlined]
task_done_hook at ./task.jl:442
_jl_invoke at /buildworker/worker/package_linux64/build/src/gf.c:2144 [inlined]
jl_apply_generic at /buildworker/worker/package_linux64/build/src/gf.c:2322
jl_apply at /buildworker/worker/package_linux64/build/src/julia.h:1692 [inlined]
jl_finish_task at /buildworker/worker/package_linux64/build/src/task.c:198
start_task at /buildworker/worker/package_linux64/build/src/task.c:697
unknown function (ip: (nil))
@ViralBShah ViralBShah added the multithreading Base.Threads and related functionality label Apr 19, 2020
@Keno
Member

Keno commented Apr 19, 2020

Ctrl-C doesn't properly work in single threaded code either ;)

@ViralBShah
Member Author

Works better, I think.

@StefanKarpinski
Member

Seems like structured concurrency would help here, although whenever there's a @sync (explicit or implicit) it should be possible to make this work to the extent that threads can be interrupted successfully (so not 100%, but somewhat).

@Keno
Member

Keno commented Apr 20, 2020

We just really need to stop having Ctrl-C throw regular exceptions. It's extremely surprising that everything can suddenly also throw interrupt exceptions (not to mention it not being modeled in the compiler).

@timholy
Member

timholy commented Apr 24, 2020

xref #25790 (comment)

@ViralBShah
Member Author

Actually, with 1.4 (maybe even 1.3?) I notice that killing single-threaded Julia processes with Ctrl-C is cumbersome too. @timholy's explanation helped me understand why.

@ViralBShah
Member Author

I find that Ctrl-C is less well-behaved in 1.4 than it was pre-1.3, even for single-threaded code. You have to keep it pressed for a while, and you get the big Julia stacktrace.

@tkf
Member

tkf commented May 3, 2020

I mentioned it in the other issue, #25790 (comment), but it'd be nice to solve this with structured concurrency (#33248).

@OkonSamuel
Contributor

OkonSamuel commented May 4, 2020

When Ctrl-C'ing multi-threaded code, it crashes Julia altogether.

I recently bumped into a similar issue in some code I wrote. I had to quit Julia to stop the threads from running.

@c42f
Member

c42f commented May 7, 2020

Ctrl-C is really hard because a large proportion of code is unsafe for async / non-cooperative cancellation by default. For example, there's all sorts of places in Base where we might be holding locks or other resources which aren't precisely protected by a try-catch when the "impossible" happens and a signal is received between creating the resource and "immediately" protecting it. I'm thinking of code like

lock(lk)   # lock() returns nothing, so lock and unlock the same object
# < what happens if we're interrupted here ?
try
    f()
finally
    unlock(lk)
end

cf. the Java Thread.stop() debacle.

Cancellation can be made safe by having a small number of well defined and documented cooperative cancellation points (eg, IO). This is what pthreads do (see man pthreads "Cancellation points"). But this can result in Ctrl-C not actually cancelling the task for quite some time. Which isn't what you really want.

Structured concurrency helps a bit because it gives a systematic way for cleanup to propagate during cancellation. But in itself I don't think it helps resolve the Ctrl-C now-or-later, unsafe-or-safe conundrum.

@vtjnash
Member

vtjnash commented May 7, 2020

Yep. We even actually already use cancellation points for this, it's just also not sufficient and causes other problems (such as, in the pthreads case, being unable to close file descriptors). Refs #6283

@c42f
Member

c42f commented May 7, 2020

We just really need to stop having Ctrl-C throw regular exceptions

One way to do this is to have Ctrl-C set a flag which is checked at cancellation points. That's well and good, but it does mean Ctrl-C won't cancel things right now, but rather at some later time. Possibly much later, or never if you happen to have written a tight infinite loop!
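A minimal sketch of the flag idea, not how Julia's runtime actually handles signals. The names `cancel_requested`, `checkpoint`, and `CancelledError` are hypothetical and exist only for illustration. Setting the flag directly stands in for what a Ctrl-C handler would do.

```julia
# Hypothetical names: none of these are Base APIs.
struct CancelledError <: Exception end

# Shared flag that a signal handler would set; atomic so any thread can read it.
const cancel_requested = Threads.Atomic{Bool}(false)

# A cancellation point: cancellation can surface here and nowhere else.
checkpoint() = cancel_requested[] && throw(CancelledError())

function work(n)
    total = 0
    for i in 1:n
        checkpoint()   # explicit, documented place where cancellation is observed
        total += i
    end
    return total
end

# With no request pending, the loop runs to completion.
result = work(1_000)

# Simulate Ctrl-C: the handler only sets the flag and returns.
cancel_requested[] = true
cancelled = try
    work(1_000)
    false
catch err
    err isa CancelledError
end
```

The cost is visible in the sketch: a loop that never calls `checkpoint()` would never see the request.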

Any thoughts on how we could handle this? One option might be to extend our existing double-Ctrl-C handling. Currently, as I recall, we avoid delivering InterruptException in ccall'd code, which is expected to be unsafe for Julia exceptions. But even normal Julia code is actually unsafe for InterruptException! It's delivered asynchronously in a way that can't be easily modeled by programmers (or by the compiler?).

@tkf
Member

tkf commented May 7, 2020

Yeah, I agree that there are problems outside of what structured concurrency can do. But my point is that, even if you can magically solve the problems you mentioned, there are a bunch of problems that are hard to solve without structured concurrency.

lk = lock(obj)
# < what happens if we're interrupted here ?
try

I think this is why we should be recommending lock(...) do instead of manual try-finally. Inside the implementation of lock(f, ...), each lock can use some very low-level compiler machinery to ask that no cancellation point be inserted within the critical region.
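For reference, a small sketch of the do-block form using a `ReentrantLock`. The variable names are illustrative. This only shows that acquisition and release live in a single call; it does not demonstrate that the interrupt window discussed above is closed.

```julia
lk = ReentrantLock()
counter = Ref(0)

# do-block form: acquire the lock, run the body, and release, all in one call.
lock(lk) do
    counter[] += 1
end

# The lock is released even when the body throws.
try
    lock(lk) do
        error("boom")
    end
catch
    # Swallow the error; we only care about the lock state afterwards.
end

released = !islocked(lk)
```

Because the acquire/release pairing is written once inside `lock(f, lk)`, any future protection against asynchronous interruption would only need to be added there.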

Which isn't what you really want.

I think it's unavoidable in a performance-oriented language like Julia. Surely nobody wants random cancellation points in their carefully written tight loops. Using only I/O operations as cancellation points and letting people manually opt in via yield or something similar sounds like a good compromise.
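A rough sketch of what manual opt-in could look like, under the assumption that `yield()` would be treated as a cancellation point as proposed here. The function `sum_with_checks` and its `stride` keyword are hypothetical names.

```julia
# Hypothetical helper: a hot loop that opts in to a possible cancellation
# point only occasionally, so most iterations pay no extra cost.
function sum_with_checks(xs; stride = 1024)
    total = zero(eltype(xs))
    for (i, x) in enumerate(xs)
        total += x
        # Every `stride` iterations, give the scheduler (and, under the
        # assumption above, a pending cancellation) a chance to run.
        i % stride == 0 && yield()
    end
    return total
end

total = sum_with_checks(collect(1:10_000))
```

The `stride` value trades responsiveness to cancellation against overhead in the loop, and the author of the loop makes that choice explicitly.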

@c42f
Member

c42f commented May 11, 2020

I think this is why we should be recommending lock(...) do instead of manual try-finally.

Absolutely! (The Base implementation of lock(f, lk) is exactly the code I quoted, but of course that could be fixed ;-) )

8 participants