Error Handling in Tasks #32677

rohanmclure · 2019-07-25T12:57:53Z

Not sure if this is an extension of #10405 or a novel issue, but the following example spawns two tasks: one blocks on an unbuffered channel and the other throws an error. The @sync block then hangs instead of reporting the error, which makes async code difficult to debug.

c = Channel(0)
@sync begin
    @async begin
        put!(c, nothing)
    end
    @async begin
        this_is_not_defined_and_will_error()
        take!(c)
    end
end

Interestingly, reversing the @async blocks will cause the error to propagate properly.

I've verified this is the case in 1.1.0 and 1.3.0-alpha

vchuravy · 2019-07-28T17:34:30Z

This is likely due to the fact that we wait in order on the tasks, instead of peeking if one has errored.

edit: Go has errgroup (https://godoc.org/golang.org/x/sync/errgroup), which is close to the behaviour one might want.
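A minimal sketch of the diagnosis above, written without @sync so that it does not hang. The names t1, t2, blocked and failed are only illustrative. It spawns the same two tasks as the original report and then peeks at their state instead of waiting on them in order:

```julia
c = Channel(0)

# Same two tasks as in the original report, but spawned without @sync.
t1 = @async put!(c, nothing)   # blocks: nothing ever takes from c
t2 = @async begin
    this_is_not_defined_and_will_error()   # throws UndefVarError
    take!(c)                               # never reached
end

sleep(0.1)   # yield so that both tasks get a chance to run

blocked = !istaskdone(t1)     # t1 is still parked in put!
failed  = istaskfailed(t2)    # t2 has already failed

# Waiting on t1 first (as an in-order wait does) would block forever,
# even though inspecting t2 would reveal the error immediately.
println((blocked = blocked, failed = failed))
```

If the wait is done strictly in spawn order, the failure of t2 is never observed because the wait on t1 never returns.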

c42f · 2019-08-09T09:07:23Z

Go's errgroup seems about right: it wraps each function so that a failure will cancel the whole group:

https://github.com/golang/sync/blob/112230192c580c3556b8cee6403af37a4fc5f28c/errgroup/errgroup.go#L54

That seems like a better default behavior for @sync. Maybe we could have some syntax in @sync to set a cancellation policy:

@sync cancellation=foo begin
    @async f1()
    @async f2()
end

though I'm not sure what kind of thing foo should be.

StefanKarpinski · 2019-08-09T14:22:09Z

cc @tkf who has looked into cancellation and Trio-style structured I/O a bit already.

tkf · 2019-08-10T03:55:53Z

It looks like the problem here is that the code in the OP mixes two synchronization paradigms that are not aware of each other: structured concurrency (@sync/@async) and CSP (an unbuffered Channel). I think this can be solved by attaching Channels (and maybe any other kind of synchronization mechanism) to the @sync block. For example, a very simple-minded solution would be to close all Channels attached to a @sync block at the end of it. So you would write something like

@sync begin
    c = @Channel(0)
    ...do stuff...
end

which is expanded to

let tasks = [],  # we call it `sync_varname` now
    channels = []

    ans = begin
        c = Channel(0)
        push!(channels, c)

        ...do stuff...
    end

    Base.sync_end(tasks, channels)

    ans
end

(Or maybe use task_local_storage instead of let? This way, Channel does not have to be a macro.)

In Base.sync_end(tasks, channels), the tasks should not be iterated sequentially but rather processed as they complete. Once one of the tasks fails, it should close all channels.
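A rough sketch of that idea, under the assumption that tasks and channels are passed in explicitly. sync_end_sketch is a hypothetical stand-in written for illustration, not the actual Base.sync_end: it handles tasks in completion order and closes every attached channel on the first failure.

```julia
# Hypothetical stand-in for the proposed sync_end(tasks, channels).
function sync_end_sketch(tasks::Vector{Task}, channels::Vector{<:Channel})
    # Buffered so that the watcher tasks never block when reporting.
    done = Channel{Task}(length(tasks))
    for t in tasks
        @async begin
            try
                wait(t)          # throws if t failed
            catch
                # the failure is inspected below via istaskfailed
            end
            put!(done, t)
        end
    end

    first_failed = nothing
    for _ in 1:length(tasks)
        t = take!(done)          # tasks arrive in completion order
        if istaskfailed(t) && first_failed === nothing
            first_failed = t
            foreach(close, channels)   # wake tasks blocked in put!/take!
        end
    end
    first_failed === nothing || throw(TaskFailedException(first_failed))
    return nothing
end

# The example from the original report, with the channel attached by hand.
c = Channel(0)
t1 = @async put!(c, nothing)
t2 = @async begin
    error("boom")
    take!(c)
end

err = try
    sync_end_sketch([t1, t2], [c])
    nothing
catch e
    e
end
```

Closing the channel makes the blocked put! in t1 throw, so every task terminates and the first failure can be rethrown to the caller.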

Of course, closing channels is not ideal because it makes error handling hard; you have to distinguish errors caused by the automatic close in @sync from other ones. It would be nice if there were variants of take!/put! like

maybe_take!(c::Channel{T}) :: Union{Some{T}, Cancelled}
maybe_put!(c::Channel) :: Union{Nothing, Cancelled}

where close makes maybe_*!(c) return a Cancelled immediately. This way, I can write a helper macro (say) @await such that

y = @await maybe_take!(c)

expands to

x = maybe_take!(c)
x isa Cancelled && return x
y = something(x)

This way, you can create a convention that @await-able functions are the ones which return Union{T, Cancelled}. Furthermore, the @await-able functions that check the cancellation tokens associated with a @sync block would be the cancellable functions. This solves my concern in #6283 (comment) about making functions cancellable.

(A more sophisticated version of @await can be based on ResultTypes.jl.)
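One possible shape for these helpers, as a sketch only. Cancelled, maybe_take!, maybe_put! and @await are the hypothetical names from this comment and do not exist in Base. The sketch assumes that cancellation is signalled by a plain close(c), which makes take!/put! throw an InvalidStateException; consume_one is an illustrative caller.

```julia
# Hypothetical marker type returned instead of throwing on a closed channel.
struct Cancelled end

function maybe_take!(c::Channel{T}) where {T}
    try
        return Some{T}(take!(c))
    catch err
        # take! on a closed, empty channel throws InvalidStateException
        err isa InvalidStateException && !isopen(c) && return Cancelled()
        rethrow()
    end
end

function maybe_put!(c::Channel, x)
    try
        put!(c, x)
        return nothing
    catch err
        err isa InvalidStateException && !isopen(c) && return Cancelled()
        rethrow()
    end
end

# Unwrap a Some, or return the Cancelled from the enclosing function.
macro await(ex)
    quote
        x = $(esc(ex))
        x isa Cancelled && return x
        something(x)
    end
end

function consume_one(c)
    y = @await maybe_take!(c)
    return y + 1
end

c = Channel{Int}(1)
put!(c, 41)
r1 = consume_one(c)   # value was available, so it is unwrapped
close(c)
r2 = consume_one(c)   # channel is closed and empty, so Cancelled propagates
```

Because @await expands to an early return, it is only usable inside a function body, and every function using it becomes one that may itself return Cancelled.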

tro3 · 2020-12-04T00:33:43Z

Is this issue on the drawing board for 1.6 or further in the future? I'd be happy to help here over the holidays, if needed.

StefanKarpinski · 2020-12-04T19:42:34Z

The 1.6 release is branching right about now so no more features. If you want to work on this for 1.7 though, that would be great.


tro3 · 2020-12-16T18:52:07Z

@StefanKarpinski - so I have a fix for this (way simplified from the above), based on parallelizing sync_end. It does not address the larger concurrency issues, though. It just prevents the task-order-based lockup. I've tested against the full suites in the 1.5-release and 1.6-release branches.

Would it make more sense to issue a PR into Experimental, or directly into task.jl? Or is this considered too delicate a PR for a non-core developer (I won't take offense)?

StefanKarpinski · 2020-12-18T14:36:51Z

I'm not sure about that. @vtjnash or @Keno would have to offer an opinion on the technical aspect.

tro3 · 2020-12-25T15:50:59Z

So after wandering down a few different paths and getting feedback on #38916 and #38992, I came to the conclusion that the solution to this problem was (an inferior implementation of) what is already in Experimental.sync_end. So I'm just going to point to that in my startup.jl until it moves to Base, which IMO it should do. Happy to do the PR for this and add a lockup test suite if needed.
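For reference, a small sketch of what using the experimental variant looks like, assuming a Julia version that ships Base.Experimental.@sync (an experimental interface that may change or be removed). It is the example from the original report with only the macro swapped:

```julia
c = Channel(0)

err = try
    Base.Experimental.@sync begin
        @async put!(c, nothing)
        @async begin
            this_is_not_defined_and_will_error()
            take!(c)
        end
    end
    nothing
catch e
    e   # the first failure is rethrown instead of hanging
end

println(typeof(err))
```

Note that the task blocked in put! is left as it is; cleaning up still-running tasks after the error is up to the caller.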

User-764Q · 2021-10-12T05:10:49Z

Hi this is a bit over my head, but I tried it both ways round today on Julia 1.6.2 and it errored both times.

julia> @sync begin
           @async begin
               take!(c)
           end
           @async begin
               put!(c, nothing)
           end
       end

ERROR: TaskFailedException

    nested task error: UndefVarError: c not defined
    Stacktrace:
     [1] macro expansion
       @ ./REPL[1]:3 [inlined]
     [2] (::var"#5#7")()
       @ Main ./task.jl:411

...and 1 more exception.

Stacktrace:
 [1] sync_end(c::Channel{Any})
   @ Base ./task.jl:369
 [2] top-level scope
   @ task.jl:388

julia> @sync begin
           @async begin
               put!(c, nothing)
           end
           @async begin
               take!(c)
           end
       end

ERROR: TaskFailedException

    nested task error: UndefVarError: c not defined
    Stacktrace:
     [1] macro expansion
       @ ./REPL[1]:3 [inlined]
     [2] (::var"#1#3")()
       @ Main ./task.jl:411

...and 1 more exception.

Stacktrace:
 [1] sync_end(c::Channel{Any})
   @ Base ./task.jl:369
 [2] top-level scope
   @ task.jl:388

vtjnash · 2024-02-03T03:25:21Z

Seems this is now Experimental.sync_end, which we should move to Future.sync_end, and eventually Base.sync_end

JeffBezanson added the error handling Handling of exceptions by Julia or the user label Jul 30, 2019

vchuravy mentioned this issue Aug 9, 2019

add TaskFailedException to propagate backtrace of failed task in wait #32814

Merged

tkf mentioned this issue Aug 12, 2019

API Request : Interrupt and terminate a task #6283

Open

tkf mentioned this issue Jan 5, 2020

Bug or misuse? Exceptions not propagating through multithreaded channels #34262

Closed

tro3 mentioned this issue Dec 4, 2020

qmap hangs forever when using do notation tro3/ThreadPools.jl#14

Closed

tro3 mentioned this issue Dec 10, 2020

Taking Structured Concurrency Seriously #33248

Open

tro3 pushed a commit to tro3/julia that referenced this issue Dec 12, 2020

Closes JuliaLang#32677. Improves parallelization in sync_end to guarantee exceptions propagate upward

d19fd7a

tro3 mentioned this issue Dec 17, 2020

WIP: Fixing the @sync/@async disappearing exception issue #38916

Closed

tro3 mentioned this issue Dec 26, 2020

RFC: Parallelize sync_end to remove async hang mechanism #39007

Open

jonas-schulze mentioned this issue Nov 22, 2021

[Distributed] Make worker state variable threadsafe #42239

Closed

Seelengrab mentioned this issue May 31, 2023

signal handling: User-defined interrupt handlers #49541

Open

7 tasks

vtjnash closed this as completed Feb 3, 2024
