Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove AggregateException wrapping in Async CE for extension and prevent threadpool transition #129

Open
bartelink opened this issue Dec 13, 2022 · 10 comments
Labels
bug Something isn't working enhancement New feature or request topic: ce extensions Extensions to other CE's like task or async
Milestone

Comments

@bartelink
Copy link
Member

bartelink commented Dec 13, 2022

At present the current impl of for within async { expressions raises two concerns for me:

  • AwaitTaskCorrect semantics would be preferable to promulgating usage of Async.AwaitTask in a place that most people will not necessarily even infer that it's in play.
  • AIUI Async.StartAsTask includes an unnecessary transition to the Thread Pool (with a potential context switch?) which could instead safely be replaced with Async.StartImmediateAsTask in this context. EDIT: moved to separate thread #135
  • the starting of the child Async does not propagate the continuation token (not sure whether fixing that is possible/required) EDIT: separate issue, see #133
@bartelink
Copy link
Member Author

Fringe ramblings which you may consider related: I tend to have an Async.startImmediateAsTask helper that I try to route the starting of child Tasks through within async {s :-

module private Async =

    let startImmediateAsTask ct computation = Async.StartImmediateAsTask(computation, cancellationToken = ct)

Aside: the term Immediate never really spoke to me; I tended to quickly discount it from consideration. I assume that naming is probably intentional. While I doubt aliasing an established name is likely to pass any scrutiny, perhaps naming such a helper something like Async.executeTask might express the intention better? (I'd call it Async.runAsTask, but for me the use of Run connotes Task.Run, i.e. implies an invocation via the Thread Pool)

Related: Potentially would advocate for inclusion of any startImmediateAsTask helper in a shims lib alongside #128

@abelbraaksma
Copy link
Member

abelbraaksma commented Dec 14, 2022

Yes, before I wrote this extension, I wasn't aware of the issues with Async.AwaitTask. This can clearly be improved. I'll probably need a test that showcases how this goes off the rails, to ensure the implementation of AwaitTaskCorrect (by whatever other name) is, well, correct.

As with #128, we should consider placing these in a separate library. Though for the sake of this one being a no-dependency library by choice, for now we'll just implement it here (and/or copy or link it from whatever other library we conceive).

@abelbraaksma abelbraaksma changed the title 'Fix' Async for semantics Remove AggregateException wrapping in Async CE for extension and prevent threadpool transition Dec 14, 2022
@abelbraaksma abelbraaksma added bug Something isn't working enhancement New feature or request topic: ce extensions Extensions to other CE's like task or async labels Dec 14, 2022
@bartelink
Copy link
Member Author

This can clearly be improved. I'll probably need a test that showcases how this goes off the rails, to ensure the implementation of AwaitTaskCorrect (by whatever other name) is, well, correct.

Yes, the honoring of Async.CancellationToken aspect to terminate the wait is notably subtle to me and makes covering it well worthwhile.
But by the time you've done that, the canonical impl will likely see some change (and I bet the perf can be improved)

Though for the sake of this one being a no-dependency library by choice

I'm in a similar quandary with Equinox - the core lib is tiny and making it depend on another lib only for AwaitTaskCorrect feels like jumping the shark and going leftpadesque. There are 5 other libs that need both TaskSeq and AwaitTaskCorrect. But the bottom line is that having a separate lib, no matter how small, is simply the right thing to do. UNLESS/until it goes into FSharp.Core.... (see conclusion below)

FsKafka is another case where the lib is a 400 line slab that has a stupid other file that dilutes that simplicity and directness thanks to AwaitTaskCorrect

FSharp.AWS.DynamoDB has a copy too, and I really would like for the world to have a way of knowing that a given exe has only one set of AwaitTask semantics in play when reasoning about stack traces and the inevitably hairy cancellation and/or hang issues (see dotnet/fsharp#13165 - a large part of whittling that down was doing stupid busywork diffing to rule that sort of stuff out). (for that lib, the larger issue is that it should probably present a Task layer, and then layer its current Async API over that - atm Equinox is having to do a redundant layer of Task over Async which then immediately drops to Task in the underlying impl within FSAWSDDB)

Having said all that, I respect the value of not having a package dependency and will likely distribute a copy of the final AwaitTaskCorrect into FSharo.AWS.DynamoDB, FsKafka, Equinox and Propulsion if that's what TaskSeq does (and will argue against making TaskSeq's version public to avoid people havign to self-discipline to not take a depedency on TaskSeq only for AwaitTaskCorrect). But I think I am saying I'm voting for *.ignore and Async.toTask to go in a single nuget until they can go into FSharp.Core where they should really be for everyone's sake.

@bartelink
Copy link
Member Author

bartelink commented Dec 14, 2022

Related: Potentially would advocate for inclusion of any startImmediateAsTask helper in a shims lib alongside #128
Addendum to this re fsprojects/FSharp.Control.AsyncSeq#74 (comment)

  • An (Async|Task).parallelThrottled would go well with Async.runAsTask and/or (Async|Task|ValueTask).ignore (forcing specification of a cancellationtoken and a degree of parallelism to be considered when going parallel). At present I have some ugly cases where I am abusing Async.Parallel within code that's Task-based. It might be a stretch for them to go into FSharp.Core for reasons of redundancy as per Don's comments, but having a clear surface area that makes one consider cancellation in all cases, and degreeOfParallelism in cases of going parallel would help in general IMO
  • are iterParallel(Async)? and mapParallel(Async)? missing from the roadmap?
  • could consider the notion of a generic Async.throttle as a way to constrain parallelism vs having a degreeOfParallism aspect, though a) that's likely less optimal and b) arguably that is less likely to have users end up in good designs
  • as it stands I will cart likely cart around a mapAsyncParallelThrottled helper as per Parallel sequence runs consequentially (non-parallel) FSharp.Control.AsyncSeq#74 (comment) until such time as either that becomes directly available, or there's a mapAsyncParallel that I then constrain via having an Async.Throttle and/or Task-awaitable lightweight semaphore wrapper
  • wrt all these helpers, having AwaitTaskCorrect/Async.toTask in FSharp.Core (perhaps as part of the MS 7.0 push? https://twitter.com/rbartelink/status/1597511316176252928) would allow these smaller helpers/API surface area considerations to be considered more clearly than having the need for that to be resolved be lurking in the background of all these fringe issues.

@abelbraaksma
Copy link
Member

abelbraaksma commented Dec 14, 2022

Thanks for all the input, some work is cut out for me :).

But I think I am saying I'm voting for *.ignore and Async.toTask to go in a single nuget until they can go into FSharp.Core where they should really be for everyone's sake.

Alternatively: we can take a source dependency instead. That way it can be internal.

are iterParallel(Async)? and mapParallel(Async)? missing from the roadmap?

All of AsyncSeq is potentially on the roadmap, but some design decisions there I don’t want to repeat. My first goal is to complete the ‘low hanging fruit’ roadmap, based on Seq‘s surface area.

This gives us task-like behaviour. Like F# task, it doesn’t support throttling or parallelism, but unlike task there’s more of a use case to support it explicitly. However, tasks are hot started, which makes this a different kind of challenge (since TaskSeq has both task-like and seq-like behaviour, it’s a blend between the two essentially).

See also #77 for the differences.

@bartelink
Copy link
Member Author

Ah, I missed the mention of the fact that parallelism is off the table for now in https://github.com/fsprojects/FSharp.Control.TaskSeq#further-reading-iasyncenumerable - focusing on getting stuff mapped out in the core feature set absolutely makes sense, especially given the non-trivial nature of the ancillary aspects as highlighted by this very issue.

@abelbraaksma
Copy link
Member

abelbraaksma commented Dec 19, 2022

Just FYI: the StartAsTask and StartImmediateAsTask points that you brought up here escaped my attention somehow. They're now separately under the umbrella of #135.

This thread can then solely focus on the issues around AggregateException and AwaitTaskCorrect (needs a better name).

@bartelink
Copy link
Member Author

bartelink commented Dec 19, 2022

👍 Apologies for mixing the concerns in the first instance. Putting this here as this thread has already run amok; not sure if it's really TaskSeq business but it would be good to see it result in either an agreed set of helpers, or a summary document somewhere. Some of this is covered in #128, but I'd like to explore the surface area a tiny bit more here first before this reverts to being a simple issue about the for impl being corrected...

TL;DR there needs to be a canonical set of helpers somewhere that emphasize:

  • not being painful to use with piping
  • with good names

TaskSeq currently
a) uses/should use the bulk of these
b) exposes some of the others

Below the fold, I have itemised the APIs I believe should be considered for wrapping.

My hope is that we can:
a) converge on what APIs go into the TaskEx signature set so it can go into a fresh issue
b) Close #128 as it is currently confusing about whether it is trying to address ignore, awaitTask, or the whole lib, pointing to ☝️
c) Close this in favour of a fresh issue that links to AwaitTaskCorrect and #135


ignore

Async.Ignore has always been ugly and undiscoverable. While I tend to do let! _ = <async stuff I want to ignore result of>, it's commonly the last expression in a function, and having to do let! _ = <thing I'm wrapping> in () is too much.

=> As appears to the be direction taken in this PR, any given construct should have a correct ignore impl, that corrects observes exceptions etc

module Async =
    let inline ignore (a : Async<'t>) = Async.Ignore
module Task =
    let inline ignore (t : Task<'t>) = ...
module ValueTask =
    let inline ignore (t : ValueTask<'t>) = ...

StartAsTask / Start

  • Start normally doesn't confuse people - It's clear it's in the thread pool, and cancellation can normally be swept under the rug.
  • StartAsTask has signature computation * ?taskCreationOptions *. ?cancellationToken
    • passing a cancellation token requires Async.StartAsTask (computation, ?cancellationToken = ct)
    • the name does not scream thread hop/thread pool
    • passing a cancellation token should not be glossed over, ever; Async being replaced with Task everywhere does not change the importance of treating the correct propagation of cancellation token as a first class concern that should always be on the table

Suggested API:

module Async =
    let inline runAsTask ct (a : Async<'t>) = Async.StartAsTask(a, cancellationToken = ct)

StartImmediateAsTask

As noted, this should be the default way in which Asyncs are started.

  • Because it is not starting a thread, it does not have a taskCreationOptions arg so it happens one can use it directly for piping.
  • The immmediate bit does not align with anything I'm aware of, although that also means it's meaning, once understood, is not ambiguous. I'd be suggesting execute as a verb to imply "do it here and now" alongside run as a better name igf the precedent had not been established.

My best suggestion is thus to have:

module Async =
    let inline startImmediateAsTask ct (a : Async<'t>) = Async.StartImmediateAsTask(a, cancellationToken = ct)
   // ALTERNATELY, esp if runAsTask name survives:
    let inline executeAsTask ct (a : Async<'t>) = Async.StartImmediateAsTask(a, cancellationToken = ct)

AwaitTask

The default impl is baked and wrong, but adding an overload is both desirable and questionable

  • this lib has Async.toTask and Task.ofAsync
module Async =
    let inline ofTask (t : Task<'t>) : Async<'t> = AwaitTaskCorrect t
module Task =
    let inline toAsync (t : Task<'t>) : Async<'t> = Async.ofTask t

Async.Parallel

The degree of parallelism parameter was added late in the game, but is critical - adding an arbitrary number of items to the threadpool should be a very questionable act.

module Async =
    let parallelThrottled dop computations =
        Async.Parallel(computations, maxDegreeOfParallelism = dop)

Further puzzlers wrt Parallel (which probably mean this should be excluded from the surface area of any set of helpers):

  • A common case is to use this to run but await failure of multiple Async<unit> tasks, having to do an Async.Ignore<unit[]> is ugly for that
  • How do you swap back/forth from that to Task, considering cancellation tokens and unwrapping AggregateException

@abelbraaksma
Copy link
Member

abelbraaksma commented Dec 19, 2022

Maybe you can copy that whole text as a new issue? It feels like it actually belongs in #128, no? Basically the whole idea of agreeing on a surface area for Task/Async, likely in its own repo and package.

Maybe I should just go set one up and we can continue there?

@bartelink
Copy link
Member Author

Forgot I own #128 🤡, that does make kinda sense. My intent was to transplant all this into a fresh issue.

But I did want for us to have a brief pre-discussion about the outline surface area before spreading it widely (and re-atting Don and TheAngryByrd etc)

I appreciate you're probably trying to avoid jumping straight back into this after expending lots of time and effort only yesterday, but making sure that we've both happy with the first cut before I create it afresh would be ideal, even if that means holding off for some days (it might also make sense to defer this until we (aka you) have validated that an impl of TaskSeq using AwaitTaskCorrect + StartImmediateAsTask internally actually works correctly; it might also yield some minor additional insights as to what the surface area needs to be)....

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request topic: ce extensions Extensions to other CE's like task or async
Projects
None yet
Development

No branches or pull requests

2 participants