Error and cancellation handling #729
-
IMHO, I agree with Dimitri in saying that cancellation can be implemented using exceptions. Disclaimer: I was not close to the P1677 discussions, and I don't have in-depth knowledge on how exceptions work in Val. I would try to lay down some of my intuitions and arguments based on that; I haven't done the exercise to make this too formal. I think at the core of the dispute, there are a few types of arguments:
For all of these, it's important to notice that there are fundamental differences between C++ and Val, so arguments cannot simply be transferred from one language to the other.
1. What cancellation means
In P1677 and S/R framework, we accept as "real" scenarios the following:
We insist that "cancelled" is as real as "completion with success" because we encounter it in practice so often. But, by the same argument, “partial success” should also be a “real scenario”. For example, while trying to read 10 MB from the network we only get 1 MB of data; this can happen because the network was disconnected, or because the user cancelled the download. For this example, it is probably useful to keep track of the downloaded data, so this case cannot simply be mapped to error or cancellation completions. We can frequently find examples like this, so should we create new types of completions? We can most likely imagine other real scenarios that would make us believe that we need to add other completion types. Maybe we want to distinguish between upstream and downstream cancellation, or whatever…
I do believe that a good way to break completion into categories is to look at pre- and post-conditions. A success completion signal would guarantee that the post-conditions of the operation are met. All the other scenarios would not guarantee this. For example, for a stack-push operation, the post-condition is that the stack is not empty. If we have an error during the push operation, or if we cancel this operation, the post-condition cannot always be met. Thus, from a semantic perspective, it makes sense to divide the types of completions into two parts:
2. Composability
Looking at P1677, we see examples of how exceptions may not compose. But I would argue that these cases are due to limitations of the C++ ecosystem, and to how we are trying to build S/R on top of it. I would argue that good error hygiene avoids the composability problems. Let’s look at an example of how one might write concurrent code in Val. We want to start a new series of computations (
I am assuming that all these computations are magically using a shared stop token (not very Val-friendly, but possible). Now, without knowing how errors and cancellation are handled for this computation piece, we can see that this composes really well, just like it would compose in P2300. Please note that composing computations is a monadic operation. That is true when we write S/R code with respect to errors and cancellation. But that is also true when we write code that may throw exceptions (that is: 90% or more C++ code uses monads for composition). The thing that is important in the above example is that errors and cancellation compose the exact same way: in a monadic fashion. Thus, from a composability point of view, it makes sense to use the same mechanism for handling cancellation as we use for handling errors.
3. Usability and significance for S/R
S/R framework can complete a computation in 3 ways:
This can be transformed, without losing generality, into something like:
From this perspective, one can argue that it is not relevant how we pack the states together, as long as we can convey the same information. We can put the cancellation and error cases together without losing anything from the computation. The only difference is how we actually use this result. In C++, the usability may be a bit cumbersome, but that is because of C++ design choices. In Val, it turns out that expressing cancellation as yet another type of error is not a usability problem (as far as I understood it; Dimitri and Dave can correct me). Moreover, if we simplify a computation to only produce a value type, then all the computations would have a shape of
This is actually the main selling point for me to make Val combine cancellation with exceptions.
4. Exceptions assumptions
Coming from a C++ world, we typically make the following assumptions for exceptions:
And here, the second point mostly derives from the first point. Other than that, it’s just an arbitrary convention. My understanding is that Val fixes the performance issue. Thus, it may be acceptable for Val to say that exceptions can be used more often in cases of cancellation.
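To make the "packing" argument concrete, here is a minimal C++ sketch. It assumes a three-way completion (value, error, stopped) and shows one way to fold it into a two-way shape and back without losing information. All type and function names (`Completion3`, `Completion2`, `Failure`, `pack`, `unpack`, `postcondition_met`) are hypothetical and are not taken from P2300 or Val.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Three-way completion in the style of S/R: value, error, or stopped.
struct Value   { int result; };
struct Error   { std::string reason; };
struct Stopped {};
using Completion3 = std::variant<Value, Error, Stopped>;

// Two-way completion: the post-condition holds (Value) or it does not.
// Hypothetical type: cancellation is stored as one more kind of failure.
struct Failure {
    bool cancelled;      // true when the operation was stopped on request
    std::string reason;  // left empty for cancellation
};
using Completion2 = std::variant<Value, Failure>;

// Fold three cases into two without dropping information.
Completion2 pack(const Completion3& c) {
    if (const auto* v = std::get_if<Value>(&c)) return *v;
    if (const auto* e = std::get_if<Error>(&c)) return Failure{false, e->reason};
    return Failure{true, {}};
}

// Inverse mapping, showing that pack() is lossless.
Completion3 unpack(const Completion2& c) {
    if (const auto* v = std::get_if<Value>(&c)) return *v;
    const auto& f = std::get<Failure>(c);
    if (f.cancelled) return Stopped{};
    return Error{f.reason};
}

// Only a Value guarantees the operation's post-condition.
bool postcondition_met(const Completion2& c) {
    return std::holds_alternative<Value>(c);
}
```

Whether the two-way shape is pleasant to consume is a separate question from whether it is expressive enough; the sketch only addresses the latter.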
-
Up to now I have been on my phone stealing time from other work. This reduces expressiveness and leads to even more misunderstanding. I am not sure where to start:
I have problems getting what is in my head into a form that fits into someone else's head. I have had a few coworkers that had the patience to iterate with me until they were able to say back to me in their own words something that matched what was in my head. Often the process improves what is in my head and develops better ways to express the ideas. I revere these people. I think that the most important thing to add is that others don't always agree with what is in my head - and once I know that they disagree with what is actually in my head - I am content. I have no interest in persuasion or manipulation. I just want to share my perspective accurately. Any preference?
Yes, others are much better at writing papers than I am. 🤷 I am not happy about this. I put a lot of effort into communication in all forms.
-
Attempt 1
Discover concepts by studying Data Structures and Algorithms
Stepanov, A., 2006, Notes On Programming, Lecture 13. Iterators (pdf)
auto populateMovies(auto maxTime, auto serverCA, auto serverUS, auto filter, auto tileContainer) {
    return then(
        timeout(when_all(
            when_any(retry(requestMovies(filter, serverCA)), retry(requestMovies(filter, serverUS))),
            when_any(retry(requestThumbnails(filter, serverCA)), retry(requestThumbnails(filter, serverUS)))), maxTime),
        [tileContainer](auto movies, auto thumbnails){
            for (auto movie : movies) {
                tileContainer.add(make_movie_tile(movie, thumbnails));
            }
        });
}
Some issues with cancellation as an error
-
This is a good start.
To me, none of the issues seem decisive (admittedly I don't understand the “failure for the application” issue). It seems like you might have discovered some kind of truth within the constraints of C++ as practiced today, but I'm not even sure of that. I certainly don't see anything determinative for a new language, yet. Am I missing something?
-
First, I don't believe that every function needs to cooperate, unless you count being unconcerned with exceptions as “cooperating.” Any function that doesn't break invariants needs no catch blocks (or other cleanup mechanisms for that matter) and can just let exceptions pass through. That's the vast majority of code, so in fact most functions don't need to cooperate in any meaningful way. Second, the kind of cooperation that is required is—unless I'm missing something—exactly the same kind of cooperation required for correct error handling. So AFAICT, this is not some new, unique problem.
I'm not gonna do that. Photoshop alone is > 30M lines. @sean-parent should feel free to correct me, but IIUC, Adobe application code is full of (user-initiated) cooperative cancellation that is propagated via an exception. That's certainly the way I did user cancellation when I was writing my own desktop application. Worked a treat.
You mean specifically these async compositional primitives. Sure, there are a few fundamental ones, and if you invent a new async combinator, I'd expect you to have special cancellation handling there, too. Seems fine to make those do a little work to check the reasons any sub-task "failed."
As with any other unwinding process. In fact, everything in the whole program must be written to not leave any violated invariants. That's sort of the meaning of “invariant.” Aside from needing special attention in async combinators and not (usually) requiring any direct reporting to the application's user, I still don't see anything fundamental that distinguishes cancellation from other kinds of unwinding. 🤷♂️
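A minimal C++ sketch of the pattern described above, under stated assumptions: cancellation is thrown as an ordinary exception, the intermediate function has no catch block because it never leaves an invariant broken, and only the top-level command distinguishes cancellation (nothing to report) from failure. The names `Cancelled`, `Document`, `apply_filter`, `filter_document`, and `run_command` are hypothetical and not taken from any real application.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Hypothetical cancellation signal, thrown like any other exception.
struct Cancelled {};

struct Document {
    std::vector<int> pixels;
};

// Leaf operation: polls a user-controlled flag before each element.
void apply_filter(std::vector<int>& scratch, const bool& cancel_requested) {
    for (int& p : scratch) {
        if (cancel_requested) throw Cancelled{};
        p *= 2;
    }
}

// Intermediate function: no catch block. It works on a copy and commits
// with a non-throwing swap, so unwinding leaves the document untouched.
void filter_document(Document& doc, const bool& cancel_requested) {
    std::vector<int> scratch = doc.pixels;
    apply_filter(scratch, cancel_requested);
    doc.pixels.swap(scratch);
}

enum class Outcome { done, cancelled, failed };

// Top level: the only place that tells cancellation apart from failure.
Outcome run_command(Document& doc, const bool& cancel_requested) {
    try {
        filter_document(doc, cancel_requested);
        return Outcome::done;
    } catch (const Cancelled&) {
        return Outcome::cancelled;  // usually nothing to report to the user
    } catch (const std::exception&) {
        return Outcome::failed;     // would be reported to the user
    }
}
```

The copy-and-swap in `filter_document` is one design choice among several; a scope guard or transactional store would serve the same purpose.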
-
To me, the issue is pretty simple. Kirk already said it all, but I'll reiterate/summarize: The fundamental difference is:
Everything flows from there. "ERROR" feels failure-ish to me and "STOPPED" doesn't. The algorithms we've written very frequently want to handle these differently. If I use the rule of thumb in C++ to only use exceptions for unexpected and infrequently-occurring conditions, then "ERROR" is an exception and "STOPPED" isn't. Why? Because if I'm waiting for a result, I expect to get it ... unless I tell it to stop, in which case I expect it to stop. Stop requests are normal operating procedure for some very common generic algorithms like I know there are some APIs that can't "fail" ever (i.e., won't do what I ask it to do), but I still want them to be stoppable. And I want that distinction surfaced in the type system and the programming model. "Will-always-do-what-you-ask-me-to-do" is a nice thing to know about a component. |
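One way to picture "can't fail, but still stoppable" surfacing in a signature is the following C++ sketch. It is only an illustration under assumptions: `sum_all` is a hypothetical function, `std::optional` stands in for a dedicated "stopped" channel, and an atomic flag stands in for a stop token.

```cpp
#include <atomic>
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical operation that never fails but can be asked to stop.
// noexcept says "no error channel"; the optional says "value or stopped".
std::optional<long> sum_all(const std::vector<int>& xs,
                            const std::atomic<bool>& stop_requested) noexcept {
    long total = 0;
    for (int x : xs) {
        if (stop_requested.load()) return std::nullopt;  // stopped, not failed
        total += x;
    }
    return total;
}
```

A caller reading this signature can tell that the only reason for not getting a sum is that somebody asked the operation to stop.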
-
It is unfortunate that the discussion seems to have stalled. I think it is a great topic and I would hate to miss an opportunity to contribute important insights to language design as a discipline. Perhaps stepping back from technical details might help us get back on track. After some time trying to let the different arguments simmer in my mind, I think the main contention is a philosophical question. When @ericniebler says that some things feel "failure-ish" and some don't, I get the sense that "failure-ish" is a personal appreciation. To give an example, in this comment, @kirkshoop said:
I answered:
I think both points of view are valid, depending on one's definition of "failure-ish". My penchant for rigorous formal definitions causes me to typically dislike "-ish" qualifiers. Revisiting the entire discussion so far, I think we still haven't found clear formal criteria. That does not necessarily mean there can't be a useful distinction between specific instances in specific applications; it only means we're lacking a general-purpose and unambiguous definition. When confronted with this kind of situation, a reasonable approach might be to identify the characteristics of the different concepts and unify them under one abstraction. In our case, I've been using the term "error" to denote this abstraction, but for the sake of the discussion, let's use a different one: "unconventional result". Let's also put aside any assumption on handling mechanisms, type system representations, or any other kind of technical detail for the moment. Using that new term, let's see if we can agree on the following statements:
Crucially, notice that I am not saying that cancellations are errors and that I'm not arguing against the fact that cancellations should be expected. Checkpoint: do we agree so far? Now, I posit that there are likely more than two categories of unconventional results. If that assumption is correct, then it is likely simpler to handle all of them with a single general-purpose mechanism. To give an example, I wrote a parser combinator library where each composable parser may produce a "hard failure" or a "soft failure". A hard failure occurs when a parser failed to consume a suffix after having consumed a prefix. It typically propagates far, and requires cleanup to restore the character stream. A soft failure occurs when a parser failed to recognize a prefix and is typically handled close to its origin. I also posit that it is likely some unconventional results fall under multiple categories. If that assumption is correct, a strict tree-shaped hierarchy would be difficult to define. It would be best to assign an unconventional result to one or more categories using an attribute system (e.g., traits). Checkpoint: do we agree so far? Until we can find a set of criteria that let us take any unconventional result and determine whether it falls under a single rigorously defined general-purpose category, I posit that a general-purpose language should not get different features to interact with them. To give an example, I think that neither cancellations nor soft failures fit the requirement. The former do not have a rigorous definition (AFAICT) and the latter are not general-purpose. Finally, I note that if I were concerned about the compiler pessimizing the handling of unconventional results that are likely to occur, one reasonable solution might be to provide the compiler with a hint in the form of an annotation. |
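The attribute idea can be sketched in C++ as follows. This is a rough illustration, not a proposal: the category names, the `UnconventionalResult` type, and the category assignments for cancellation and for the parser's hard and soft failures are all assumptions made for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical categories; one result may belong to several at once,
// which a strict tree-shaped hierarchy could not express.
enum Category : unsigned {
    anticipated     = 1u << 0,  // part of normal operation
    needs_cleanup   = 1u << 1,  // state must be restored while unwinding
    propagates_far  = 1u << 2,  // usually handled far from its origin
    handled_locally = 1u << 3,  // usually handled close to its origin
};

struct UnconventionalResult {
    std::string description;
    unsigned categories;
    // True when the result belongs to every category in the mask.
    bool is(unsigned mask) const { return (categories & mask) == mask; }
};

// Assumed category assignments, for illustration only.
UnconventionalResult cancellation() { return {"stop requested", anticipated | propagates_far}; }
UnconventionalResult soft_failure() { return {"prefix not recognized", anticipated | handled_locally}; }
UnconventionalResult hard_failure() { return {"suffix not consumed", needs_cleanup | propagates_far}; }

// A single general-purpose mechanism: handlers select results by predicate.
template <class Predicate>
bool should_handle(const UnconventionalResult& r, Predicate p) { return p(r); }
```

With this shape, a handler that only cares about results needing cleanup can ask for `needs_cleanup` without knowing whether the result is a cancellation, a parser failure, or something else.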
-
This is a very long thread with some good observations that I don't have time to fully respond to at the moment, but I did want to make an observation. I believe (and may be wrong) that part of Kirk's argument is that the function result is not just propagating "was canceled" but also "cancel was requested". That is how I read that x can satisfy the post conditions for U. If the post conditions for U are satisfied, the operation is complete and cannot be canceled or return an error, but it may (in some systems) report that it completed successfully and also received a request to cancel.
If the above is correct then we need to discuss how requests to cancel are signaled and if that can be separated from handling the cancelation request. I believe it can be. Part of what makes cancelation different is that it is an out of band signal.
Once a request is received, I can't see any difference in how it should be handled compared to an error - which is different than how success is handled.
Is there a "basic canceled guarantee" that is different from the basic exception guarantee? I don't believe so.
For when_any, the when_any may request cancelation of the dependent tasks once it receives a non-error value. The operation performing the when_any may receive a cancelation request, which is forwarded to all dependent operations of the when_any. Any operation being performed by the when_any may receive a cancelation request from any external agent or may return an error. I don't see a reason that when_any would handle a canceled operation any differently than a failed operation.
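The behaviour described for when_any can be approximated by a small sequential C++ stand-in. This is a sketch under heavy assumptions: tasks run one after another rather than concurrently, the stop request is a plain `bool`, and `Cancelled`, `Task`, `AnyResult`, and this `when_any` are hypothetical names, not the P2300 algorithm.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <stdexcept>
#include <vector>

struct Cancelled {};  // hypothetical cancellation signal

// A child task returns a value, throws an error, or throws Cancelled
// after observing the stop flag.
using Task = std::function<int(const bool& stop_requested)>;

struct AnyResult {
    std::optional<int> value;  // first non-error value, if any
    int not_completed = 0;     // children that failed or were cancelled
};

// Sequential stand-in for when_any: forwards an external stop request,
// requests stop of the remaining children once a value arrives, and
// treats a cancelled child exactly like a failed one.
AnyResult when_any(const std::vector<Task>& tasks, const bool& external_stop) {
    bool stop = external_stop;
    AnyResult result;
    for (const Task& task : tasks) {
        try {
            int v = task(stop);
            if (!result.value) {
                result.value = v;
                stop = true;  // the goal is met; ask the others to stop
            }
        } catch (...) {
            ++result.not_completed;  // same path for Cancelled and errors
        }
    }
    return result;
}
```

The single `catch (...)` is the point of the sketch: inside the combinator, nothing depends on why a child did not produce a value.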
I think there is a very large conversation to be had about how requests for cancelation are signaled.
…________________________________
From: Kirk Shoop ***@***.***>
Sent: Saturday, October 1, 2022, 1:45 PM
To: val-lang/val-lang.github.io ***@***.***>
Cc: Sean Parent ***@***.***>; Mention ***@***.***>
Subject: Re: [val-lang/val-lang.github.io] Error and cancellation handling (Discussion val-lang/val-lang.github.io#33)
I am personally comfortable with the approach I suggested and do not think it is violating any fundamental principle.
Interesting. I think I understand your logic.
Here is the same logic applied to U.
"I think it is expressive enough to implement the use cases [of set_value] that have been shown, based on the fact that I believe it is not only possible but actually easy to create a mechanism that filters any subset of , for any arbitrary predicate.
I do not believe that any value in deserves special treatment, because the correct behavior of a program depends on any value that may be in this set."
And resulting picture is:
struct Receiver {
template<class... Un>
void set_some_word(Un&&...);
};
Here is the same logic applied to "A function that accepts multiple parameters can be represented as a function that accepts a tuple".
"I think it is expressive enough to implement the use cases [of multiple arguments] that have been shown, based on the fact that I believe it is not only possible but actually easy to create a mechanism that filters any subset of , for any arbitrary predicate.
I do not believe that any value in deserves special treatment, because the correct behavior of a program depends on any value that may be in this set."
The result is that functions in this language have exactly one argument value and one result value.
-
Why is that an interesting distinction? Doesn't the former imply the latter?
-
In the P2300 model, there is no distinction between the two cases. The receiver just gets a notification that the sender was cancelled, discarding information on how it was cancelled. There are several main ways in which a receiver gets a signal about cancellation:
In all cases, the receiver just knows that the sender has been cancelled, but it doesn't know how.
-
Because the latter does not imply the former.
…________________________________
From: Dave Abrahams ***@***.***>
Sent: Saturday, October 1, 2022 3:52:50 PM
To: val-lang/val-lang.github.io ***@***.***>
Cc: Sean Parent ***@***.***>; Mention ***@***.***>
Subject: Re: [val-lang/val-lang.github.io] Error and cancellation handling (Discussion val-lang/val-lang.github.io#33)
part of Kirk's argument is that the function result is not just propagating "was canceled" but also "cancel was requested"
Why is that an interesting distinction? Doesn't the former imply the latter?
-
(This is (more or less) an email I sent to Dave. I'm posting it here with a warning: I'm not going to be very present for this conversation.)
Eric Niebler sent me a note saying that you’re considering cancellation in Val, hoping I could warn you away from having Val make the same mistake C++ currently does. Here are my thoughts; feel free to share them.
My position is that errors and cancellation are served by two different design patterns, which I call “failed strategy” and “serendipitous success,” respectively. While both design patterns involve what we may call “exceptional exit” from a function, C++ exceptions are suited to the failed strategy pattern, and not well suited to the serendipitous success pattern.
In the failed strategy pattern, a function determines that it is unable to meet its local goal, and exits (exceptionally) without meeting the goal. This causes a cascade of enclosing functions to also not meet their goals, and exit exceptionally. The cascade stops when it reaches the innermost function that has an alternate strategy for meeting its own goal. (Sometimes we must phrase the goals carefully to see this pattern, using goals with an “or” in them. One function might have “Do X” as its goal, while an enclosing function might have a goal of “Do XYZ or announce a failure to do XYZ.”)
In the serendipitous success pattern, a function exits (exceptionally) when its local goal becomes moot due to the serendipitous success of some enclosing function’s goal. The local goal may still be achievable, but is no longer necessary to the already-successful enclosing function. This causes a cascade of enclosing functions to also exit, generally without meeting their goals. The cascade stops at the outermost function that has already met its goal. (Again, we must phrase the goals carefully to see this pattern, sometimes using goals with an “or” in them. One function might have “Do X” as its goal, while an enclosing function might have a goal of “Do XYZ or receive permission to not do XYZ after all.”)
The principal differences between the two patterns are:
• Reasons for failure relate to the lowest-level goals, while reasons for cancellation stem from high-level goals. This means that communication in these two patterns runs in different directions. Potential reasons for cancellation need to be communicated from high to low before a potential cancellation, and actual reasons for failure need to be communicated from low to high after failure.
• If two enclosing functions provide alternate strategies, the innermost one stops a failure cascade and enacts its alternate strategy. But if two enclosing functions have serendipitously satisfied goals, the cancellation cascade continues to the outermost satisfied function.
The two patterns also have an interaction: when a low-level function has failed and, despite that, a high-level function has serendipitously succeeded, we wish to exit to the successful high-level function, even if alternate strategies for responding to the failure are provided at intermediate levels.
The C++ exception model is designed around the failed strategy pattern: the decision to throw is made at a low level, reasons are communicated upward, and the exception is caught at the innermost handler. But catch(…), which is essential to the failed strategy pattern, gets in the way of cancellation. That is, “this intermediate function has another strategy to try” shouldn’t be taken as “the goals of this intermediate function outweigh those of its callers.”
In the “Cancellation is Serendipitous Success” paper, I laid out a scheme for high-level functions to express goals that may be tested by low-level functions at designated cancellation points. Briefly: a new kind of catch clause would specify an expression; during the execution of the try block, the expression would be evaluated at cancellation points; if the expression evaluates to true, the stack is unwound and execution proceeds with the corresponding handler. (There is no exception object in this sort of exceptional exit.)
I later came to think that I might be repeating a design mistake that went into the existing C++ exceptions: providing a complex mechanism without first providing the simple parts that make up that mechanism. Going back to the drawing board, I considered the possibility of building a less elegant library-based cancellation mechanism. One piece was missing: a destructor-respecting longjmp. I sketched out, but didn't finish, a paper on that idea; I'm including it here. Don't pay too much attention to the syntax; I'm sure a younger language could do better.
(Looking over that draft, I am reminded that in C++ we have a third form of exceptional exit: the destruction of a suspended coroutine. This also must not be blocked by catch(…). I also note that most of the complexity of the paper came from preventing a jump into a suspended coroutine; that could be simplified by saying coroutines can’t yield from inside a try-block. Or a wacky alternative idea: one could design a mechanism for throwing into a suspended coroutine, reactivating it so that it can catch the exception.)
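The goal-testing scheme can be imitated, very roughly, as a C++ library on top of ordinary exceptions. The sketch below is an assumption-laden approximation and not the mechanism from the paper: `Unwind`, `goals`, `cancellation_point`, and `run_until` are hypothetical names, the goal stack is a single global (so it is not thread-safe), and an intermediate catch(…) that does not rethrow would still block the unwind, which is exactly the weakness described above.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical unwind signal naming the scope that should resume.
struct Unwind { std::size_t target_depth; };

// Goals registered by enclosing functions, outermost first.
inline std::vector<std::function<bool()>>& goals() {
    static std::vector<std::function<bool()>> stack;
    return stack;
}

// Called by low-level code at designated cancellation points.
void cancellation_point() {
    auto& g = goals();
    for (std::size_t depth = 0; depth < g.size(); ++depth)
        if (g[depth]()) throw Unwind{depth};  // outermost satisfied goal wins
}

// Runs body with goal_met registered. Returns true if body was abandoned
// because this scope's goal was found satisfied at a cancellation point.
template <class Body>
bool run_until(std::function<bool()> goal_met, Body body) {
    const std::size_t my_depth = goals().size();
    goals().push_back(std::move(goal_met));
    // Pops this scope's goal on every exit path, including unwinding.
    struct Pop {
        std::size_t depth;
        ~Pop() { goals().resize(depth); }
    } pop{my_depth};
    try {
        body();
        return false;
    } catch (const Unwind& u) {
        if (u.target_depth != my_depth) throw;  // an outer goal is satisfied
        return true;
    }
}
```

When both an inner and an outer goal are satisfied, the unwind passes through the inner scope and resumes in the outer one, matching the "outermost satisfied function" rule for serendipitous success.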
-
I am wondering whether there has been some more internal discussion or initial modelling on this topic? I have been closely following this conversation and it was extremely educational for me. Would love to know if there is an update.
-
rust-analyzer (LSP for Rust) has both errors and cancellation, and might be a somewhat interesting case study. Cancellation is needed to abort in-progress type checking and such when the user modifies source files.
To do cancellation, we rely on dynamic semantics. Nothing in the function signature tells that it is cancelable. Similarly, nothing signals at the call site that cancellation is possible. For error handling, we use "errors are values" semantics. Fallible functions return a
This distinction is useful because we want to minimize the extent of code which can fail, and maximize the extent of code which can be cancelled. More or less everything can be cancelled; very few things can raise or need to handle errors. If a function can't be cancelled, you want to make it cancelable because otherwise you increase latency. If a function can fail, you want to find a way to make it infallible (e.g., by pushing IO to the caller), to simplify the code. Practically, "errors are values" requires extra ceremony in higher-order code (e.g., you need
In terms of how these two are implemented, I don't think we care much. What happens is that cancellation is handled by stack unwinding via unwind tables, and errors are using normal returns. It seems plausible that using unwinding as an implementation strategy for value-based error semantics could actually be faster. What helps though is that these use two distinct mechanisms, so code handling errors does not need to worry about accidentally catching cancellation and vice versa.
Which brings me to a third thing! Another case of roughly the same shape we have is panicking. As rust-analyzer is a long-running "desktop" application, we don't really want to outright crash the process if some stupid small feature somewhere has an index out of bounds. So, for bugs like assertion failures, we want to abort the coarse-grained feature, show a "send bug report" dialog to the user, and then move on. Internally, panicking is implemented using the same unwinding mechanism as cancellation (with a difference between the two that panicking captures and resolves a backtrace, while cancellation doesn't). This does create a small problem that the code catching cancellations should explicitly pass through unwinds originating from panics.
All three mechanisms (cancellation, error handling, abandonment) lean heavily on "transactional" semantics of the underlying data store, which doesn't allow data to get into a bad state. There is a thin slice of code which can mess up state; that bit isn't cancelable, can't fail, and an assertion failure there would crash the process.
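The split described above, with errors as values and cancellation as an unwind, can be mirrored in C++ to keep one language across this thread's examples. This is an analogy under assumptions, not rust-analyzer's code: `Cancelled`, `ParseError`, `parse_digit`, `count_digits`, and `handle_request` are hypothetical, and a C++ exception stands in for Rust's unwinding.

```cpp
#include <atomic>
#include <cassert>
#include <string>
#include <variant>

struct Cancelled {};                        // hypothetical unwind-only signal
struct ParseError { std::string message; }; // an error carried as a value

// Fallible: the possibility of failure is visible in the signature.
std::variant<int, ParseError> parse_digit(char c) {
    if (c < '0' || c > '9') return ParseError{"not a digit"};
    return c - '0';
}

// Cancelable: nothing in the signature or at call sites says so.
int count_digits(const std::string& text, const std::atomic<bool>& cancel) {
    int n = 0;
    for (char c : text) {
        if (cancel.load()) throw Cancelled{};
        if (std::holds_alternative<int>(parse_digit(c))) ++n;
    }
    return n;
}

enum class Status { ok, cancelled };

// Boundary: catches cancellation only. Any other exception (the analogue
// of a panic) passes through to a coarser-grained handler.
Status handle_request(const std::string& text,
                      const std::atomic<bool>& cancel, int& out) {
    try {
        out = count_digits(text, cancel);
        return Status::ok;
    } catch (const Cancelled&) {
        return Status::cancelled;
    }
}
```

Because the error path never throws and the cancellation path never returns a value, code that inspects a `ParseError` cannot accidentally swallow a cancellation.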
-
I'm opening this thread to continue the discussion about error and cancellation handling, started by Eric Niebler here: https://github.com/val-lang/val-lang.github.io/discussions/32#discussioncomment-3694357.
The goal is to discuss the optimal design for Val's error and "serendipitous-success" handling in terms of expressiveness and ergonomics.