Two-phase unwinding and its implications #123
Comments
@aheejin This is a really great summary of our discussion and framing of the key questions! Thanks for putting it together! I have only one thing to note: the filter functions do not need to be run on a separate stack; they can be run on the leaf of the stack being examined. I also have something new to add relevant to Question 1. We've been working on stack switching, and we were stuck on how to provide a good way to detach stacks, say in order to attach them to a JS promise so that execution can be continued later. While most of the proposal uses exception-handling events incorporating the changes above, we recently realized that the best way to specify when to detach stacks would be to let code specify a phase-one filter-like function. I won't go into the details (we'll hopefully have a draft up soon), but likely this sort of functionality will need two-phase exception handling. To clarify per Question 3, stack switching does not need two-phase in the current EH proposal; it just seems to need the current EH proposal to be compatible with it. |
Agreed, it's not totally straightforward, but doable. If we decide to spec filter functions, we will also need to think about what happens if the filter function throws another exception which is not caught within the scope of the filter function. |
This part seems pretty core. How should we evaluate whether two-phase unwinding is important for the MVP of exception handling? Do we have new information that leads us to change the previous judgement that it wasn't an essential feature to support? Would a later evolution from the current design to two-phase unwinding be possible if we were OK with having some duplication in the instruction set? |
The new information we have is that there is no clear way to extend the current exception mechanism to allow two-phase unwinding. Another piece of new (to me) information is that there exist languages (e.g. C#) that need to use two-phase unwinding because they make the phases semantically observable. IMO, supporting a niche feature of C# is not important enough that we have to introduce two-phase unwinding into this initial EH proposal, but it might be important enough (in combination with other benefits of two-phase unwinding) that we would want to make this proposal easily extensible to support two-phase unwinding in the future. The alternatives are that either we would never support two-phase unwinding or we would have to introduce a whole new exception handling mechanism to support two-phase unwinding in the future, which would significantly complicate WebAssembly's semantics. |
I personally think the importance of C#'s niche feature is not very significant. We asked Blazor developers and they said the usage of that feature is very rare and they didn't think it warrants a whole redesign. If we go the two-phase unwinding route, I think a more important reason would be that it allows us to preserve the whole stack intact in case an exception is not caught, which will help debugging. What changed in our assessment of two-phase unwinding is a little unclear. I only found one mention of that in this repo here: #49 (comment). It seems that we didn't want to force all implementors to have two-phase unwinding, and wanted to leave it as an implementation choice. It is true that we can implement this in each language's own library. The downsides are that it will not work across multiple languages and that it will be slower (it will not be zero-cost anymore). But that will simplify the spec and allow implementors not to implement the (relatively more complicated) two-phase unwinding. |
This is an interesting feature indeed, but it does not require the full-fledged two-phase design proposed here. Without filter functions (which need to be executed in order to know whether they match) it would be enough to split |
How can we know if it will be caught or not in advance without running filter functions? Do you mean tag checks? But what I meant here by "caught" is that it really be caught in the language level. For example, in this C++ program,
try {
  throw 3;
} catch (float f) {
}
This exception will not be caught in C++. What the filter function for this catch will check is if the current exception is of type Please let me know if you didn't mean this and I'm mistaken. |
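To make the point about language-level matching concrete, here is a minimal sketch (the function name is hypothetical, not from the thread). It wraps the snippet above in an outer handler to show that a thrown int skips a catch (float) clause, because C++ handler matching does not apply arithmetic conversions.

```cpp
#include <cassert>
#include <string>

// Reports which handler, if any, ends up catching `throw 3`.
std::string which_handler_catches_int() {
    try {
        try {
            throw 3;                        // the thrown object has type int
        } catch (float) {
            return "inner catch (float)";   // not reached: int does not match float
        }
    } catch (int) {
        return "outer catch (int)";         // the exception propagates to here
    }
    return "none";
}
```

Calling which_handler_catches_int() yields "outer catch (int)", which is why a tag check alone ("this is a C++ exception") cannot tell whether a given catch clause really catches the exception.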
Note the "without filter functions" in my comment. What I meant was: As long as we don't introduce filter functions, we don't need heavy machinery to figure out if an exception is caught or not. All it needs is splitting But from your comment I learned that even for C++ we would need filter functions, so my comment is pretty pointless I guess. |
What problems do you see with the JavaScript approach of snapshotting the stack when the exception is allocated and/or first thrown, and then also having a feature in DevTools to allow pausing on uncaught or all exceptions? Does this provide insufficient information for debugging? For example, maybe the purpose of this is to make the DevTools catch prediction more accurate, so we preserve not just the stack but the full execution state, in the case where there's a catch block enclosing the throw which does not correspond to this type of exception. If this were the goal, then I could vaguely imagine a scheme for a custom section that could be specifically to help DevTools catch prediction, but not affect the normal runtime semantics. |
@backes It's not the case that filter functions need to be run on a separate stack. Many language runtimes run them on the current stack. That is, semantically they are just like function calls. And implementation-wise, they're similar to function calls, except that the engine finds the code pointer for the call by walking the stack (to find the filter function) and when making the call the engine provides the code pointer with the stack frame that has all the local variables the filter function uses. We will likely need to support this functionality anyways for stack inspection, which is very widely used. For example, stack inspection is used to collect the roots for an implemented-in-linear-memory (rather than host supported) garbage collector, and also to collect the current stack trace in terms of source code (rather than WebAssembly code). In other words, most language runtimes rely on stack inspection, i.e. the first phase, in some way or another. C# is only an outlier in that its semantics visibly relies on stack inspection as well.
Note that the same is true for single-phase exception handling. Putting the above together, without supporting the same functionality that two-phase exception handling would require, most language implementations would likely have to implement all exception handling on their own and not use the exception-handling proposal (except for interop) in order to provide full functionality. |
As you said, we can get the stack trace with JavaScript approach of snapshotting the stack, but we wouldn't be able to inspect the full stack and all memory and locals at each call frame, the functionality most debuggers provide when a program crashes. I don't think this can be done by some auxiliary info in the custom section..? And not very sure what you mean by catch prediction. Could you elaborate? |
Across multiple (all?) browsers, DevTools supports pausing on only exceptions which are uncaught. For example, in Chrome, open up the "sources" panel and click on the stop sign/pause button icon in the top right corner--if you don't check the "Pause on caught exceptions" box, then only exceptions which DevTools thinks are uncaught will lead to a breakpoint. During this pause, you can inspect the full program state. Catch prediction is the term used inside Chrome to refer to the algorithm used to guess whether an exception will be caught, at the point in time that it's thrown. Catch prediction is just a heuristic (e.g., due to some edge cases around returning from finally blocks), but it's a very useful one. One problem, common to this current proposal and JS, is that all catch blocks predict as catching all exceptions. If we could expose some metadata to DevTools so that it could approximate the filters at the time the exception is thrown, then catch prediction could become more accurate without changing how ordinary program execution works. I'd be happy to discuss this further on this thread, in a call, gchat, whatever works for you. But I'm actually not working on Wasm debugging; it'd probably be good to pull in people who are actually working on that to this conversation, if we're making a decision motivated by debugging (not sure who the right contacts are at the moment). |
Thanks for writing this up @aheejin and summarising a longish discussion. I'm still a bit confused, though.
I don't follow where this conclusion is coming from. AFAICS, the problem with the example you give is that it branches out of the handler, presumably without rethrowing to propagate to the next handler. That would be wrong in this context regardless of whether you have an exnref or not -- the problem is the control flow, not whatever the way is the exn value is provided. Can you elaborate on how specifically exnref is relevant to the problem?
As a side note, replacing a block-like catch clause with a branch is something we also did for the continuations/stack switching stuff I presented in February (where it is part of the resume instruction instead of try). That seems simpler than another block. If it was earlier in the lifetime of the current proposal, I would indeed propose to simplify the existing
where But the motivation here seems to avoid the exnref, so I can perhaps say more once I understand what actual problem that solves, see question above.
I believe this could be added as a later refinement to the semantics in the current proposal just fine, for example by adding a filtering variant of the
This works like the existing The existing (The extension could similarly be made to the simplified branching There might be alternatives to this design, e.g., replacing the filter function with another jump label and an additional |
Here's my sense of the problem. In two-phase EH systems, the first phase concludes by determining where(/if) to unwind to. In WebAssembly, this destination seems to best correspond to a label (and there is no event involved). The important implication of this is that unwinding is conducted with a destination in mind. That destination information has to be maintained throughout the unwinding process (which ends either upon reaching the destination or if an unwinder aborts the unwinding process, e.g. by branching out of the unwinding block). This is typically done by treating unwinders on the stack as functions, running each unwinder and then continuing the unwinding process to the predetermined destination once the unwinder concludes. This unwinders-as-functions strategy requires unwinders to have a clear start and end. But the reason
There is no need for multiple stacks. You can just call the filter function on the current stack, giving it the relevant stack-frame pointer. This implementation strategy is capable of more than just filter functions. In fact, it's how Common Lisp implements resumable/restartable exceptions. In order to avoid digging into Common Lisp, I found this blog post by someone exploring/discussing how to add resumable exceptions to the Java language and runtime, in case it's helpful for anyone wanting to understand how one implements two-phase exception handling on a single stack. |
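As a rough illustration of handling on a single stack, the sketch below models resumable conditions in the spirit described above: handlers are ordinary calls made on top of the signaling frame, and a handler may supply a value with which the signaling code resumes. Every name here (handlers, signal_condition, parse_digit, and so on) is an assumption made up for this example. It is a simplification, not Common Lisp's actual condition system and not anything proposed for WebAssembly.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// A handler inspects a condition; it returns true and fills `restart_value`
// if it wants the signaling code to resume with that value.
using Handler = std::function<bool(const std::string&, int&)>;

std::vector<Handler> handlers;   // innermost handler last (hypothetical registry)

// Signals a condition. Handlers run as plain calls above the signaling frame,
// so no frame has been unwound when a handler decides to resume.
int signal_condition(const std::string& what, int fallback) {
    for (auto it = handlers.rbegin(); it != handlers.rend(); ++it) {
        int restart_value = 0;
        if ((*it)(what, restart_value)) return restart_value;   // resume here
    }
    return fallback;   // no handler chose to resume
}

int parse_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    return signal_condition("bad digit", -1);   // ask the handlers what to do
}

int sum_digits(const std::string& text) {
    int total = 0;
    for (char c : text) total += parse_digit(c);   // loop continues after a resume
    return total;
}

int sum_with_zero_for_bad_digits(const std::string& text) {
    handlers.push_back([](const std::string& what, int& restart_value) {
        if (what != "bad digit") return false;
        restart_value = 0;       // resume the signaling code with 0
        return true;
    });
    int result = sum_digits(text);
    handlers.pop_back();
    return result;
}
```

With the handler installed, sum_with_zero_for_bad_digits("1x2") keeps iterating past the bad character; the loop in sum_digits is never torn down, which is the property that distinguishes this from unwinding to a catch.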
The problem is, I branch out without rethrowing in order to rethrow from another
To solve this mismatch, the toolchain does this kind of transformation:
What this does is basically to introduce an inner try-catch, and within its catch body, branch to the right handler code. Here 'handler body' was originally in The problem is, this kind of transformation happens within toolchain, and the user should be unaware of it. Assuming we have two-phase unwinding and a filter function is attached to every If we don't allow |
I don't think that your reasoning or understanding of You seem to think that it is very important that unwinders have to be executed exactly like functions. When I asked why (in our Zoom meeting), your answer was based on your specific assumption about the stack structure in the unwinding process, which I think can vary depending on the implementation. Also I think that toolchain can make sure we don't make While I have not been able to understand your reasons well, I'm not posting this comment to suggest your reasons are not valid. Maybe they can be discussed separately; I'm just saying that what you described in #123 (comment) does not seem to be the same thing as what I said in the issue, because I don't want to confuse @rossberg and others. |
I don't follow. A label is local, it doesn't seem like a suitable representation for denoting an unwinding destination, which is a concept that spans multiple functions. I'm sure we don't want to introduce cross-function branches, so the notion of unwind destination (or continuation) has to be encapsulated in the VM's semantics anyway. How are labels relevant at that point? I also don't understand in what sense there is no "event" involved. A filter needs to have access to the exception value, I think?
This appears to me to confuse multiple issues. Exnref aside, you can jump out of the catch under the current design, and at the end of the handler it does not automatically continue unwinding. That has to be programmed explicitly. And once you do, it seems irrelevant whether there is an exnref? It is worth discussing whether we should make try-finally a built-in concept, to automate this. But a finally clause wouldn't get an exnref, so again I don't see the connection. Either way, AFAICS nothing prevents a language implementation from using this mechanism in such a way that unwind code is encapsulated in functions (which are called explicitly from the respective handler).
That's what I'm saying, I believe. But that's much more straightforward when filter functions are actually represented as separate Wasm functions. I'm not sure what notion of stack-frame pointer you have in mind, though. Can you clarify whether the relatively simple generalisation of the current design that I suggested above would be insufficient from your POV, and if so why?
Yes, but that only provides a limited form of resumability. Unfortunately, it is less expressive than general effect handlers, where you can capture the resume in a closure (corresponding to a first-class exnref). It thus is insufficient for expressing any relevant control abstraction. It is the special case of an effect handler where the resume (i.e., the exnref) does not escape. See some of Daan's papers, where he talks about optimising this case (and others). |
I think I know what you mean, but AFAICS, exnref or its removal is mostly unrelated. The problem with two-phase unwind is all about control flow. Exnref is just some data object, it doesn't really affect the problem in any way I can identify.
That may hint at the disconnect I observe with this discussion. There is a hidden assumption about a particular factorisation. I think having the catch code as it exists now doing the filtering is simply not the right approach! If you do it the other way round as per my suggestion above -- that is, having the catch code merely do the unwinding, and having the filter as a separate side mechanism running prior to that -- then I think all these problems go away? |
Here's some C# (6.0) code that illustrates some of what can happen behind the scenes in two-phase exception handling:
If you run it, it prints
So what happens is the phase-one code implemented in the CLR executes each of the Again, many runtimes rely on having some way to inspect the frames on the stack like this. C# is only unusual (though not alone) in letting programmers write custom code to run during the inspection that makes up the first phase of exception handling. |
In C#, is it the case that the finally clause is executed before the exception handler itself? |
Finally clauses are executed as the stack is unwound. Phase-one code does not unwind the stack (though it may conclude by initiating stack unwinding). That's why "Logging" is printed before "Unwinding" in my example above. |
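The ordering described here can be modeled with a small simulation. This is only a sketch under stated assumptions: the Frame struct, the raise function, and the explicit vector standing in for the call stack are all hypothetical, and it is not the C# example from the thread. Phase one runs filters from the innermost frame outward without popping anything; phase two then unwinds to the frame that phase one selected, running finally-style cleanups on the way.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// One entry per active try region, innermost last. All names are hypothetical.
struct Frame {
    std::function<bool(int)> filter;   // phase one: would this frame handle it? (may be empty)
    std::function<void()> cleanup;     // phase two: finally-style unwinder (may be empty)
};

std::vector<std::string> trace;        // records the observable order of events

// Returns the index of the handling frame, or -1 if the exception is uncaught.
int raise(std::vector<Frame>& stack, int exception) {
    // Phase one: search from innermost to outermost; nothing is popped.
    int target = -1;
    for (int i = static_cast<int>(stack.size()) - 1; i >= 0; --i) {
        if (stack[i].filter && stack[i].filter(exception)) { target = i; break; }
    }
    if (target < 0) return -1;         // uncaught: the stack is left intact

    // Phase two: unwind to the saved target, running cleanups along the way.
    while (static_cast<int>(stack.size()) - 1 > target) {
        if (stack.back().cleanup) stack.back().cleanup();
        stack.pop_back();
    }
    return target;
}

std::vector<std::string> demo() {
    trace.clear();
    std::vector<Frame> stack;
    // Outer frame: its filter logs and accepts the exception.
    stack.push_back({[](int) { trace.push_back("Logging"); return true; }, nullptr});
    // Inner frame: no filter, only a finally-style cleanup.
    stack.push_back({nullptr, [] { trace.push_back("Unwinding"); }});
    int handler = raise(stack, 42);
    trace.push_back(handler == 0 ? "Handled by outer frame" : "Uncaught");
    return trace;
}
```

In demo(), the outer filter records "Logging" before the inner cleanup records "Unwinding", matching the order described above. The sketch also saves the result of the search (target) so that phase two does not repeat it, and an uncaught exception returns before any frame is popped.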
I have a perhaps different mental model of how future two-phase unwinding would work and I'd be curious to hear if this mental model is either insufficient for the use cases at hand or there's a conflict I don't see. So let's say we extended wasm with the ability to iterate over the stack and dynamically extract opt-in bits of information from blocks on the stack. In particular, blocks could be associated with functions (which stack-iterating code could dynamically extract as a function reference). Given that, it seems like one could implement two-phase unwinding (and the C# example above, and Windows SEH, on which I assume C# is based) by having the first phase be implemented in terms of stack iteration and the second phase be implemented in terms of the present EH proposal. Thus, the fact that the present proposal destructively unwinds wouldn't be a problem b/c it only happened in the second phase. One nice thing about such an approach is that it would allow the source-language compiler/runtime to fully control what happened in the event of an uncaught exception. In particular, the source-language compiler/runtime could probably produce a better custom error report (that is thrown or (There is the subtle detail of how to ensure that the first-phase iteration can specify the exact catch site for the second-phase to unwind to (considering that a single syntactic |
So assuming I've understood you correctly, what you're noting is that stack "inspection" is generally useful, and that two-phase exception unwinding is essentially a stack inspection that can end with executing an instruction that "specif[ies] the exact catch site for the second-phase to unwind to". Stack inspection can be implemented in a variety of ways, such as opt-in bits (known as stack marks) and stack walks, or such as function calls to handlers on the stack. Regardless of how it's implemented, stack inspection seems like it'll be useful for a number of things, just one of which is two-phase exception handling, so I've prepped a few slides for tomorrow to foster a discussion on the topic and get some CG feedback, which will in turn hopefully provide useful information for this thread. |
That last bit - reporting from phase I the computed handler for an exception - will likely need special instructions that are not used by other stack inspection use cases. |
With I'm not exactly sure which part I should clarify, so if you can point out which part of #123 (comment) you don't follow, it would be helpful. What I tried to explain there is, But the first search phase is not aware of all this transformation we had to make, and searches up the stack in the order of 'new inner catch' -> 'catch1' -> 'catch2'. Here 'catch1' should not be included in the search path, and if it happens to have a filter function that matches the exception, the search phase result will be incorrect. Again, you may think that from the spec's point of view it may look OK, because at the point of branching and rethrowing the exception, it is a new exception. But in order to represent C++ (or other LLVM- or basic block- based compiler) exceptions, a single C++ exception can be translated to this inner try-catch and rethrow unbeknownst to users, due to the unwind mismatch problem. And for debugging a user would expect the whole stack (before and after this branching and rethrowing) to be preserved in case the exception is uncaught.
For example if we have a C++ code like this,
Do you mean we shouldn't be doing filtering, such as "Is this exception an
By the way, I'm open to Zoom meetings to clarify things. :) |
I think this is an interesting idea. But in two-phase unwinding it is generally efficient to save the result of the search phase so that you don't repeat it in the second phase. So if we implement the first phase as a separate inspection proposal, I think we have to ensure interop between the two phases as well.
This post happened to be mostly dedicated to this part: how we make sure that the result of first phase can be the same as the second phase; |
I work on Chrome DevTools at Google; to me as an implementer on the debugger: Two-phase unwinding to get break-on-uncaught-exception would be a killer feature! Thanks for the writeup and for pushing this discussion forward! |
@RossTate, I understand the example, but not necessarily how it answers my question. ;) @aheejin, above, @lukewagner probably explained much better the general idea I was alluding to: the catch block would not do the filtering, but some separate mechanism would. I proposed an explicit filtering mechanism, but you can probably find something more low-level as well. In neither case the meaning of rethrow is affected much, as long as you could only invoke it in the second phase, i.e., when the exception has already been caught (in my catch-when strawman above, a dynamic check could prevent the filtering functions from attempting a rethrow); alternatively, you could also make it equivalent to returning false in the filter. So yes, in that sense it would be a new throw (in the unwind phase). (And that is only one possible solution. FWIW, two-phase unwind can be expressed way more cleanly with effect handlers (or a subset thereof) instead of yet another ad-hoc mechanism. In fact, adding a resume_throw instruction to the current proposal would already be enough. But let's not get into that now. Point is, I think there are multiple ways to slice this and no inherent problem with the current proposal.) @pfaffe, I agree that break-on-uncaught is desirable. However, debug functionality is different from language functionality, and the former should not necessarily require the latter. For example, a debugger can inspect anybody's function locals, whereas the language should clearly not enable arbitrary code to do that itself. |
Hi all, sorry I'm late to the party. @RossTate thank you for Manuel Simoni's article, it really helped me better understand the connection between this proposal and the condition systems of common-lisp and of dylan. To me it seems that in fact this article contains yet another solution to extend the current exception handling proposal with resumption. To test this belief, I wrote up what I think Simoni's solution would look like formally for WebAssembly, extending the current EH proposal (based on the formal spec of my PR #121). I didn't work out all the rules yet, but this is not something for this proposal anyway, just to show that this could be possible. The writeup: https://ioannad.github.io/exception-handling/core/appendix/restart-exceptions.html WDYT? Btw, about |
Wow, @ioannad, way to go all out! I'm not versed in #121, but from what I can tell, you conceptually have the right idea. Most importantly, you are modeling a First, you run the Second, in terms of design, in WebAssembly you probably want a resumable throw to be separate from an unresumable throw. This way, in the unresumable case, engines know they can do clean up while they look for a matching Third, I don't think you account for what happens if you reach the end of Putting all those subtleties together suggests (to me) that |
I do not think we should even try to handle this. Use Interface Types ... unless I am missing something? |
If you mean that we cannot hope to solve every isolation-related problem without interface types, then I agree with that statement. |
You can implement
God, I don't know. Maybe this could work. Who can tell? We can spend our time trying to come up with hacks for overcoming shortcomings of the current design, but I'd rather we spend our time improving the current design. I believe you and @rossberg are the two who have raised objections to making the improvements. But your prior comment made it sound like you're no longer objecting. Is that a correct interpretation? |
I don't understand why we are talking about I also would like to cautiously avoid mixing discussions of this proposal change and stack inspection. So if my understanding is correct, @rossberg and @backes preferred the idea of filter function, and @RossTate preferred the idea of filter block, which is also the "stack inspection" proposal he is referring to. I am not strongly opinionated on that part, but I personally think C#'s one niche feature shouldn't be the reason for the decision. What I mean is, if we choose to support filter blocks (instead of filter functions), which is more complicated for VMs to implement, there should be a better reason than "C# has this feature", which is very rarely used and even Blazor people said they didn't care about. But anyway, as I noted in this post, if we decide to support two-phase in the spec level, the spec change can be divided into two parts: changing the first proposal to be extensible to two-phase, and actually doing two-phase. This post was mostly dedicated to the first part, and I don't want the first part to be dependent on whether we later do stack inspection or not. |
Oh, sorry @aheejin, we were discussing hypotheticals, which is off topic. I agree with your summary of the discussion and your recommendation. |
This is a presentation that summarizes WebAssembly/exception-handling#123. I think the discussion is likely to take long, so I'm not sure how long I should book for this.
Win32 extensions to C/C++ also require two phase EH. It might be nice to support that code. glibc on the other hand, copies exceptions to heap, and if heap is low, to a global. |
To restate my point, I don't think this should be considered a C# niche, but actually an important thing on Windows, that exists for a few very good reasons that seem to be not well documented. |
@jaykrell You're right that we would need two-phase exception handling to fully support the Windows SEH C++ extension. Do you know of any large/important/popular projects that would want to use this extension on non-Windows platforms (e.g. WebAssembly)? |
Is the extension you are talking about the SEH C++ extension? Or is there another C/C++ extension? If it is SEH, to support it, I think we need not only the two-phase unwinding but more, for example, the ability to catch traps, right? I don't think that is fully supported even in Clang. So I'm not sure we are gonna aim to support all features of SEH. If you think there is a certain capability you want to ask for, I'd appreciate it if you could elaborate more on that.
Do you have any good pointers on this?
I don't understand. You mean catch blocks are treated as if they are also filters?
Also not sure what you mean here. An example would be helpful. By "exception objects contain pointers to stack", do you mean something like continuations? If so, that is not the goal of this proposal, but people are discussing that in the https://github.com/WebAssembly/design repo. WebAssembly/design#1359 is one proposal to address that kind of thing. Not sure if this is what you referred to though.
Yes, it uses linear memory in single thread, and thread locals in multiple threads. We didn't change that part in libc++abi. We haven't tested EH with threads yet though. Was there anything you wanted to suggest on this point? |
I'm just talking about SEH. I don't mean continuations. I don't mean stack copying. The catch block is not part of the first phase, not a filter. It should also be pointed out, there aren't really phases. Or filters. If the handler decides it wants to handle the exception, it does not return some special return value. The second phase then proceeds through the handler and through RtlDispatchException, and then on to the throw'ing function and on up till it hits the target frame. This time the parameters to the handlers indicate the second pass. In the case of __except, when it gets to the block, as I recall, it does a fairly simple RtlRestoreContext right into the middle of the function. The __except block cannot rethrow. In the case of catch, there is a callback supplied to RtlUnwind, that calls the catch block (the catch block being actually a separate function, both with its own local variables, and access to the enclosing scope's local variables). While the catch block runs, the stack pointer has not yet been restored. The catch block can rethrow. The consolidated frame I guess serves to avoid running destructors a second time and such, if the catch rethrows. The catch block need not ever return (or rethrow). The program can just keep running there. And show a few things in debugger. Later, sorry. |
@aheejin, I believe @jaykrell is referring to the "filter expressions" in Windows SEH, which can perform arbitrary computation before deciding whether to resume normal execution, handle the exception, or continue searching for an exception handler. A key point here is that when the filter runs, the stack has not been unwound yet, so if the filter expression goes off and does something like run a web server, the stack frame that originally threw the exception will never be cleaned up. Supporting these Microsoft C++ extensions would clearly require two-phase exceptions because it exposes the two phases separately to user code. LLVM has some support for these extensions, although as an implementation detail, it requires that frontends outline filter expressions into separate filter functions. How much we care about this depends on whether anyone would actually want to use this language extension with WebAssembly, which is why I asked whether @jaykrell knows of any projects that would want to use it. |
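The three filter outcomes mentioned above can be sketched as a small dispatcher. This is a portable model, not Windows code: the FilterResult, Outcome, and Dispatch types and the dispatch function are hypothetical names. The three enumerators are meant to correspond to SEH's EXCEPTION_CONTINUE_EXECUTION, EXCEPTION_CONTINUE_SEARCH, and EXCEPTION_EXECUTE_HANDLER filter results.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Models the three outcomes of a filter expression (hypothetical names).
enum class FilterResult { ContinueExecution, ContinueSearch, ExecuteHandler };

enum class Outcome { Resumed, Handled, Unhandled };

struct Dispatch {
    Outcome outcome;
    int handler_index;   // index of the filter that decided, or -1
};

// Filters are ordered innermost first. No frame is unwound while they run,
// so a filter observes the stack exactly as it was at the throw point.
Dispatch dispatch(const std::vector<std::function<FilterResult(int)>>& filters,
                  int code) {
    for (std::size_t i = 0; i < filters.size(); ++i) {
        switch (filters[i](code)) {
            case FilterResult::ContinueExecution:
                return {Outcome::Resumed, static_cast<int>(i)};   // resume at the fault
            case FilterResult::ExecuteHandler:
                return {Outcome::Handled, static_cast<int>(i)};   // unwind to this handler
            case FilterResult::ContinueSearch:
                break;                                            // ask the next enclosing filter
        }
    }
    return {Outcome::Unhandled, -1};
}
```

Because each filter is arbitrary user code that runs before any unwinding, the search phase is observable from the program, which is why this extension cannot be expressed with single-phase unwinding alone.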
Thanks for the pointers. I read more about SEH, and my takeaways are
Here you mean So for example, there are function A and B. A has
Is this not correct? You mean, 3 and 4 should be swapped? And I'm not familiar with SEH, so I didn't understand many things you referred to about those Also you said rethrows in SEH are surprisingly complicated, so I searched but I couldn't find how rethrows work in SEH. Pointers on that will be appreciated. |
Two-phase exception handling can support this functionality, and without continuations. This is how Common Lisp implements "restartable" exceptions, and it seems likely that this is how SEH supports this feature as well. |
IIUC, this would be simple to add to a follow up proposal that uses (edit: I hadn't seen Ross's very similar comment above when I wrote this. Oops!) |
Yeah, I was not suggesting it is not possible to support that with two-phase. That will be a simple add-on. I was simply saying that in the current state of suggested features, mainly attach some filter (function or block) to |
When I said catch block I meant C++ catch block. I don't know of people needing all this, in WebAssembly, but I thought at a time of defining a new platform, it might be nice to achieve this broader compatibility, with the long term goal of porting tons of codebases to WebAssembly. I will try to show an example of what catch does. It isn't actually a great thing, but it comes about because the thrown object lives on the stack below the function that contains the catch. |
f1 and f2 appear to only be on the stack once, but that is just the most recent ones, in the midst of the exception handling. If you step through it, you'll see f1 and f2 and then RtlRestoreContext, and then step a bit more and the frames disappear, w/o rsp being incremented. |
Closed by #125 and 9/15/2020 CG meeting. |
Spec issue: WebAssembly/bulk-memory-operations#111

This commit changes the semantics of bulk-memory instructions to perform an upfront bounds check and trap if any access would be out-of-bounds without writing. This affects the following:

* memory.init/copy/fill
* table.init/copy (fill requires reftypes)
* data segment init (lowers to memory.init)
* elem segment init (lowers to table.init)
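As a rough illustration of the "check upfront, trap without writing" behavior described in that commit message, here is a small Python sketch. It is not the engine's code: `Trap` and `memory_copy` are hypothetical names, and linear memory is assumed to be modeled as a `bytearray`.

```python
class Trap(Exception):
    """Stands in for a WebAssembly trap in this sketch."""


def memory_copy(mem: bytearray, dst: int, src: int, n: int) -> None:
    # Validate both ranges before touching memory, so a trap leaves `mem` unchanged.
    if dst < 0 or src < 0 or n < 0 or dst + n > len(mem) or src + n > len(mem):
        raise Trap("out of bounds memory access")
    # The right-hand slice is a copy, so overlapping ranges are handled correctly.
    mem[dst:dst + n] = mem[src:src + n]
```

The point of checking first is that a failing copy is all-or-nothing: there is no partially written destination to observe after the trap.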
Foreword
Recently, the need for two-phase stack unwinding has been suggested. Two-phase unwinding is a useful feature for debugging, because the stack can be preserved when an exception is not caught and the program crashes. Also, it is necessary to implement the full semantics of languages such as C# (due to its `when` clause). While each language may implement it in its own library if necessary, supporting it at the spec level can be more convenient or faster. We didn’t think it was an essential feature to support in the spec before, and I think whether we should support this at the spec level or not is also something we should discuss.

In this issue, I’ll address the implications for the current proposal if we decide to support it. This issue is not only about adding syntax for two-phase unwinding itself; it is more about the extensibility of the current proposal in case we support two-phase unwinding later.
Two-phase unwinding
Two-phase unwinding consists of two phases. The first phase is the ‘search’ phase; it walks up the stack without unwinding it and checks each `catch` instruction (= landing pad) to see if it catches the current exception. If it does not catch the current exception, we continue the search. If the search phase ends and none of the catches catches the current exception, we end the unwinding there without running the second phase, leaving the call stack intact. If we find a `catch` that catches the exception, we remember the result, and begin the second, ‘cleanup’ phase. In this phase, we actually unwind the stack, while running all cleanup (= destructor) code, until we reach the `catch` that catches the exception. After we arrive at the destination `catch`, we stop the second phase unwinding there, and transfer the control flow to that `catch`.

Two-phase unwinding will require the `catch` instruction to have a way of filtering an exception. That can be an index to a wasm function, or a code block attached to a `catch` instruction. And the VM should be able to run these functions or code blocks without disturbing the call stack during the first ‘search’ phase of two-phase unwinding.

But other than adding a filter function to `catch`, we may need more changes to the current proposal. The most important one of them is `exnref`, which I will elaborate on next.

Background on first-class `exnref`
We introduced the first-class `exnref` type in 2018, mainly to support more expressive code transformation in the toolchain.

The specific problem that sparked this discussion was that, after the “CFG stackification” phase in the toolchain, in which we place marker instructions like `block`, `try`, `loop`, and `end`, it is possible that there can be mismatches in calls’ or throws’ unwind destinations. This problem was first discussed in #29 (with different solutions then). To re-summarize the problem here, suppose we have the following CFG:

And the CFG is sorted in this order. Then after placing `try` markers, the wasm code will look like:

The first-class `exnref` allows us to fix this kind of mismatch, because `exnref`s can now escape the `catch` blocks in which they were originally caught, so we can freely branch to the right handler to use them.

Incompatibility of first-class `exnref` and two-phase unwinding

Suppose we extend our current proposal to support two-phase stack unwinding. In the first phase of the unwinding, we walk up the stack to examine each `catch`’s filter function without actually unwinding the stack, meaning we shouldn’t run any `catch` bodies. In order to do that, we can use the internal EH pad stack maintained in the VM. But the problem when we have `exnref`s that can escape `catch` bodies is that we don’t have a way to find out the next `catch` in this first search phase.

In this example, when `foo` throws, the first phase should check `catch1` first, and if it does not catch the current exception, it should check `catch2` next. But with `exnref`, the code can look like this:

In this case, semantically the program shouldn’t run the `catch2` body anymore, because we branch out of both `try` blocks. But the first phase, which does not run any `catch` bodies and only checks filter functions, will check `catch2` after checking `catch1`. This was not a problem when we only had a single phase, because in that case we run the actual program and unwind the stack as we go. But with two-phase unwinding, we need a way to examine a sequence of catches in the first phase without running the program.

Recently #118 and #122 claimed the current proposal is not extensible to support the two-phase unwinding feature. I, @dschuff, @tlively and @RossTate had a video meeting, and while I’m not sure Ross’s reasoning was the same as the things I described here, we agreed that escaping of first-class `exnref` can cause problems for future two-phase unwinding.

Necessary changes to the current proposal
In short, we need to remove the first-class `exnref`, which pretty much means going back to the first version of the exception handling proposal. The `catch` instruction can then go back to `catch $tag` and `catch_all` (but this may not be a must). We may not need `br_on_exn` anymore if we restore tagged catches, but having it can also increase code generation flexibility, in which case it will take the current exception as an implicit argument. But if we remove first-class `exnref`, we need a way to solve the unwind destination mismatch problem I described above.

Adding `catch_br` instruction

One possible addition to the current spec that @dschuff and I thought about is a `catch_br` instruction. So now we have two kinds of try-catches: one is the same as the current one:

On top of that, we now have one additional syntax:
`catch_br` does not have a body, so it does not need an `end` at the end. If any instruction between `try` and `catch_br` throws, we transfer control flow to an outer `catch` that is specified by the label. The label will be written as an immediate in binary, as in the case of `br`. The unwind mismatch example above can be solved using this instruction:

Now we introduce an internal `try`-`catch_br`, so when `bar` throws, we can transfer the control flow to `catch2`, bypassing `catch1`. We can follow these labels in the first phase search too. Because these labels (= immediates) are statically determined, we don’t need to run or refer to any catch bodies to search the stack.

Splitting of `catch` and `unwind`
The second, ‘cleanup’ phase involves running destructor code and unwinding the stack. But after running destructor code, we need a way to resume the second phase unwind until we arrive at the destination catch found by the search phase. `rethrow` will not solve this problem, because `rethrow` is the same as `throw` except it retains auxiliary information, such as stack traces, so it will trigger a full two-phase search from scratch. One way to fix this is to split `catch` and `unwind` into separate blocks, so that we add `try`-`unwind`, and assume that at the end of the `unwind` block the second phase is automatically resumed:

This was one of the changes suggested in #118. An alternative is to add another instruction, `resume`. This is different from `rethrow`; it does not initiate a full two-phase search. It merely resumes the second phase unwinding. The toolchain will be responsible for generating `resume` after destructor code.

Concluding remarks
Apparently this is a lot to put in a single issue post, but I’d like to start discussions from here. There are many things we need to discuss:
Apparently we can’t discuss all these at once; I think we should start from 1 and 2.
Also I’d like to hear from VM people as well, because two-phase unwinding, especially running filter functions on a separate stack, is likely not to be a simple matter from the VM side.
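For readers who want something concrete to poke at, the search and cleanup phases described in the "Two-phase unwinding" section can be sketched as a small Python simulation. This is only a model under stated assumptions, not a proposal for how a VM should implement it: `Frame`, `catch_filter`, `has_cleanup`, and `throw` are hypothetical names, the call stack is a plain list with the innermost frame last, and each frame has at most one filter.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Frame:
    name: str
    # Hypothetical filter: returns True if this frame's catch handles the exception.
    catch_filter: Optional[Callable[[object], bool]] = None
    # Whether this frame has destructor-like code to run while unwinding.
    has_cleanup: bool = False


def throw(stack: List[Frame], exc: object, log: List[str]) -> Optional[str]:
    """Simulate a two-phase unwind over `stack` (innermost frame last)."""
    # Phase 1 (search): walk outward and run filters only; the stack is not modified.
    target = None
    for i in range(len(stack) - 1, -1, -1):
        flt = stack[i].catch_filter
        if flt is not None and flt(exc):
            target = i
            break
    if target is None:
        # No catch matched: skip phase 2 so the full stack stays inspectable.
        log.append("uncaught: stack left intact")
        return None
    # Phase 2 (cleanup): pop frames above the target, running their cleanup code.
    while len(stack) - 1 > target:
        frame = stack.pop()
        if frame.has_cleanup:
            log.append(f"cleanup {frame.name}")
    log.append(f"catch in {stack[-1].name}")
    return stack[-1].name
```

Note that cleanup code only runs after a destination has been chosen, which is why an uncaught exception leaves every frame in place for a debugger or crash report.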