Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restore rethrow's immediate argument? #126

Closed
aheejin opened this issue Sep 16, 2020 · 39 comments
Closed

Restore rethrow's immediate argument? #126

aheejin opened this issue Sep 16, 2020 · 39 comments

Comments

@aheejin
Copy link
Member

aheejin commented Sep 16, 2020

We left out rethrows immediate argument in the last CG meeting's poll because we weren't sure about the use cases, but @rossberg suggested one in #125 (comment).

I think having this functionality is in general good as long as there exist languages that

What do you think?

@fgmccabe
Copy link

This seems odd to me. Different languages have different semantics for throwing exceptions in general, including rethrow.
It seems pretty unlikely that a single low level mechanism will be cover all the conflicting requirements.
A better approach IMO would be to look at how the throw instruction would work in a rethrow situation.
TLDR; this feels like the wrong question (should rethrow take an immediate).

@aheejin
Copy link
Member Author

aheejin commented Sep 16, 2020

@fgmccabe

A better approach IMO would be to look at how the throw instruction would work in a rethrow situation.

I'm not exactly sure what this means. Do you mean, can we use throw instead for rethrowing, which in turn means "should we even keep rethrow"?

I think having two kinds of throwing instruction, currently throw and rethrow, offers more options to language implementors. Some languages, such as C++, do not need that functionality, but other languages might. And both the current proposal and the newly-decided proposal have it. I didn't mean to re-discuss whether we should keep rethrow or not here. (If someone wants to discuss it, it is a separate discussion.)

Restoring the immediate means whether we should allow rethrow to choose an exception to rethrow other than the topmost (or the recent-most) one, and that was the focus of this issue.

@RossTate
Copy link
Contributor

It's hard to discuss tradeoffs between design alternatives for a feature without known use cases for the feature. I think what @fgmccabe is noting is that, when you consider the potential use cases, you don't find much consistency. For example, Java, Python, and C# each have different semantics for rethrowing exceptions. But maybe those are not meant to line up with the rethrow instruction. So having concrete examples of how rethrow would be used would help facilitate this discussion.

@takikawa
Copy link
Collaborator

takikawa commented Sep 16, 2020

I read the justification for including this that @rossberg posted, and I agree that making nesting of handlers easier/possible is a natural choice from the language semantics point of view.

However I'm a bit concerned about the complication for implementing this, as I think the immediate argument for rethrow effectively introduces an implicit exception stack that's separate from the normal value stack (the previous design with exnref was more consistent in this regard, with everything being on the value stack). This second stack may need to be explicitly tracked by an implementation, to ensure outer exception references remain live.

@tlively
Copy link
Member

tlively commented Sep 16, 2020

Is it correct that rethrow-with-immediate can always be emulated by adding a branch out to the correct catch scope and then doing a rethrow-without-immediate? If so, I'm not convinced by the composability argument since that's a fairly simple transformation. I would be convinced if we had a usecase where rethrow-with-immediate would be used often enough that there would be a code size win from not having to do that transformation, but given our lack of rethrow use cases, I don't think that's a compelling hypothetical, either.

@aheejin
Copy link
Member Author

aheejin commented Sep 16, 2020

@takikawa

However I'm a bit concerned about the complication for implementing this, as I think the immediate argument for rethrow effectively introduces an implicit exception stack that's separate from the normal value stack (the previous design with exnref was more consistent in this regard, with everything being on the value stack). This second stack may need to be explicitly tracked by an implementation, to ensure outer exception references remain live.

Don't you need to keep this exception stack anyway even without the immediate argument? As @tlively said, we can still go from the inner catch scope to outer catch scope (by a branch or fallthrough) and rethrow the outer exception from there.

@aheejin
Copy link
Member Author

aheejin commented Sep 16, 2020

@tlively

Is it correct that rethrow-with-immediate can always be emulated by adding a branch out to the correct catch scope and then doing a rethrow-without-immediate?

I think so.

but given our lack of rethrow use cases, I don't think that's a compelling hypothetical, either.

For C++ we lack use cases, but for languages that use rethrow functionality like that in #125 (comment), I'm not sure..?

@RossTate
Copy link
Contributor

That comment does not refer to any actual language. We currently have zero use cases. Python and C# both modify their exceptions when they rethrow them, and so they would have to implement their rethrow with throw of a new exception payload. Java could just as easily throw the rethrown exception. So we have multiple languages that would not use a rethrow instruction to implement exception rethrowing, and we have another language that could just as well implement exception rethrowing with throw.

Without knowing what rethrow is useful for, we cannot evaluate its design alternatives. The clever translation @tlively gave may or may not work depending on what rethrow is supposed to be doing. Until we have concrete examples of how it's supposed to be used and what purpose it's supposed to fulfill, we can all picture different variations of what rethrow does and is for and unintentionally speak past each other. We cannot effectively design features without use cases for those features.

@takikawa
Copy link
Collaborator

takikawa commented Sep 16, 2020

Don't you need to keep this exception stack anyway even without the immediate argument?

@aheejin Thanks, I think you're right about this, nevermind my comment about rethrow then.

@rossberg
Copy link
Member

@aheejin, I have a really dumb question: how did you implement a correct semantics of finally or destructors under the previous proposal without using rethrow? As far as I can tell, rethrow is the only way to (re)throw exceptions whose constructor you don't know. It seems that if you don't use rethrow then any such feature will misbehave on foreign exceptions, e.g., JS ones.

@aheejin
Copy link
Member Author

aheejin commented Sep 17, 2020

@rossberg
(I wrote and deleted a comment just not and please ignore that)

Sorry, I think I said something confusing. Our C++ implementation currently uses rethrow for

  1. When an (both C++ and foreign) exception doesn't match the current catch. The current catch catches everything. With the new proposal decided, we use catch (C++ tag), but that doesn't mean we catch all C++ exceptions (i.e., in some catch we only catch int or something), so we have to still use rethrow if an exception doesn't match. This use case is actually not rethrowing but resuming, and it will be replaced by filter functions/blocks after two-phase unwinding is introduced later. In the current single-phase, rethrowing and resuming aren't different.

  2. After running destructors. This is also a resuming behavior and will be replaced by try-unwind-end. (In the single phase, the exception will be effectively rethrown at the end, even though there's no explicit rethrow). C++ doesn't have finally, but for the languages that have it, clang compiles it away, so it will be handled in a similar way as destructor blocks.

What I tried to say was we don't use rethrow for C++'s throw;, which is C++'s rethrowing keyword, but apparently I said things in a very confusing way. The reason is C++'s throw; apparently doesn't preserve stack traces. Our current C++ implementation prototype's catch (...) doesn't catch foreign exceptions, mainly for implementation simplicity. We may want to change this later.

@rossberg
Copy link
Member

@aheejin, that makes sense, thanks for the clarification.

So this seems to imply that we clearly need rethrow, unless we want to introduce all of two-phase unwinding now.

@RossTate
Copy link
Contributor

As far as I can tell, rethrow is the only way to (re)throw exceptions whose constructor you don't know.

The current catch catches everything. With the new proposal decided, we use catch (C++ tag), but that doesn't mean we catch all C++ exceptions (i.e., in some catch we only catch int or something), so we have to still use rethrow if an exception doesn't match.

Notice that this is "rethrowing" an exception with a known tag and available payload. rethrow is not necessary for this. You can just use throw, and the two programs would be semantically equivalent.

@aheejin
Copy link
Member Author

aheejin commented Sep 17, 2020

@RossTate

Notice that this is "rethrowing" an exception with a known tag and available payload. rethrow is not necessary for this. You can just use throw, and the two programs would be semantically equivalent.

Without rethrow,

  1. We have to remember the order of extracted values and reconstruct them, and the bookeeping necessary for that may not be easy.
  2. This also increases code size.
  3. Also note that this is different from C++ throw;. C++ throw; doesn't preserve stack traces but this should. This is effectively uncaught exception and should be skipped by a filter function later when we have two-phase, so in the single phase, in case stack traces are embedded in the exception, this shouldn't be lost.

I think we have a very clear use case for rethrow in the current toolchain for all these reasons. And also, I think we agreed on the proposed spec like yesterday...? Not sure if we want to spend more time to discuss taking something out of that agreed spec only after a day. I opened this issue to discuss whether we need the immediate for rethrow.

@RossTate
Copy link
Contributor

  1. You put the payload (which I believe for C++ is currently just an i32) in a local variable. There are likely other reasons to do this anyways (e.g. to access the payload multiple times).
  2. Barely.
  3. Stack traces are not part of the semantics of WebAssembly exceptions. They are a debugging tool, and there are other, more robust ways for engines to track stack traces to this end.

I opened this issue to discuss whether we need the immediate for rethrow.

For me, the answer depends on what rethrow means and how rethrow will be used. That said, it does sound like there's reason to believe that every hypothetical use does not need an immediate, so I'm personally fine with you closing the issue with that answer.

@aheejin
Copy link
Member Author

aheejin commented Sep 17, 2020

@RossTate

  1. You put the payload (which I believe for C++ is currently just an i32) in a local variable. There are likely other reasons to do this anyways (e.g. to access the payload multiple times).
  2. Barely.
  3. Stack traces are not part of the semantics of WebAssembly exceptions. They are a debugging tool, and there are other, more robust ways for engines to track stack traces to this end.

Without two-phase unwinding, embedding stack traces or other info in exceptions is a way that we can provide some info. We didn't specify which info we should provide in the spec to maximize each engine's leeway according to their necessity though. We are currently using it, and I think this is sufficient to attest to the "Is there a use case" argument.

For me, the answer depends on what rethrow means and how rethrow will be used. That said, it does sound like there's reason to believe that every hypothetical use does not need an immediate, so I'm personally fine with you closing the issue with that answer.

Not sure exactly what you mean. The objective here is to decide whether we should restore the immediate back or not, and if we decide we should, propose to the CG meeting about the change. Have we made the decision?

@rossberg
Copy link
Member

@RossTate:

Notice that this is "rethrowing" an exception with a known tag and available payload.

To implement destructors or finally in a heterogeneous language environment you need to be able to rethrow exceptions whose tag you do not know. For example, when C++-Wasm calls back into JS and JS throws.

@RossTate
Copy link
Contributor

To implement destructors or finally in a heterogeneous language environment you need to be able to rethrow exceptions whose tag you do not know

@rossberg That has been addressed by try/unwind.

Without two-phase unwinding, embedding stack traces or other info in exceptions is a way that we can provide some info.

@aheejin An engine can have a global variable for the stack trace. If a function throws an exception, then the engine can initialize this variable to the empty stack. As it searches for a catch, it can add frames to this stack. If it finds a catch, then it can leave the stack untouched until the catching function returns, at which point it clears the variable, but if the function throws again then it continues to build upon the stack. This will work even for languages that modify exceptions as they go up the stack. It also can be used to get stack traces for traps.

We also want to consider that there are uses of the exception handling proposal that do not want you collecting stack traces because it is a waste of time, i.e. they are using it for dynamic control flow and know the thrown exception will be caught.

Applications that do want a stack trace can always explicitly put one into the payload. I prefer this approach because it takes guesswork out of engines (should the engine trace the stack or not? answer depends on the application).

@aheejin
Copy link
Member Author

aheejin commented Sep 17, 2020

@RossTate
I didn't say rethrow is the only way that we can provide stack traces. This is a way we are currently using it (with single phase unwinding), and was included in the new spec we agreed on, and we have to change our implementation even further without it, so I'd say we keep it. Of course, it is not specified in the spec what kind of auxiliary info a VM should include, so a VM can choose not to provide any side info, and that's fine.

I answered your "is there a use case?" argument. Given that we implemented it, are using it, and haven't had a major problem with it, I don't think every single aspect of the existing spec should also satisfy "Is this the only way to do this thing?" question.

@RossTate
Copy link
Contributor

Have we made the decision?

My impression is that there's no need for an immediate in any hypothetical use. @rossberg is I think the only person we're waiting for updated thoughts from.

@rossberg
Copy link
Member

My point still stands I believe. Implicit naming of nestable constructs is a basic structural design mistake. I gave an example. What would be the advantage of artificially crippling the instruction that way?

@aheejin
Copy link
Member Author

aheejin commented Sep 17, 2020

Note that specifying exception to rethrow was capable using exnref in the previous (meaning the spec before this CG meeting) spec, so we are not introducing any new functionalities.

@RossTate
Copy link
Contributor

RossTate commented Sep 17, 2020

@rossberg Not having an index makes the instruction size smaller for the extremely common case. And having an index would be redundant because it can easily be encoded through existing constructs.

@aheejin
Copy link
Member Author

aheejin commented Sep 17, 2020

@rossberg Not having an index makes the instruction size smaller for the extremely common case.

Barely.
And it requires the toolchain to do some unnecessary code transformation in the other cases, increasing complexity and code size there.

@rossberg
Copy link
Member

@RossTate:

Not having an index makes the instruction size smaller for the extremely common case.

In this thread you seem to be simultaneously arguing that the entire instruction is useless and that it is gonna be extremely common. I have a hard time reconciling both these claims.

@RossTate
Copy link
Contributor

I believe the use cases for the instruction could just as easily be served by throw. But if we're going to have such an instruction, it sounds like it will be used in a very particular pattern, namely the following:

(catch $exn
   (local.set $exn_payload)
   (br_if $exn_handler (some_test (local.get $exn_payload)))
   (rethrow)
)

If that's the case, then that suggests we should have catch rethrow the exception whenever the end is reached (instead of having a rethrow instruction). Once we do that, then catch and unwind are almost identical except that one is only used for certain exceptions and provides the payload. (This change has the added benefit of making the type annotation on try unnecessary, though actually taking advantage of that is a separate discussion.)

Taking things a step further, languages like Python modify the exception as it goes up the stack. Rather than having catch implicitly store the payload for when the exception gets rethrown at the end, we could instead require the end of the block to provide a new payload. At this point, catch is really doing unwinding with a payload (which is a more advanced form of unwinding), so we might as well rename it to unwind_tagged or something.

Putting this all together, the pair of unwind and unwind_tagged has strictly more functionality than the quadruple of unwind, catch, catch_all, and rethrow while at the same time likely being a more compact encoding of the common patterns. (Sorry, just thought of this idea.)

@aheejin
Copy link
Member Author

aheejin commented Sep 17, 2020

@RossTate Umm... Can we please not turn everything upside down again and bikeshed new instructions two days after we passed a new spec...? Brainstorming and bikeshedding language design can be fun, but I think we really need to settle down and ship things.

Also, I think the approach when designing new instructions and changing already existing and implemented instructions should be different. We can't spend endless time bikeshedding things, and I think changing existing things require a stronger argument than designing things first time unless there's a clear problem that people agree on.

(By the way, usage or rethrow is not at the end of end of catch most time, as you suggest; it is only the case for destructor blocks, which will be replaced by try-unwind.)

@binji
Copy link
Member

binji commented Sep 17, 2020

My guess is that rethrow will be a very rare instruction, even in a large program using exceptions everywhere, so I wouldn't worry too much about the code size. If we're concerned that rethrow may need an immediate in the future, then we can do what we've done before here, and provide a required 0 immediate, which can be extended later if necessary.

@aheejin
Copy link
Member Author

aheejin commented Sep 17, 2020

@binji As I said in #126 (comment), it's not rare at all; currently it's used all the time. The usage 2 will be replaced by try-unwind though. The usage 1 will remain until we actually have filters and two-phase unwinding.

And both 1 and 2 are not rethrowing but actually resuming. (In single-phase they are effectively the same.) C++ does not use rethrow for actual "rethrowing", which is throw;. But other languages might, and that's what @rossberg's use case suggests.

@binji
Copy link
Member

binji commented Sep 17, 2020

Sorry, I mean rare compared to the total number of instructions. But if you have data that would be useful -- what percentage of the instructions in the code section for a medium-sized module are rethrow?

@aheejin
Copy link
Member Author

aheejin commented Sep 17, 2020

@binji I just checked a single file that contains ~35000 instructions and rethrow was a bit less than 1%. But I think this will be reduced by more than 80% I think after we switch to try-unwind, so I wouldn't worry about the code size for it either. But without the immediate I think it is possible that the toolchain has to do some unnecessary code transformation, inserting branches and all. (C++ is not likely to need that though.)

@RossTate
Copy link
Contributor

Clarification request: which of these programs are supposed to be equivalent?

  1. (try
       (call $f)
    catch $exn
       (throw $exn)
    )
    
  2. (try
       (call $f)
    catch $exn
       (rethrow)
    )
    
  3. (call $f)
    

@aheejin
Copy link
Member Author

aheejin commented Sep 17, 2020

@RossTate 2 and 3 will have the same throwing site. The two programs are different still, so not sure what the definition of equivalence is though.

@RossTate
Copy link
Contributor

Equivalence informs tools which program transformations/optimizations are valid. Inlining also changes throwing sites and stack traces. So it seems like either inlining functions that might throw exceptions is an invalid optimization, or the above three programs are equivalent. If the three programs are equivalent, then every program with rethrow can be equivalently expressed without rethrow.

@aheejin
Copy link
Member Author

aheejin commented Sep 17, 2020

Stack traces are usually used for debugging without optimizations turned on. In your example two throwing sites are very close, but if rethrowing happens later in another caller up in the call stack, the stack traces will be vastly different. Your argument that inlining can make things the same is like, all stack traces that ever exist can be the same if all functions are inlined into main. Not sure if that makes sense.

And as I said, I think now is the time to actually make things work and ship things. We made a new spec two days ago. We can still consider changes, but I think they should satisfy one of these:

  1. The current spec has a serious problem
  2. The change is small and incremental, so it does not require a lot of changes in the existing implementations.

@tlively
Copy link
Member

tlively commented Sep 17, 2020

Like @RossTate, I'm not totally convinced that rethrow is necessary. However, I also don't think it's a significant complication to the proposal. AFAIK, the implementation in V8 currently embeds stack traces in thrown exceptions. I know those stack traces do not necessarily match the stack trace semantics of any particular source language, but I think it's plausible that they might be useful for debugging regardless of that. I don't know for sure because we don't have any concrete users of this proposal yet. Since those stack traces are a debugging tool that is not visible in Wasm semantics, in practice it's ok to perform optimizations like inlining that change them, even if in theory it is not.

Given all that, I think a reasonable course of action would be to keep rethrow, taking the risk that it turns out not to be useful. A similar line of reasoning suggests that we should add the immediate index to rethrow as well. Again, it may not turn out to be useful, but it's also not a large cost, so the expected value of adding it just in case and moving on with implementation seems higher to me than the expected value of continuing to disagree about it.

@RossTate
Copy link
Contributor

I feel we need to be explicit about the expectations on stack tracing. When I asked language implementers for advice on exception handling, a common sentiment was to not have stack tracing as a default. It apparently substantially slows down the use of exceptions for dynamic control flow. Most implementations of efficient dynamic control flow either use two-phase approaches (so that they can collect the stack only after they find out the exception is uncaught) or do not collect the stack trace (except for possibly keeping the portion of the stack above the last handler/unwinder in tact).

As an example, for stack switching we think it makes sense to build on this proposal as it provides a mechanism for dynamic control flow. But it would be problematic if doing so causes a stack trace every time, an operation that could easily dominate the time of the stack switch itself.

Plus, as I pointed out above, there are other ways to build and track a stack trace than putting it in the exception payload. The technique I mentioned above would give the same stack trace for all three programs (that I believe should be considered equivalent). Another is for the application to make the stack trace an explicit part of the payload (as an externref) and import a function to collect the stack trace upon building the initial exception (and the application's JS layer could provide a different function to the import depending on whether the application is being debugged or not).

@aheejin
Copy link
Member Author

aheejin commented Sep 17, 2020

I think I'm repeating myself at this point, so I'll summarize my answers here one more time:

  1. We don't require VMs to embed stack traces. It is not the default and it is up to them. So your speed argument does not hold.
  2. Your example is very small that the throwing site and the rethrowing sites are so close, but in real program they can be in different functions in a call stack.
  3. I think it's time to make things work and refrain from bikeshedding the spec we made two days ago.

And one more thing... this issue was about the immediate argument and I hoped this issue to focus on that. Whether we need rethrow at all is a much bigger request for change, which we have not considered in all three versions of the spec we made, so while I don't think we need to make big changes anymore unless there's a really good reason, I think it would've been better if you made a separate issue post.

@RossTate
Copy link
Contributor

Fair request. I've moved discussion of whether to have rethrow to #127.

@aheejin aheejin mentioned this issue Sep 18, 2020
aheejin pushed a commit to aheejin/exception-handling that referenced this issue Sep 22, 2020
* [interpreter] Simplify zero-len and drop semantics

* Update overview

* [spec] Change drop semantics

* [spec] Forgot to adjust prose for *.init ops

* [test] Update generated tests for OOBs and dropping changes (WebAssembly#131)
aheejin added a commit to aheejin/exception-handling that referenced this issue Sep 22, 2020
After the spec change in WebAssembly#126, byte copying order is not observable.
ioannad pushed a commit to ioannad/exception-handling that referenced this issue Feb 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

7 participants