[test] Fix invalid ref.is_null after select #1521
Conversation
`select` without argument produces a numeric or vector type, while `ref.is_null` consumes a reference type. Since those types are disjoint, `select` followed by `ref.is_null` can never be well-typed. The test is fixed by removing the `select`, assuming the intent of the test is to check whether `ref.is_null` can be called after `unreachable`.
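For concreteness, here is a minimal paraphrase (the exact wording of the test file may differ) of the instruction sequence at issue and of the replacement this PR proposes:

```wat
;; The sequence under discussion: untyped `select` after `unreachable`,
;; followed by `ref.is_null`.
(module (func
  (unreachable)
  (select)
  (ref.is_null)
  (drop)))

;; The replacement proposed by this PR: exercise `ref.is_null` directly
;; after `unreachable`.
(module (func
  (unreachable)
  (ref.is_null)
  (drop)))
```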
IIUC, the behaviour captured by this test is actually intended, because in syntactically dead code refs can be supplied by the polymorphic stack. I think I might see a drafting error in the current spec though, in how the result type of `select` is specified.
I see. I was actually wondering what the use of bottom types was, since they are not used at all in the spec; see this question on zulipchat. If stack-polymorphic operations did indeed produce bottom types, that would make a difference (the "function type" wording in the definition of "stack-polymorphic" would need to become "stack type"). But even with the bottom-type issue fixed, I still believe that chaining `select` with `ref.is_null` cannot be well-typed.
The way I think of this is that (riffing off your syntax) stack-polymorphic instructions produce bottom types, and bottom can stand in for any value type, including reference types.
EDIT: Yes, we may need to double-check some of the wording of the spec...
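A minimal sketch of the stack-polymorphic behaviour under discussion (illustrative, not taken from the test suite): after `unreachable`, operands of subsequent instructions are drawn from the polymorphic stack, so they can be used at different concrete types:

```wat
(module (func
  (unreachable)
  (i32.wrap_i64)  ;; takes its i64 operand from the polymorphic stack
  (drop)
  (f32.neg)       ;; takes its f32 operand from the polymorphic stack
  (drop)))
```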
Yes, I agree. A related question is then: why is it even necessary to validate dead code? Why not simply jump over it as long as it is syntactically correct (i.e. it was parsed by the text or binary format)? It looks to me that the stack-polymorphic notion was introduced as a way to accept code that would otherwise be rejected when validating dead code. For example, why would syntactically well-formed but untypable dead code be rejected at all?
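To illustrate the difference between the two approaches, here is a sketch (not an official test) of a function that a purely syntactic "skip dead code" scheme would accept, but that current validation rejects, because the concrete `i32` pushed in the dead code does not match the `i64` expected by the next instruction:

```wat
(module (func
  (unreachable)
  (i32.const 0)  ;; pushes a concrete i32 even though the code is dead
  (i64.eqz)      ;; expects an i64 operand, so validation fails here
  (drop)))
```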
FWIW the current situation with dead code is a personal bugbear of mine! There are a couple of subtleties with the "skip" idea, since there's been a feeling in the group that we don't want to inadvertently create a distinction between "decode-time" vs "validation-time" errors, but I have a proposal "relaxed dead code validation" waiting in the wings that would do something similar. We've not had the bandwidth to push it forward recently, but IMHO eventually we'll bump up against a planned feature that stresses our dead code weirdness to the point that it makes sense to adopt (GC types may yet be such a feature!).
Interesting, this proposal is similar to how I implement validation, i.e. the idea of wiping the stack and marking it as "polymorphic" (which is what the bottom notation captures).
Yes, there are some design choices. But my understanding is that those choices should be coherent across some properties:
We can choose which properties are ignored (no dead code is required to observe the property, though it may by accident, i.e. it's not a precondition) or used (all dead code must observe the property).
Actually, another weirdness I've found (not sure if there's a test for it) is how `br_table` is validated in dead code when its labels have different result types. AFAICT, my opinion is actually what the reference interpreter does:

```wat
;; a.wat
(module (func
  (block (result f64)
    (block (result f32)
      (unreachable)
      (br_table 0 1)))
  (drop)))
```
I would suggest to fix both issues. I would even go slightly further and say that the bottom type should be removed and that a stack type should just be a function type. The definition of stack-polymorphic already takes into account the possibility of supporting arbitrary value types; there's no need for a notion of sub-typing, which doesn't exist at execution anyway. Curious to know what the design direction is regarding those issues.
Oof. Good catch!
This would mean that, with the GC proposal, an implementation would need to calculate LUBs in dead code in cases where two label types might share a common supertype. We're specifically avoiding LUB calculation in the type checking algorithm so this solution probably won't be possible.
The current design was motivated by a desire to avoid any meaningful constraint tracking in the concrete type checking algorithm. Prior to reference types, it was expected that implementations would need to have value types plus a special "unknown" symbol for operands popped from the polymorphic stack in dead code.

When adding reference types, we didn't want to inadvertently impose the requirement that implementations would now need to additionally track constraints on this symbol, like "must be a reference type".

My (purely personal) opinion is that our current handling of dead code has reached a level of conceptual complexity such that the spec is slipping up in describing it, and implementations aren't really correctly implementing it anyway. The relaxed dead code validation proposal above is the result of a fair bit of conversation regarding how to simplify the situation in a backwards-compatible way while avoiding a type checking algorithm with constraint tracking or LUB calculation.

EDIT: apologies for the fat-fingered issue closing
Oh, I see that the GC proposal actually introduces sub-typing in execution. This means validation would also need this sub-typing, and this real sub-typing will make things quite complicated combined with the fake sub-typing of stack- and value-polymorphic operations. I see how the relaxed dead-code proposal could be a prerequisite for keeping the GC proposal tractable, by avoiding the fake sub-typing of unreachable code.
I understand that backward-compatibility is important. But I don't think the way validation is defined needs to be backward-compatible. What needs to be backward-compatible is the set of valid modules (it's ok to grow the set but not to shrink it). The way this set is defined shouldn't matter, since implementations can choose to implement the membership test however they want; the actual validation rules are just one way to describe this set. But I also understand that without a formal specification it's easy to inadvertently define different sets, so it feels safer not to touch the validation definition too much (although proposals definitely will).
No worries :-)
Yes, apologies for any imprecision. I meant the notion of backwards-compatible that you're describing!
As @conrad-watt says, this test is intended. The underlying relaxation and the introduction of the bottom type were the smallest possible change that avoided the need for computing lubs/glbs during validation as well as the need for unification variables, neither of which would be desirable for Wasm. The same reasoning led to the current relaxed rule for `br_table` and the use of subtyping premises outside sequencing. There is a loooong history to the whole question of dead code validation. Wrt the current semantics, some of its original motivation is explained here.
Yes, that's an oversight. The notion of operand type was a temporary last-minute hack to fix fall-out from the late removal (deferral) of subtyping from the reftype proposal. So there are likely places I overlooked. Fortunately, that hack is going away again with the funcref proposal. :)
Thanks for the link to the rationale and the funcref proposal. Let me try to recap to see if I understand the situation correctly:
Feel free to close this PR (unless my understanding above is wrong).
Almost. The interpreter is actually correct to reject the snippet |
Oh I see, my bad. Thanks for the explanation! I'll edit my previous comment.
See #1524 for a fix to the editorial issues. |
I'll close this PR then. |