Add @feature and @since gates to WIT #332

Merged
merged 10 commits into from
May 28, 2024
148 changes: 140 additions & 8 deletions design/mvp/WIT.md
@@ -851,6 +851,83 @@ Concretely, the structure of a `wit` file is:
wit-file ::= package-decl? (toplevel-use-item | interface-item | world-item)*
```

### Feature Gates

Various WIT items can be "gated", to reflect the fact that the item is part of
an unstable feature or that the item was added as part of a minor version
update and shouldn't be used when targeting an earlier minor version.

For example, the following interface has 4 items, 3 of which are gated:
```wit
interface foo {
  a: func();

  @since(version = 0.2.1)
  b: func();

  @since(version = 0.2.2, feature = fancy-foo)
  c: func();

  @unstable(feature = fancier-foo)
  d: func();
}
```
The `@since` gate indicates that `b` and `c` were added as part of the `0.2.1`
and `0.2.2` releases, respectively. Thus, when building a component targeting, e.g.,
`0.2.1`, `b` can be used, but `c` cannot. An important expectation set by the
`@since` gate is that, once applied to an item, the item is not modified
incompatibly going forward (according to general semantic versioning rules).

In contrast, the `@unstable` gate on `d` indicates that `d` is part of the
`fancier-foo` feature that is still under active development and thus `d` may
change type or be removed at any time. An important expectation set by the
`@unstable` gate is that toolchains will not expose `@unstable` features
unless the developer explicitly opts in.

Together, these gates support a development flow in which new features start
with an `@unstable` gate while the details are still being hashed out. Then,
once the feature is stable (and, in a WASI context, voted upon), the
`@unstable` gate is switched to a `@since` gate. To enable a smooth transition
(during which producer toolchains are targeting a version earlier than the
`@since`-specified `version`), the `@since` gate contains an optional `feature`
field that, when present, says to enable the feature when *either* the target
version is greater-or-equal *or* the feature name is explicitly enabled by the
developer. Thus, `c` is enabled if the version is `0.2.2` or newer or the
`fancy-foo` feature is explicitly enabled by the developer. The `feature` field
can be removed once producer toolchains have updated their default version to
enable the feature by default.
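
The enablement rule described above can be sketched in a few lines of Python. This is only an illustration of the prose, not how any toolchain implements it: the `Gate` class, `is_enabled`, and `visible` are hypothetical names, and versions are assumed to be plain `major.minor.patch` strings with no pre-release or build metadata.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse a plain 'major.minor.patch' string (assumption: no pre-release tags)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


@dataclass(frozen=True)
class Gate:
    """Hypothetical in-memory form of a gate; not a real toolchain type."""
    since: Optional[str] = None    # version of an @since gate, if any
    feature: Optional[str] = None  # feature name of either gate kind, if any


def is_enabled(gate: Optional[Gate], target: str, features: Set[str]) -> bool:
    if gate is None:
        return True  # ungated items are always available
    if gate.feature is not None and gate.feature in features:
        return True  # explicit opt-in enables @unstable and @since items alike
    if gate.since is not None:
        # @since: enabled once the target version is greater-or-equal
        return parse_version(target) >= parse_version(gate.since)
    return False  # @unstable without an opt-in stays hidden


# The four items of the `foo` interface from the example above.
foo = {
    "a": None,
    "b": Gate(since="0.2.1"),
    "c": Gate(since="0.2.2", feature="fancy-foo"),
    "d": Gate(feature="fancier-foo"),
}


def visible(target: str, features=frozenset()):
    """List the items of `foo` that are enabled for a target version and feature set."""
    return [name for name, gate in foo.items() if is_enabled(gate, target, set(features))]
```

For instance, `visible("0.2.1")` keeps `a` and `b`, while `visible("0.2.1", {"fancy-foo"})` additionally keeps `c`, matching the transition period in which the toolchain still targets a version earlier than the `@since` version.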

Specifically, the syntax for feature gates is:
```wit
gate          ::= unstable-gate
                | since-gate
unstable-gate ::= '@unstable' '(' feature-field ')'
feature-field ::= 'feature' '=' id
since-gate    ::= '@since' '(' 'version' '=' <valid semver> ( ',' feature-field )? ')'
```
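
As a rough illustration of this grammar, the following Python sketch recognizes the two gate forms with regular expressions. It is not the parser used by any WIT tool: `parse_gate` and `ParsedGate` are hypothetical names, and the `_ID` and `_SEMVER` patterns are simplifying assumptions that are narrower than the real `id` and semver grammars.

```python
import re
from typing import NamedTuple, Optional

# Simplified patterns (assumptions): kebab-case ids and bare major.minor.patch versions.
_ID = r"[a-z][a-z0-9]*(?:-[a-z0-9]+)*"
_SEMVER = r"\d+\.\d+\.\d+"

# unstable-gate ::= '@unstable' '(' feature-field ')'
_UNSTABLE = re.compile(rf"@unstable\s*\(\s*feature\s*=\s*({_ID})\s*\)")
# since-gate ::= '@since' '(' 'version' '=' <valid semver> ( ',' feature-field )? ')'
_SINCE = re.compile(
    rf"@since\s*\(\s*version\s*=\s*({_SEMVER})\s*(?:,\s*feature\s*=\s*({_ID})\s*)?\)"
)


class ParsedGate(NamedTuple):
    kind: str                # "unstable" or "since"
    version: Optional[str]   # only set for @since
    feature: Optional[str]   # required for @unstable, optional for @since


def parse_gate(text: str) -> ParsedGate:
    """Parse one gate annotation, raising ValueError if it matches neither form."""
    text = text.strip()
    match = _UNSTABLE.fullmatch(text)
    if match:
        return ParsedGate("unstable", None, match.group(1))
    match = _SINCE.fullmatch(text)
    if match:
        return ParsedGate("since", match.group(1), match.group(2))
    raise ValueError(f"not a valid gate: {text!r}")
```

Note that, following the grammar, `@unstable` accepts only a `feature` field, so an input such as `@unstable(version = 0.2.1)` is rejected.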

As part of WIT validation, any item that refers to another gated item must also
be compatibly gated. For example, this is an error:
```wit
interface i {
  @since(version = 1.0.1)
  type t1 = u32;

  type t2 = t1; // error
}
```
Additionally, if an item is *contained* by a gated item, it must also be
compatibly gated. For example, this is an error:
```wit
@since(version = 1.0.2)
interface i {
  foo: func(); // error: no gate
Contributor

I certainly understand the reasoning behind requiring the redundant gate, but this does sound like a maintenance annoyance. Is it possible that this restriction could be lifted in the future and the default would become that items inherit the gates of the parent item?

Collaborator

Implementing it either way isn't really an issue, so in my mind it comes down to other reasons. At least with Rust, `#[stable]` is exclusively used by the standard library, so ergonomics aren't necessarily a high-priority concern as only a few authors interact with it. Additionally, many methods/functions have dozens-to-hundreds of lines of documentation in modules with dozens of functions, so the distance between `@since` on an interface and a func itself may actually be quite large.

That's what personally makes me lean towards requiring `@since` on items everywhere, primarily because it makes the WIT easier to read.


  @since(version = 1.0.1)
  bar: func(); // also error: weaker gate
}
```
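
The two validation rules above (an item referring to a gated item, and an item contained by a gated item) can be sketched as a single check. This sketch assumes that "compatibly gated" means the item's `@since` version is greater than or equal to the version of the item it refers to or is contained in; that reading is inferred from the "no gate" and "weaker gate" errors in the examples and is not spelled out in the text. `compatibly_gated` is a hypothetical helper, `None` stands for "no gate", and `@unstable` gates are not modelled.

```python
from typing import Optional, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse a plain 'major.minor.patch' string (assumption: no pre-release tags)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


def compatibly_gated(item_since: Optional[str], required_since: Optional[str]) -> bool:
    """Check an item's @since version against the gate it depends on.

    `required_since` is the @since version of the referenced or containing
    item; `item_since` is the @since version of the item being validated.
    """
    if required_since is None:
        return True   # the other item is ungated, so nothing is required
    if item_since is None:
        return False  # "error: no gate"
    # "weaker gate": the item would claim to exist before the thing it relies on.
    return parse_version(item_since) >= parse_version(required_since)
```

Under this reading, `t2` (ungated, referring to `t1` at `1.0.1`) and `foo` (ungated, inside `i` at `1.0.2`) both fail, as does `bar` (`1.0.1` inside `1.0.2`), while an item gated at `1.0.2` or later inside `i` passes.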

## Package declaration

WIT files optionally start with a package declaration which defines the ID of
@@ -890,14 +967,21 @@ nesting both namespaces and packages, which would then generalize the syntax of

## Item: `world`

-Worlds define a [componenttype](https://github.com/WebAssembly/component-model/blob/main/design/mvp/Explainer.md#type-definitions) as a collection of imports and exports.
+Worlds define a [`componenttype`] as a collection of imports and exports, all
+of which can be gated.

Concretely, the structure of a world is:

```ebnf
-world-item ::= 'world' id '{' world-items* '}'
+world-item ::= gate 'world' id '{' world-items* '}'

-world-items ::= export-item | import-item | use-item | typedef-item | include-item
+world-items ::= gate world-definition
+
+world-definition ::= export-item
+                   | import-item
+                   | use-item
+                   | typedef-item
+                   | include-item

export-item ::= 'export' id ':' extern-type
| 'export' use-path ';'
@@ -912,6 +996,8 @@ from the root of a component and used within functions imported and exported.
The `interface` item here additionally defines the grammar for IDs used to refer
to `interface` items.

[`componenttype`]: Explainer.md#type-definitions

## Item: `include`

An `include` statement enables the union of the current world with another world. The structure of an `include` statement is:
@@ -934,18 +1020,20 @@ include-names-item ::= id 'as' id
## Item: `interface`

Interfaces can be defined in a `wit` file. Interfaces have a name and a
-sequence of items and functions.
+sequence of items and functions, all of which can be gated.

Specifically interfaces have the structure:

> **Note**: The symbol `ε`, also known as Epsilon, denotes an empty string.

```ebnf
-interface-item ::= 'interface' id '{' interface-items* '}'
+interface-item ::= gate 'interface' id '{' interface-items* '}'

-interface-items ::= typedef-item
-                  | use-item
-                  | func-item
+interface-items ::= gate interface-definition
+
+interface-definition ::= typedef-item
+                       | use-item
+                       | func-item

typedef-item ::= resource-item
| variant-items
@@ -970,6 +1058,7 @@ named-type-list ::= ϵ
named-type ::= id ':' ty
```


## Item: `use`

A `use` statement enables importing type or resource definitions from other
@@ -1626,3 +1715,46 @@ standalone interface definitions (such `wasi:http/handler`) are no longer in a
`use`s are replaced by direct aliases to preceding type imports as determined
by the WIT resolution process.

Unlike most other WIT constructs, the `@since` and `@unstable` gates are not
represented in the component binary. Instead, they are considered "macro"
constructs that take the place of maintaining two copies of a single WIT
document. In particular, when encoding a collection of WIT documents into a
binary, the target version and set of explicitly-enabled feature names
determine whether individual gated features are included in the encoded type or
not.

For example, the following WIT document:
```wit
package ns:[email protected];

interface i {
f: func();

@since(version = 1.1.0)
g: func();
}
```
is encoded as the following component when the target version is `1.0.0`:
Collaborator

My impression is that the main intention for `@since` and `@unstable` is less about targeting old versions of WASI and more about enabling a story for in-progress feature development. Along those lines, I do not yet plan on implementing (not that it can't be done, I just don't plan on doing it initially) support for "view this WIT from version 0.2.0". Instead, what I plan on implementing is "view this WIT with these features active".

Given that would it perhaps make more sense to change this example to showcase the gating in that regard?

Member Author

Just to talk through the workflow you're planning for how 0.2.1 is released: I imagined that, once 0.2.1 becomes official via the WASI SG, the interfaces/functions added by 0.2.1 would receive the @since(version = "0.2.1") gate so that producer toolchains can immediately pull in the new WITs but keep their default version at 0.2.0 for the transition period where most runtimes don't yet have 0.2.1 deployed (so that the default output continues to run in most places). If that was the case, then I would imagine you would need @since (in addition to @unstable) in the short-term. But are you planning the roll-out of 0.2.1 with a different sequencing or use of gates?

Collaborator

Oh my assumption has been that when 0.2.1 is released everyone updates on their own schedule. If guest languages update before runtimes that's ok because a runtime would see an 0.2.1 import but realize it has an 0.2.0 version and would work ok. The only bad case would be when you use something only available in 0.2.1 and run it on an 0.2.0 runtime.

Given that, there's no need for languages to pull in 0.2.1 WITs but pretend they're 0.2.0.

Member Author

Is there perhaps value in an intermediate state in which:

  • The newer (0.2.1) WIT is present in the toolchain, and so it's possible to use it if you want it.
  • But it's not yet made available by default because, once it's available, various things might pull it in unnecessarily, leading to real failures (because the feature is actually being used). Hypothetical examples I can think of include: (1) standard libraries that use the feature if it's present, even though they have a fallback path that doesn't need the feature, (2) developers who use it not because they absolutely needed it, but because it was there and they didn't know it was going to be an issue.

```wat
(component
  (type (export "i") (component
    (export "ns:p/[email protected]" (instance
      (export "f" (func))
    ))
  ))
)
```
If the target version was instead `1.1.0`, the same WIT document would be
encoded as:
```wat
(component
  (type (export "i") (component
    (export "ns:p/[email protected]" (instance
      (export "f" (func))
      (export "g" (func))
    ))
  ))
)
```
Thus, `@since` and `@unstable` gates are not part of the runtime semantics of
Contributor

I'm curious what happens if I erroneously compile against an interface that is too far forward? That is, I compile against 0.2.2, but then try to run on a runtime that only knows about 0.2.1. So in this example, my consumer will potentially try to use method c on this interface, but the runtime (or component generic implementor, if we're not talking about WASI) won't know about it yet.

Is my reading of this document correct that there's nothing we can do about this at compile time, since we can't deduce which runtime we'll be executing on at that stage? And if so, it seems like there's nothing we can do about it at runtime either due to the restriction in this section.

This might end up being a frustrating experience for users. I'm not well-versed enough in the details yet to know if the consumer will fail at start time (when it fails to find the function to fulfill the import it expects), or while running (when it makes a call to c that can't be handled). The latter of those seems especially challenging, as one could easily miss it during testing.

Member

> Is my reading of this document correct that there's nothing we can do about this at compile time, since we can't deduce which runtime we'll be executing on at that stage?

That's correct, yes. The only real way to do something about this is to have the toolchain target a world (and version) that is known to be supported by the targeted runtime. I think that's pretty fundamentally true, and not even just for WIT: to give another example, if in a natively-compiled application you target a specific version of an operating system and make use of functionality not available in older versions, your application won't work.

> And if so, it seems like there's nothing we can do about it at runtime either due to the restriction in this section.

Can you say which restrictions you mean, and how they could be changed to address this?

One thing we do want to do, but that's separate from this change, is to support optional imports/exports. Those would allow developers to make use of functionality if it is available, but not forcibly rely on it.

And separately, note that the Component Model does have other ways to address all this. Specifically, since all APIs can be virtualized, it's possible to eliminate imports by wrapping a component in another one that provides an implementation of that import in terms of other functionality. As just one scenario, this could be done as part of a deployment pipeline when that pipeline detects that the runtime environment is lacking some APIs.

> This might end up being a frustrating experience for users. I'm not well-versed enough in the details yet to know if the consumer will fail at start time (when it fails to find the function to fulfill the import it expects), or while running (when it makes a call to c that can't be handled). The latter of those seems especially challenging, as one could easily miss it during testing.

This would show up as a link-time error, not at runtime. But again, besides what I described above it's not clear what could be done about this.

Member Author

There's one other feature of runtimes that helps mitigate this scenario that @azaslavsky is describing (which, iiuc, is what is currently implemented in Wasmtime): let's say I compile my component targeting a world that imports [email protected] but my component only uses features also present in [email protected]. In this scenario, Wasmtime will ignore the minor version part of [email protected], and just see if it has an implementation of i@1. If the runtime only has an implementation of [email protected], then linking will succeed (again, assuming the component is only using functions also present in [email protected]). Thus, we can leverage semantic versioning to be a bit more permissive than otherwise.

Contributor

> That's correct, yes. The only real way to do something about this is to have the toolchain target a world (and version) that is known to be supported by the targeted runtime. I think that's pretty fundamentally true, and not even just for WIT: to give another example, if in a natively-compiled application you target a specific version of an operating system and make use of functionality not available in older versions, your application won't work.

That's true, though on (most) traditional Unix/Linux systems, your interfaces with the outside world are either an arbitrary copy of libc (if you're talking to the system itself), or an untyped, random-bytes-on-the-wire IPC message, if you are talking to another user space program. The orchestrating system doesn't have much visibility into which kinds of interactions the binary it's running expects.

Since all WASM interfaces with the outside world are so well-described and typed, I was imagining that we could provide more information than say, a Unix would when you try to use a binary compiled for an interface ahead of what your system implements.

Taking a variant of the example above, of function c that exists in 0.2.2 but not 0.2.1, I could imagine the following error scenarios, listed in order of perceived developer ergonomics:

  1. Arbitrary runtime failure a la unix. It seems like we avoid this in all cases by checking the interfaces at link time.
  2. A link time error with no further description: "The interfaces of the consumer and implementor did not match".
  3. A link time error with an explanation of what is wrong: "The consuming component expected c, but the implementing component does not provide c."
  4. A link time error with an explanation of what is wrong, plus version info: "The consuming component (version 0.2.3) expected c, but the implementing component (version 0.2.1) does not provide c." I think this is the best we could do as things stand.
  5. A link time error with an explanation of what is wrong, plus a recommendation on how to fix: "The consuming component expected c, but the implementing component does not provide c. It was added in version 0.2.2." I think this is especially useful for someone composing complex software from a tree of many components, without necessarily knowing much about any of their specifics.

All I'm saying is that the last of these is the most ergonomic and actionable error for a linker to provide: it tells you exactly which version to bump to and why. But I could see an argument that such things are best handled by the package manager service/client that you use to pull your components (wa.dev, etc.), though that would prevent, for example, local debug runs from seeing messaging like the above.

Anyway, I see the downsides of exposing this information at link time too, like it becoming load-bearing in unexpected ways. Maybe it is something left to some future revision, or not implemented at all. :)

> Can you say which restrictions you mean, and how they could be changed to address this?

My reading of this document is that it prohibits exposing the since or feature annotations in a manner that allows runtimes to see them.

> There's one other feature of runtimes that helps mitigate this scenario that @azaslavsky is describing (which, iiuc, is what is currently implemented in Wasmtime): let's say I compile my component targeting a world that imports [email protected] but my component only uses features also present in [email protected]. In this scenario, Wasmtime will ignore the minor version part of [email protected], and just see if it has an implementation of i@1.

I see, that seems like it covers most of this use case. Is this diffing aware of the versions, or is that information already erased, and it just checks to see if everything that the consumer requires happens to be provided by the implementor?

Member Author

The version of the imported interface is present in the component's import string (see the grammar for interfacename) and thus when there is an error (let's say my component is using a function added by 1.1.0 that isn't present in the host's 1.0.0), the host should be able to synthesize a nice error message explaining that the component requires 1.1.0 but the host only has 1.0.0 (which is the error it would've given unconditionally if we didn't do this permissive "ignore the minor/patch version when resolving names" trick).

components, just part of the source-level tooling for producing components.