Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow arbitrary enums to have explicit discriminants #2363

Merged
merged 3 commits into from
May 5, 2019
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
295 changes: 295 additions & 0 deletions text/2363-arbitrary-enum-discriminant.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,295 @@
- Feature Name: `arbitrary_enum_discriminant`
- Start Date: 2018-03-11
- RFC PR: [rust-lang/rfcs#2363](https://github.com/rust-lang/rfcs/pull/2363)
- Rust Issue: [rust-lang/rust#60553](https://github.com/rust-lang/rust/issues/60553)

# Summary
[summary]: #summary

This RFC gives users a way to control the discriminants of variants of all
enumerations, not just the ones that are shaped like C-like enums (i.e. where
all the variants have no fields).

The change is minimal: allow any variant to be adorned with an explicit
discriminant value, whether or not that variant has any field.

# Motivation
[motivation]: #motivation

Stylo, the style system of Servo, represents CSS properties with a large
enumeration `PropertyDeclaration` where each variant has only one field which
represents the value of a given CSS property. Here is a subset of it:

```rust
#[repr(u16)]
enum PropertyDeclaration {
Color(Color),
Height(Length),
InlineSize(Length),
TransformOrigin(TransformOrigin),
}
```

For various book-keeping reasons, Servo also generates a `LonghandId`
enumeration with the same variants as `PropertyDeclaration` but without the
fields, thus making `LonghandId` a C-like enumeration:

```rust
#[derive(Clone, Copy)]
#[repr(u16)]
enum LonghandId {
Color,
Height,
InlineSize,
TransformOrigin,
}
```

Given that rustc guarantees that `#[repr(u16)]` enumerations start with their
discriminant stored as a `u16`, going from `&PropertyDeclaration` to
`LonghandId` is then just a matter of unsafely coercing `&self` as a
`&LonghandId`:

```rust
impl PropertyDeclaration {
fn id(&self) -> LonghandId {
unsafe { *(self as *const Self as *const LonghandId) }
}
}
```

This works great, but doesn't scale if we want to replicate this behaviour for
an enumeration that is a subset of `PropertyDeclaration`, for example an
enumeration `AnimationValue` that is limited to animatable properties:

```rust
#[repr(u16)]
enum AnimationValue {
Color(Color),
Height(Length),
TransformOrigin(TransformOrigin),
}

impl AnimationValue {
fn id(&self) -> LonghandId {
// We can't just unsafely read `&self` as a `&LonghandId` because
// the discriminant of `AnimationValue::TransformOrigin` isn't equal
// to `LonghandId::TransformOrigin` anymore.
match *self {
AnimationValue::Color(_) => LonghandId::Color,
AnimationValue::Height(_) => LonghandId::Height,
AnimationValue::TransformOrigin(_) => LonghandId::TransformOrigin,
}
}
}
```

This is not sustainable, as the jump table generated by rustc to compile this
huge match expression is larger than 4KB in the final Gecko binary, when this
operation could be a trivial `u16` copy. This is worked around in Servo by
generating spurious `Void` variants for the non-animatable properties in
`AnimationValue`:

```rust
enum Void {}

#[repr(u16)]
enum AnimationValue {
Color(Color),
Height(Length),
InlineSize(Void),
TransformOrigin(TransformOrigin),
}

impl AnimationValue {
fn id(&self) -> LonghandId
// We can use the unsafe trick again.
unsafe { *(self as *const Self as *const LonghandId) }
}
}
```

This is unfortunately quite painful to use, given now all methods matching
against `AnimationValue` need to have dummy arms for all of these variants:

```rust
impl AnimationValue {
fn do_something(&self) {
match *self {
AnimationValue::Color(ref color) => {
do_something_with_color(color)
}
AnimationValue::Height(ref height) => {
do_something_with_height(height)
}
// This shouldn't be needed.
AnimationValue::InlineSize(ref void) => {
match *void {}
}
AnimationValue::TransformOrigin(ref origin) => {
do_something_with_transform_origin(origin)
}
}
}
}
```

We suggest generalising the explicit discriminant notation to all enums,
regardless of whether their variants have fields or not:

```rust
#[repr(u16)]
enum AnimationValue {
Color(Color) = LonghandId::Color as u16,
Height(Length) = LonghandId::Height as u16,
TransformOrigin(TransformOrigin) = LonghandId::TransformOrigin as u16,
}

impl AnimationValue {
fn id(&self) -> LonghandId
// We can use the unsafe trick again.
unsafe { *(self as *const Self as *const LonghandId) }
}

fn do_something(&self) {
// No spurious variant anymore.
match *self {
AnimationValue::Color(ref color) => {
do_something_with_color(color)
}
AnimationValue::Height(ref height) => {
do_something_with_height(height)
}
AnimationValue::TransformOrigin(ref origin) => {
do_something_with_transform_origin(origin)
}
}
}
}
```

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

An enumeration with only field-less variants can currently have explicit
discriminant values:

```rust
enum ForceFromage {
Emmental = 0,
Camembert = 1,
Roquefort = 2,
}
```

With this RFC, users are allowed to put explicit discriminant values on any
variant of any enumeration, not just the ones where all variants are field-less:

```rust
enum ParisianSandwichIngredient {
Bread(BreadKind) = 0,
Ham(HamKind) = 1,
Butter(ButterKind) = 2,
}
```

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

## Grammar

The production for enumeration items becomes:

```
EnumItem :
OuterAttribute*
IDENTIFIER ( EnumItemTuple | EnumItemStruct)? EnumItemDiscriminant?
```

## Semantics

The limitation that only field-less enumerations can have explicit discriminant
values is lifted, and no other change is made to their semantics:

* enumerations with fields still can't be casted to numeric types
with the `as` operator;
* if the first variant doesn't have an explicit discriminant,
it is set to zero;
* any unspecified discriminant is set to one higher than the one from
the previous variant;
* under the default representation, the specified discriminants are
interpreted as `isize`;
* two variants cannot share the same discriminant.

# Drawbacks
[drawbacks]: #drawbacks

This introduces one more knob to the representation of enumerations.

# Rationale and alternatives
[alternatives]: #alternatives

Reusing the current syntax and semantics for explicit discriminants of
field-less enumerations means that the changes to the grammar and semantics of
the language are minimal. There are a few possible alternatives nonetheless.

## Discriminant values in attributes

We could specify the discriminant values in variant attributes, but this would
be at odds with the syntax for field-less enumerations.

```rust
enum ParisianSandwichIngredient {
#[discriminant = 0]
Bread(BreadKind),
#[discriminant = 1]
Ham(HamKind),
#[discriminant = 2]
Butter(ButterKind),
}
```

## Use discriminants of a separate field-less enumeration

We could tell rustc to tie the discriminants of the enumeration to the
variants of a separate field-less enumeration.

```rust
#[discriminant(IngredientKind)]
enum ParisianSandwichIngredient {
Bread(BreadKind),
Ham(HamKind),
Butter(ButterKind),
}

enum IngredientKind {
Bread,
Ham,
Butter,
}
```

This isn't applicable if such a separate field-less enumeration doesn't exist,
and this can easily be done as a procedural macro using the feature described
in this RFC. It also looks way more like spooky action at a distance.

# Prior art
[prior-art]: #prior-art

No prior art.

# Unresolved questions
[unresolved]: #unresolved-questions

## Should discriminants of enumerations with fields be specified as variant attributes?

Should they?

## Should this apply only to enumerations with an explicit representation?

Should it?

# Thanks

Thanks to Mazdak Farrokhzad (@Centril) and Simon Sapin (@SimonSapin) for
the reviews, and my local bakery for their delicious baguettes. 🥖