diff --git a/text/2363-arbitrary-enum-discriminant.md b/text/2363-arbitrary-enum-discriminant.md new file mode 100644 index 00000000000..ae3c5eaa774 --- /dev/null +++ b/text/2363-arbitrary-enum-discriminant.md @@ -0,0 +1,295 @@ +- Feature Name: `arbitrary_enum_discriminant` +- Start Date: 2018-03-11 +- RFC PR: [rust-lang/rfcs#2363](https://github.com/rust-lang/rfcs/pull/2363) +- Rust Issue: [rust-lang/rust#60553](https://github.com/rust-lang/rust/issues/60553) + +# Summary +[summary]: #summary + +This RFC gives users a way to control the discriminants of variants of all +enumerations, not just the ones that are shaped like C-like enums (i.e. where +all the variants have no fields). + +The change is minimal: allow any variant to be adorned with an explicit +discriminant value, whether or not that variant has any field. + +# Motivation +[motivation]: #motivation + +Stylo, the style system of Servo, represents CSS properties with a large +enumeration `PropertyDeclaration` where each variant has only one field which +represents the value of a given CSS property. Here is a subset of it: + +```rust +#[repr(u16)] +enum PropertyDeclaration { + Color(Color), + Height(Length), + InlineSize(Length), + TransformOrigin(TransformOrigin), +} +``` + +For various book-keeping reasons, Servo also generates a `LonghandId` +enumeration with the same variants as `PropertyDeclaration` but without the +fields, thus making `LonghandId` a C-like enumeration: + +```rust +#[derive(Clone, Copy)] +#[repr(u16)] +enum LonghandId { + Color, + Height, + InlineSize, + TransformOrigin, +} +``` + +Given that rustc guarantees that `#[repr(u16)]` enumerations start with their +discriminant stored as a `u16`, going from `&PropertyDeclaration` to +`LonghandId` is then just a matter of unsafely coercing `&self` as a +`&LonghandId`: + +```rust +impl PropertyDeclaration { + fn id(&self) -> LonghandId { + unsafe { *(self as *const Self as *const LonghandId) } + } +} +``` + +This works great, but doesn't scale if we want to replicate this behaviour for +an enumeration that is a subset of `PropertyDeclaration`, for example an +enumeration `AnimationValue` that is limited to animatable properties: + +```rust +#[repr(u16)] +enum AnimationValue { + Color(Color), + Height(Length), + TransformOrigin(TransformOrigin), +} + +impl AnimationValue { + fn id(&self) -> LonghandId { + // We can't just unsafely read `&self` as a `&LonghandId` because + // the discriminant of `AnimationValue::TransformOrigin` isn't equal + // to `LonghandId::TransformOrigin` anymore. + match *self { + AnimationValue::Color(_) => LonghandId::Color, + AnimationValue::Height(_) => LonghandId::Height, + AnimationValue::TransformOrigin(_) => LonghandId::TransformOrigin, + } + } +} +``` + +This is not sustainable, as the jump table generated by rustc to compile this +huge match expression is larger than 4KB in the final Gecko binary, when this +operation could be a trivial `u16` copy. This is worked around in Servo by +generating spurious `Void` variants for the non-animatable properties in +`AnimationValue`: + +```rust +enum Void {} + +#[repr(u16)] +enum AnimationValue { + Color(Color), + Height(Length), + InlineSize(Void), + TransformOrigin(TransformOrigin), +} + +impl AnimationValue { + fn id(&self) -> LonghandId + // We can use the unsafe trick again. + unsafe { *(self as *const Self as *const LonghandId) } + } +} +``` + +This is unfortunately quite painful to use, given now all methods matching +against `AnimationValue` need to have dummy arms for all of these variants: + +```rust +impl AnimationValue { + fn do_something(&self) { + match *self { + AnimationValue::Color(ref color) => { + do_something_with_color(color) + } + AnimationValue::Height(ref height) => { + do_something_with_height(height) + } + // This shouldn't be needed. + AnimationValue::InlineSize(ref void) => { + match *void {} + } + AnimationValue::TransformOrigin(ref origin) => { + do_something_with_transform_origin(origin) + } + } + } +} +``` + +We suggest generalising the explicit discriminant notation to all enums, +regardless of whether their variants have fields or not: + +```rust +#[repr(u16)] +enum AnimationValue { + Color(Color) = LonghandId::Color as u16, + Height(Length) = LonghandId::Height as u16, + TransformOrigin(TransformOrigin) = LonghandId::TransformOrigin as u16, +} + +impl AnimationValue { + fn id(&self) -> LonghandId + // We can use the unsafe trick again. + unsafe { *(self as *const Self as *const LonghandId) } + } + + fn do_something(&self) { + // No spurious variant anymore. + match *self { + AnimationValue::Color(ref color) => { + do_something_with_color(color) + } + AnimationValue::Height(ref height) => { + do_something_with_height(height) + } + AnimationValue::TransformOrigin(ref origin) => { + do_something_with_transform_origin(origin) + } + } + } +} +``` + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +An enumeration with only field-less variants can currently have explicit +discriminant values: + +```rust +enum ForceFromage { + Emmental = 0, + Camembert = 1, + Roquefort = 2, +} +``` + +With this RFC, users are allowed to put explicit discriminant values on any +variant of any enumeration, not just the ones where all variants are field-less: + +```rust +enum ParisianSandwichIngredient { + Bread(BreadKind) = 0, + Ham(HamKind) = 1, + Butter(ButterKind) = 2, +} +``` + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +## Grammar + +The production for enumeration items becomes: + +``` +EnumItem : + OuterAttribute* + IDENTIFIER ( EnumItemTuple | EnumItemStruct)? EnumItemDiscriminant? +``` + +## Semantics + +The limitation that only field-less enumerations can have explicit discriminant +values is lifted, and no other change is made to their semantics: + + * enumerations with fields still can't be casted to numeric types + with the `as` operator; + * if the first variant doesn't have an explicit discriminant, + it is set to zero; + * any unspecified discriminant is set to one higher than the one from + the previous variant; + * under the default representation, the specified discriminants are + interpreted as `isize`; + * two variants cannot share the same discriminant. + +# Drawbacks +[drawbacks]: #drawbacks + +This introduces one more knob to the representation of enumerations. + +# Rationale and alternatives +[alternatives]: #alternatives + +Reusing the current syntax and semantics for explicit discriminants of +field-less enumerations means that the changes to the grammar and semantics of +the language are minimal. There are a few possible alternatives nonetheless. + +## Discriminant values in attributes + +We could specify the discriminant values in variant attributes, but this would +be at odds with the syntax for field-less enumerations. + +```rust +enum ParisianSandwichIngredient { + #[discriminant = 0] + Bread(BreadKind), + #[discriminant = 1] + Ham(HamKind), + #[discriminant = 2] + Butter(ButterKind), +} +``` + +## Use discriminants of a separate field-less enumeration + +We could tell rustc to tie the discriminants of the enumeration to the +variants of a separate field-less enumeration. + +```rust +#[discriminant(IngredientKind)] +enum ParisianSandwichIngredient { + Bread(BreadKind), + Ham(HamKind), + Butter(ButterKind), +} + +enum IngredientKind { + Bread, + Ham, + Butter, +} +``` + +This isn't applicable if such a separate field-less enumeration doesn't exist, +and this can easily be done as a procedural macro using the feature described +in this RFC. It also looks way more like spooky action at a distance. + +# Prior art +[prior-art]: #prior-art + +No prior art. + +# Unresolved questions +[unresolved]: #unresolved-questions + +## Should discriminants of enumerations with fields be specified as variant attributes? + +Should they? + +## Should this apply only to enumerations with an explicit representation? + +Should it? + +# Thanks + +Thanks to Mazdak Farrokhzad (@Centril) and Simon Sapin (@SimonSapin) for +the reviews, and my local bakery for their delicious baguettes. 🥖