From 9da95c4b3cbde546efd438505401045b7847b6df Mon Sep 17 00:00:00 2001 From: ltdk Date: Fri, 14 Jun 2024 22:11:13 -0400 Subject: [PATCH 1/9] repr(type) for type aliases --- text/0000-repr-type-aliases.md | 70 ++++++++++++++++++++++++++++++++++ 1 file changed, 70 insertions(+) create mode 100644 text/0000-repr-type-aliases.md diff --git a/text/0000-repr-type-aliases.md b/text/0000-repr-type-aliases.md new file mode 100644 index 00000000000..d41303b4a4d --- /dev/null +++ b/text/0000-repr-type-aliases.md @@ -0,0 +1,70 @@ +- Feature Name: `repr_type_aliases` +- Start Date: 2024-06-14 +- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + +# Summary +[summary]: #summary + +Primitive representations on enums now accept type aliases, meaning that in addition to primitives like `#[repr(u32)]`, `#[repr(core::ffi::c_int)]` and `#[repr(self::my_type)]` are now accepted. + +# Motivation +[motivation]: #motivation + +For the same reasons why type aliases are useful, having type aliases in `repr` attributes would also be useful. A few examples: + +* Types depend on an external API whose exact size may be uncertain. (e.g. `core::ffi::c_int`, `gl::types::GLsizei`) +* An internal API might want to be able to easily change a type later. +* The intent behind a type alias may be clearer than simply using the primitive directly. + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +Enums allow `#[repr(type)]` attributes to offer an explicit discriminant type. (`type` can be any primitive integer type, like `u8`, `i32`, or `usize`, but not `char`.) If all variants of the enum are unit variants, this means that the enum will be easily castable to `type` using `as`. Otherwise, the discriminant will still be of the specified type, but unsafe code is required to actually access it. + +In addition to the primitive types themselves, you can also use the path to a type alias in the `repr` attribute instead, and it will resolve the primitive type of the type alias. However, to ensure compatibility as new potential representations are added, the path to the alias must contain a double-colon: you can access an alias `Alias` defined in the same module by using `self::Alias`. + +For example, `#[repr(core::ffi::c_int)]` is valid because it contains a double-colon, but a `use core::ffi::c_int` followed by `#[repr(c_int)]` is not. If you wanted to `use core::ffi::c_int` first, then you could still do `#[repr(self::c_int)]` to reference the type. + +You can use any type alias in the `repr` attribute, but it *must* be an alias to an accepted primitive type like `u8` or `i32`, and cannot be a pointer, reference, struct, etc. + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation + +The `repr` attribute now accepts arguments containing double-colon tokens, which will be parsed as paths to type aliases to resolve. If those type aliases resolve to a valid primitive type which can be used in the `repr` attribute, that will be used as the actual discriminant representation. + +An additional, automatically-applicable lint should be added if a user references a valid type alias in the current scope without including multiple components in the path, recommending to add `self::` to the beginning to ensure forward-compatibility. + +# Drawbacks +[drawbacks]: #drawbacks + +The requirement for `self::` on already-imported types is unfortunate, but it feels like the best way to ensure that adding new representations isn't a breaking change going forward. Even if we were to decide it weren't a "breaking change," it would still break things anyway, being de-facto breaking. + +And, of course, this complicates the compiler. But that's about it. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +We could always not do this. + +But more realistically, here's an alternative design that would avoid the `self::` change, whose complexity feels worse than the `self::` requirement: + +* Until a future edition, the current set of valid representations is solidified as taking precedence over any shadowed identifiers. For example, if someone defines `type transparent = u32`, then `repr(transparent)` still means `repr(transparent)` and not `repr(u32)`. +* At said future edition, type aliases now shadow all valid representations. So, for example, defining `type transparent = u32` would truly mean `repr(u32)`, not `repr(transparent)`. The only way to actually reference `repr(transparent)` would be to not have a type inside the current scope named `transparent`. There could be a deny-by-default warning when people do this. + +Alternatively, you could require that the types start with capital letters- oh, right, `repr(C)` is a thing. It feels like there's not a good way to solve the shadowing problem besides adding in something that will never be a part of future representation names, that is, a double-colon token representing a path. + +# Prior art +[prior-art]: #prior-art + +No known prior art exists. + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +None currently. + +# Future possibilities +[future-possibilities]: #future-possibilities + +Future RFCs like #3607 propose explicit methods of obtaining enum discriminants, and that further justifies the desire to include a change like this. There aren't many other extensions that could be added, however. From 2ce777ca466b8681c0e079908705a8e15fde2357 Mon Sep 17 00:00:00 2001 From: ltdk Date: Fri, 14 Jun 2024 22:17:18 -0400 Subject: [PATCH 2/9] Few minor changes after reading rendered --- text/0000-repr-type-aliases.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/text/0000-repr-type-aliases.md b/text/0000-repr-type-aliases.md index d41303b4a4d..4fe0ec0ee0b 100644 --- a/text/0000-repr-type-aliases.md +++ b/text/0000-repr-type-aliases.md @@ -51,6 +51,7 @@ But more realistically, here's an alternative design that would avoid the `self: * Until a future edition, the current set of valid representations is solidified as taking precedence over any shadowed identifiers. For example, if someone defines `type transparent = u32`, then `repr(transparent)` still means `repr(transparent)` and not `repr(u32)`. * At said future edition, type aliases now shadow all valid representations. So, for example, defining `type transparent = u32` would truly mean `repr(u32)`, not `repr(transparent)`. The only way to actually reference `repr(transparent)` would be to not have a type inside the current scope named `transparent`. There could be a deny-by-default warning when people do this. +* Alternatively, you could continue the dance at every future edition, making each new edition define the unshadowable representations allowed. Alternatively, you could require that the types start with capital letters- oh, right, `repr(C)` is a thing. It feels like there's not a good way to solve the shadowing problem besides adding in something that will never be a part of future representation names, that is, a double-colon token representing a path. @@ -67,4 +68,6 @@ None currently. # Future possibilities [future-possibilities]: #future-possibilities -Future RFCs like #3607 propose explicit methods of obtaining enum discriminants, and that further justifies the desire to include a change like this. There aren't many other extensions that could be added, however. +Future RFCs like [#3607] propose explicit methods of obtaining enum discriminants, and that further justifies the desire to include a change like this. There aren't many other extensions that could be added, however. + +[#3607]: https://github.com/rust-lang/rfcs/pull/3607 From 4975077c652d372a4281f3f53b57b8dc667dc2e5 Mon Sep 17 00:00:00 2001 From: ltdk Date: Sun, 16 Jun 2024 14:08:02 -0400 Subject: [PATCH 3/9] Use type = syntax suggestion --- text/0000-repr-type-aliases.md | 37 ++++++++++++++++++++++------------ 1 file changed, 24 insertions(+), 13 deletions(-) diff --git a/text/0000-repr-type-aliases.md b/text/0000-repr-type-aliases.md index 4fe0ec0ee0b..7aafe563c62 100644 --- a/text/0000-repr-type-aliases.md +++ b/text/0000-repr-type-aliases.md @@ -6,7 +6,7 @@ # Summary [summary]: #summary -Primitive representations on enums now accept type aliases, meaning that in addition to primitives like `#[repr(u32)]`, `#[repr(core::ffi::c_int)]` and `#[repr(self::my_type)]` are now accepted. +Primitive representations on enums now accept type aliases, meaning that in addition to primitives like `#[repr(u32)]`, `#[repr(type = core::ffi::c_int)]` and `#[repr(type = my_type)]` are now accepted. # Motivation [motivation]: #motivation @@ -20,25 +20,25 @@ For the same reasons why type aliases are useful, having type aliases in `repr` # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -Enums allow `#[repr(type)]` attributes to offer an explicit discriminant type. (`type` can be any primitive integer type, like `u8`, `i32`, or `usize`, but not `char`.) If all variants of the enum are unit variants, this means that the enum will be easily castable to `type` using `as`. Otherwise, the discriminant will still be of the specified type, but unsafe code is required to actually access it. +Enums allow `#[repr(type = ...)]` attributes to offer an explicit discriminant type. (`...` can be any primitive integer type, like `u8`, `i32`, or `usize`, but not `char`.) If all variants of the enum are unit variants, this means that the enum will be easily castable to `type` using `as`. Otherwise, the discriminant will still be of the specified type, but unsafe code is required to actually access it. -In addition to the primitive types themselves, you can also use the path to a type alias in the `repr` attribute instead, and it will resolve the primitive type of the type alias. However, to ensure compatibility as new potential representations are added, the path to the alias must contain a double-colon: you can access an alias `Alias` defined in the same module by using `self::Alias`. - -For example, `#[repr(core::ffi::c_int)]` is valid because it contains a double-colon, but a `use core::ffi::c_int` followed by `#[repr(c_int)]` is not. If you wanted to `use core::ffi::c_int` first, then you could still do `#[repr(self::c_int)]` to reference the type. +To ensure compatibility, the `#[repr(type = ...)]` form is required if the type is not one of the known primitive types. Note that this form is not necessarily equivalent to using the primitive representations directly, since shadowing is possible; for example, if you did `type u32 = u8` and then `#[repr(type = u32)]`, this would be equivalent to `#[repr(u8)]`, not `#[repr(u32)]`. You can use any type alias in the `repr` attribute, but it *must* be an alias to an accepted primitive type like `u8` or `i32`, and cannot be a pointer, reference, struct, etc. # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -The `repr` attribute now accepts arguments containing double-colon tokens, which will be parsed as paths to type aliases to resolve. If those type aliases resolve to a valid primitive type which can be used in the `repr` attribute, that will be used as the actual discriminant representation. +The `repr` attribute now accepts a `type = ...` argument to indicate a resolved path instead of a well-known primitive type. If those the path resolves to a type alias to a valid primitive type which can be used in the `repr` attribute, that will be used as the actual discriminant representation. + +An additional, automatically-applicable lint should be added that warns a user if they use `type = ...` for a well-known primitive type, since adding `type = ` instead of using the type directly introduces the possibility of shadowing. (For example, `#[repr(type = u32)]` becomes `#[repr(u32)]`.) -An additional, automatically-applicable lint should be added if a user references a valid type alias in the current scope without including multiple components in the path, recommending to add `self::` to the beginning to ensure forward-compatibility. +Similarly, an automatically-applicable lint should be added that warns a user if a `repr` argument references in an-scope type alias without the `type = ` prefix. (For example, `#[repr(MyType)]` becomes `#[repr(type = MyType)]`.) # Drawbacks [drawbacks]: #drawbacks -The requirement for `self::` on already-imported types is unfortunate, but it feels like the best way to ensure that adding new representations isn't a breaking change going forward. Even if we were to decide it weren't a "breaking change," it would still break things anyway, being de-facto breaking. +The requirement for `type =` is unfortunate, but it feels like the best way to ensure that adding new representations isn't a breaking change going forward. Even if we were to decide it weren't a "breaking change," it would still break things anyway, being de-facto breaking. And, of course, this complicates the compiler. But that's about it. @@ -47,13 +47,24 @@ And, of course, this complicates the compiler. But that's about it. We could always not do this. -But more realistically, here's an alternative design that would avoid the `self::` change, whose complexity feels worse than the `self::` requirement: +But more realistically, here are some alternative designs that were rejected. + +## `self::` + +We could, instead of using `type =`, require that all types contain a double-colon to indicate they're a path, effectively preventing collisions with arguments that aren't paths. This would require using `self::` for types that are imported in the local scope, and was actually the first proposal of this RFC, but wasn't very well-received. + +## Shadowing attributes + +Until a future edition, the current set of valid representations could be solidified as taking precedence over any shadowed identifiers. For example, if someone defines `type transparent = u32`, then `repr(transparent)` still means `repr(transparent)` and not `repr(u32)`. + +In future editions, we could either: + +* Let type aliases shadow all valid representations. This isn't ideal since there is no way to override the shadowing besides nesting your code in a new module and then re-exporting it outside that module, which is very messy. +* Expand the list of unshadowable representations every edition where necessary. -* Until a future edition, the current set of valid representations is solidified as taking precedence over any shadowed identifiers. For example, if someone defines `type transparent = u32`, then `repr(transparent)` still means `repr(transparent)` and not `repr(u32)`. -* At said future edition, type aliases now shadow all valid representations. So, for example, defining `type transparent = u32` would truly mean `repr(u32)`, not `repr(transparent)`. The only way to actually reference `repr(transparent)` would be to not have a type inside the current scope named `transparent`. There could be a deny-by-default warning when people do this. -* Alternatively, you could continue the dance at every future edition, making each new edition define the unshadowable representations allowed. +## Capital letters -Alternatively, you could require that the types start with capital letters- oh, right, `repr(C)` is a thing. It feels like there's not a good way to solve the shadowing problem besides adding in something that will never be a part of future representation names, that is, a double-colon token representing a path. +You could require that the types start with capital letters- oh, right, `repr(C)` is a thing. # Prior art [prior-art]: #prior-art From 3302274fd3566c0f19a0349176dda21a0bff600e Mon Sep 17 00:00:00 2001 From: ltdk Date: Fri, 21 Jun 2024 15:31:00 -0400 Subject: [PATCH 4/9] Third pass: replace type with discriminant, expand the section on shadowing, and alter the recommended list of lints --- text/0000-repr-type-aliases.md | 36 +++++++++++++++++++++++++--------- 1 file changed, 27 insertions(+), 9 deletions(-) diff --git a/text/0000-repr-type-aliases.md b/text/0000-repr-type-aliases.md index 7aafe563c62..909d2290b80 100644 --- a/text/0000-repr-type-aliases.md +++ b/text/0000-repr-type-aliases.md @@ -6,7 +6,7 @@ # Summary [summary]: #summary -Primitive representations on enums now accept type aliases, meaning that in addition to primitives like `#[repr(u32)]`, `#[repr(type = core::ffi::c_int)]` and `#[repr(type = my_type)]` are now accepted. +Primitive representations on enums now accept type aliases, meaning that in addition to primitives like `#[repr(u32)]`, `#[repr(discriminant = core::ffi::c_int)]` and `#[repr(discriminant = my_type)]` are now accepted. # Motivation [motivation]: #motivation @@ -20,25 +20,33 @@ For the same reasons why type aliases are useful, having type aliases in `repr` # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -Enums allow `#[repr(type = ...)]` attributes to offer an explicit discriminant type. (`...` can be any primitive integer type, like `u8`, `i32`, or `usize`, but not `char`.) If all variants of the enum are unit variants, this means that the enum will be easily castable to `type` using `as`. Otherwise, the discriminant will still be of the specified type, but unsafe code is required to actually access it. +Enums allow `#[repr(discriminant = type)]` attributes to offer an explicit discriminant type. (`type` can be any primitive integer type, like `u8`, `i32`, or `usize`, but not `char`.) If all variants of the enum are unit variants, this means that the enum will be easily castable to `type` using `as`. Otherwise, the discriminant will still be of the specified type, but unsafe code is required to actually access it. (Future RFCs like [#3607] might change this.) -To ensure compatibility, the `#[repr(type = ...)]` form is required if the type is not one of the known primitive types. Note that this form is not necessarily equivalent to using the primitive representations directly, since shadowing is possible; for example, if you did `type u32 = u8` and then `#[repr(type = u32)]`, this would be equivalent to `#[repr(u8)]`, not `#[repr(u32)]`. +[#3607]: https://github.com/rust-lang/rfcs/pull/3607 + +To ensure compatibility, the `#[repr(discriminant = type)]` form is required if the type is a type alias instead of a known primitive type. Depending on circumstances, it might be more beneficial to use the old `#[repr(type)]` syntax instead of `#[repr(discriminant = type)]` syntax, explained below: + +> Since `type` is resolved as a path instead of being a hard-coded value, it's possible to shadow existing primitive types and cause weird results. For example, if a scope contains `type u32 = u8`, then `#[repr(discriminant = u32)` means `#[repr(u8)]`, not `#[repr(u32)]`. You can always refer to these types directly, however, and `#[repr(discriminant = ::std::u32)]` will always mean `#[repr(u32)]`. +> +> It's also worth noting that this syntax allows using type aliases with names of other `repr` arguments, like `C`. For example, `type C = u8` in the scope means that `#[repr(discriminant = C)]` means `#[repr(u8)]`, not `#[repr(C)]`. + +You can use any type alias in the `repr` attribute, but it *must* be an alias to an accepted primitive type like `u8` or `i32`, and cannot be a pointer, reference, struct, etc. See the [future possibilities] section for some potential alternatives that may be allowed in the future. -You can use any type alias in the `repr` attribute, but it *must* be an alias to an accepted primitive type like `u8` or `i32`, and cannot be a pointer, reference, struct, etc. +[future possibilities]: #Future_possibilities # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -The `repr` attribute now accepts a `type = ...` argument to indicate a resolved path instead of a well-known primitive type. If those the path resolves to a type alias to a valid primitive type which can be used in the `repr` attribute, that will be used as the actual discriminant representation. +The `repr` attribute now accepts a `discriminant = ...` argument to indicate a resolved path instead of a well-known primitive type. If those the path resolves to a type alias to a valid primitive type which can be used in the `repr` attribute, that will be used as the actual discriminant representation. -An additional, automatically-applicable lint should be added that warns a user if they use `type = ...` for a well-known primitive type, since adding `type = ` instead of using the type directly introduces the possibility of shadowing. (For example, `#[repr(type = u32)]` becomes `#[repr(u32)]`.) +An automatically-applicable lint should be added that warns a user if a `repr` argument is invalid but references in an-scope type alias without the `discriminant = ` prefix. (For example, `#[repr(MyType)]` becomes `#[repr(discriminant = MyType)]`, but `#[repr(C)]` does not warn even if an alias `C` is in scope.) -Similarly, an automatically-applicable lint should be added that warns a user if a `repr` argument references in an-scope type alias without the `type = ` prefix. (For example, `#[repr(MyType)]` becomes `#[repr(type = MyType)]`.) +To aid macro authors, an allow-by-default lint should be added that warns a user if they use `discriminant = ...` with a relative path instead of an absolute one, since the resolved type depends on the existing scope. For example, `discriminant = u32` or `discriminant = my_type` could potentially be shadowed by another type in the scope, but `discriminant = ::std::u32` and `discriminant = $crate::my_type` will always work as expected. This could potentially be relegated to a `clippy::pedantic` lint instead of a `rustc` lint. # Drawbacks [drawbacks]: #drawbacks -The requirement for `type =` is unfortunate, but it feels like the best way to ensure that adding new representations isn't a breaking change going forward. Even if we were to decide it weren't a "breaking change," it would still break things anyway, being de-facto breaking. +The requirement for `discriminant =` is unfortunate, but it feels like the best way to ensure that adding new representations isn't a breaking change going forward. Even if we were to decide it weren't a "breaking change," it would still break things anyway, being de-facto breaking. And, of course, this complicates the compiler. But that's about it. @@ -49,6 +57,10 @@ We could always not do this. But more realistically, here are some alternative designs that were rejected. +## `type` + +The second proposal for this RFC used `type = ...` instead of `discriminant = ...`. Initially, this decision was chosen because `discriminant` was long to type, but it was ultimately decided that `discriminant` is more clear and that the extra typing is worth it. Additionally, RFC [#3607] proposes using `discriminant` in its proposed syntax, so, this would be further in line with that as well. + ## `self::` We could, instead of using `type =`, require that all types contain a double-colon to indicate they're a path, effectively preventing collisions with arguments that aren't paths. This would require using `self::` for types that are imported in the local scope, and was actually the first proposal of this RFC, but wasn't very well-received. @@ -79,6 +91,12 @@ None currently. # Future possibilities [future-possibilities]: #future-possibilities -Future RFCs like [#3607] propose explicit methods of obtaining enum discriminants, and that further justifies the desire to include a change like this. There aren't many other extensions that could be added, however. +In the future, more potential discriminant types could be added which might be useful: + +* `repr(transparent)` structs over valid types +* Floating-point primitives +* `char` or other restricted primitives like `NonZeroU*` and `NonZeroI*` + +Of course, all of these would require larger changes to the compiler than the ones proposed, and are currently left as future extensions. [#3607]: https://github.com/rust-lang/rfcs/pull/3607 From d20d743301ee1919dfe81d1d7477b657193936b7 Mon Sep 17 00:00:00 2001 From: ltdk Date: Wed, 3 Jul 2024 16:43:54 -0400 Subject: [PATCH 5/9] Fix editing mistake from last revision; add note about proc macros --- text/0000-repr-type-aliases.md | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/text/0000-repr-type-aliases.md b/text/0000-repr-type-aliases.md index 909d2290b80..4da0f317d53 100644 --- a/text/0000-repr-type-aliases.md +++ b/text/0000-repr-type-aliases.md @@ -24,7 +24,7 @@ Enums allow `#[repr(discriminant = type)]` attributes to offer an explicit discr [#3607]: https://github.com/rust-lang/rfcs/pull/3607 -To ensure compatibility, the `#[repr(discriminant = type)]` form is required if the type is a type alias instead of a known primitive type. Depending on circumstances, it might be more beneficial to use the old `#[repr(type)]` syntax instead of `#[repr(discriminant = type)]` syntax, explained below: +To ensure compatibility, the `#[repr(discriminant = type)]` form is required if the type is a type alias instead of a known primitive type. While this may cause additional issues compared to the native `#[repr(type)]` syntax, it is possible to work around these issues as explained below: > Since `type` is resolved as a path instead of being a hard-coded value, it's possible to shadow existing primitive types and cause weird results. For example, if a scope contains `type u32 = u8`, then `#[repr(discriminant = u32)` means `#[repr(u8)]`, not `#[repr(u32)]`. You can always refer to these types directly, however, and `#[repr(discriminant = ::std::u32)]` will always mean `#[repr(u32)]`. > @@ -37,7 +37,9 @@ You can use any type alias in the `repr` attribute, but it *must* be an alias to # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -The `repr` attribute now accepts a `discriminant = ...` argument to indicate a resolved path instead of a well-known primitive type. If those the path resolves to a type alias to a valid primitive type which can be used in the `repr` attribute, that will be used as the actual discriminant representation. +The `repr` attribute now accepts a `discriminant = ...` argument to indicate an arbitrary type instead of a well-known primitive type. It accepts both fully-resolved types (including all necessary generics) as well as macros that expand to types. + +If the resulting type resolves to a type alias to a valid primitive type which can be used in the `repr` attribute, that will be used as the actual discriminant representation. Otherwise, the result is an error. An automatically-applicable lint should be added that warns a user if a `repr` argument is invalid but references in an-scope type alias without the `discriminant = ` prefix. (For example, `#[repr(MyType)]` becomes `#[repr(discriminant = MyType)]`, but `#[repr(C)]` does not warn even if an alias `C` is in scope.) @@ -46,9 +48,17 @@ To aid macro authors, an allow-by-default lint should be added that warns a user # Drawbacks [drawbacks]: #drawbacks -The requirement for `discriminant =` is unfortunate, but it feels like the best way to ensure that adding new representations isn't a breaking change going forward. Even if we were to decide it weren't a "breaking change," it would still break things anyway, being de-facto breaking. +The requirement for `discriminant =` is unfortunate, but it feels like the best way to ensure that adding new representations isn't a breaking change going forward. Even if we were to decide it weren't a "breaking change," it would still break things anyway, being de-facto breaking. Some people have expressed desire for a dedicated syntax instead, which would fix this problem, but this can still be added after the fact. + +Some proc macros like [`zerocopy::FromBytes`] may also have to complicate their logic if they depend on checking the existing `#[repr]` attribute for validity. In particular, since they can no longer exactly know which primitive type is being used for the representation, they would have to instead depend on associated constants on the types like `BITS` to verify which type is being used. Instead of being able to emit an error at macro expansion time, this error will have to be triggered at constant evaluation time or runtime instead, which is unfortunate. + +Complicating proc macros in this way is considered acceptable for the following reasons: + +* Since errors can still be included in constants now that `const` panics are stable, errors can still be caught at compile time. +* Users of special `repr` attributes are more likely to have more knowledge about specifics of how these proc macros work, and are thus more equipped to deal with weird errors like these. +* Ultimately, the benefits of adding the feature outweigh this negative. -And, of course, this complicates the compiler. But that's about it. +[`zerocopy::FromBytes`]: https://docs.rs/zerocopy/latest/zerocopy/derive.FromBytes.html # Rationale and alternatives [rationale-and-alternatives]: #rationale-and-alternatives From 07c2d94a01e8f23fff0d8cbc6c77eb352f3d2ec2 Mon Sep 17 00:00:00 2001 From: ltdk Date: Wed, 3 Jul 2024 16:45:18 -0400 Subject: [PATCH 6/9] Remove half-hearted capital letters option --- text/0000-repr-type-aliases.md | 4 ---- 1 file changed, 4 deletions(-) diff --git a/text/0000-repr-type-aliases.md b/text/0000-repr-type-aliases.md index 4da0f317d53..b015a9ff5d9 100644 --- a/text/0000-repr-type-aliases.md +++ b/text/0000-repr-type-aliases.md @@ -84,10 +84,6 @@ In future editions, we could either: * Let type aliases shadow all valid representations. This isn't ideal since there is no way to override the shadowing besides nesting your code in a new module and then re-exporting it outside that module, which is very messy. * Expand the list of unshadowable representations every edition where necessary. -## Capital letters - -You could require that the types start with capital letters- oh, right, `repr(C)` is a thing. - # Prior art [prior-art]: #prior-art From b12353500e3fab3f6a22fc9ad402b1a0bd9140c1 Mon Sep 17 00:00:00 2001 From: ltdk Date: Wed, 3 Jul 2024 16:46:53 -0400 Subject: [PATCH 7/9] Mention former RFC (#1605) --- text/0000-repr-type-aliases.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/text/0000-repr-type-aliases.md b/text/0000-repr-type-aliases.md index b015a9ff5d9..57fb5ee3dd1 100644 --- a/text/0000-repr-type-aliases.md +++ b/text/0000-repr-type-aliases.md @@ -87,7 +87,9 @@ In future editions, we could either: # Prior art [prior-art]: #prior-art -No known prior art exists. +This was previously suggested in [#1605] and was rejected at the time. The primary reasoning at the time was due to the lack of advanced syntax in attributes which were able to support this feature, which is no longer the case; the compiler has improved a lot in 8 years! + +[#1605]: https://github.com/rust-lang/rfcs/pull/1605 # Unresolved questions [unresolved-questions]: #unresolved-questions From 9b22aba9e50813d3baf1c8c9a0e675f6ed760580 Mon Sep 17 00:00:00 2001 From: ltdk Date: Wed, 3 Jul 2024 16:48:51 -0400 Subject: [PATCH 8/9] Mention the contents of the error, and StructuralEq as a future extension --- text/0000-repr-type-aliases.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/text/0000-repr-type-aliases.md b/text/0000-repr-type-aliases.md index 57fb5ee3dd1..c7726b8b1b2 100644 --- a/text/0000-repr-type-aliases.md +++ b/text/0000-repr-type-aliases.md @@ -39,7 +39,7 @@ You can use any type alias in the `repr` attribute, but it *must* be an alias to The `repr` attribute now accepts a `discriminant = ...` argument to indicate an arbitrary type instead of a well-known primitive type. It accepts both fully-resolved types (including all necessary generics) as well as macros that expand to types. -If the resulting type resolves to a type alias to a valid primitive type which can be used in the `repr` attribute, that will be used as the actual discriminant representation. Otherwise, the result is an error. +If the resulting type resolves to a type alias to a valid primitive type which can be used in the `repr` attribute, that will be used as the actual discriminant representation. Otherwise, the result is an error which states that only primitive types and type aliases to them are allowed. An automatically-applicable lint should be added that warns a user if a `repr` argument is invalid but references in an-scope type alias without the `discriminant = ` prefix. (For example, `#[repr(MyType)]` becomes `#[repr(discriminant = MyType)]`, but `#[repr(C)]` does not warn even if an alias `C` is in scope.) @@ -104,6 +104,7 @@ In the future, more potential discriminant types could be added which might be u * `repr(transparent)` structs over valid types * Floating-point primitives * `char` or other restricted primitives like `NonZeroU*` and `NonZeroI*` +* Arbitrary `StructuralEq` types Of course, all of these would require larger changes to the compiler than the ones proposed, and are currently left as future extensions. From b71923a417673ab1a7d6204a6dc23a52450b6d75 Mon Sep 17 00:00:00 2001 From: ltdk Date: Mon, 22 Jul 2024 17:55:54 -0400 Subject: [PATCH 9/9] discriminant => tag per lang team decision --- text/0000-repr-type-aliases.md | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-) diff --git a/text/0000-repr-type-aliases.md b/text/0000-repr-type-aliases.md index c7726b8b1b2..45251b6cd2c 100644 --- a/text/0000-repr-type-aliases.md +++ b/text/0000-repr-type-aliases.md @@ -6,7 +6,7 @@ # Summary [summary]: #summary -Primitive representations on enums now accept type aliases, meaning that in addition to primitives like `#[repr(u32)]`, `#[repr(discriminant = core::ffi::c_int)]` and `#[repr(discriminant = my_type)]` are now accepted. +Primitive representations on enums now accept type aliases, meaning that in addition to primitives like `#[repr(u32)]`, `#[repr(tag = core::ffi::c_int)]` and `#[repr(tag = my_type)]` are now accepted. # Motivation [motivation]: #motivation @@ -20,15 +20,15 @@ For the same reasons why type aliases are useful, having type aliases in `repr` # Guide-level explanation [guide-level-explanation]: #guide-level-explanation -Enums allow `#[repr(discriminant = type)]` attributes to offer an explicit discriminant type. (`type` can be any primitive integer type, like `u8`, `i32`, or `usize`, but not `char`.) If all variants of the enum are unit variants, this means that the enum will be easily castable to `type` using `as`. Otherwise, the discriminant will still be of the specified type, but unsafe code is required to actually access it. (Future RFCs like [#3607] might change this.) +Enums allow `#[repr(tag = type)]` attributes to explicitly tag variants using the given type. (`type` can be any primitive integer type, like `u8`, `i32`, or `usize`, but not `char`.) If all variants of the enum are unit variants, this means that the enum will be easily castable to `type` using `as`. Otherwise, the discriminant will still be a tag of the specified type, but unsafe code is required to actually access it. (Future RFCs like [#3607] might change this.) [#3607]: https://github.com/rust-lang/rfcs/pull/3607 -To ensure compatibility, the `#[repr(discriminant = type)]` form is required if the type is a type alias instead of a known primitive type. While this may cause additional issues compared to the native `#[repr(type)]` syntax, it is possible to work around these issues as explained below: +To ensure compatibility, the `#[repr(tag = type)]` form is required if the type is a type alias instead of a known primitive type. While this may cause additional issues compared to the native `#[repr(type)]` syntax, it is possible to work around these issues as explained below: -> Since `type` is resolved as a path instead of being a hard-coded value, it's possible to shadow existing primitive types and cause weird results. For example, if a scope contains `type u32 = u8`, then `#[repr(discriminant = u32)` means `#[repr(u8)]`, not `#[repr(u32)]`. You can always refer to these types directly, however, and `#[repr(discriminant = ::std::u32)]` will always mean `#[repr(u32)]`. +> Since `type` is resolved as a path instead of being a hard-coded value, it's possible to shadow existing primitive types and cause weird results. For example, if a scope contains `type u32 = u8`, then `#[repr(tag = u32)` means `#[repr(u8)]`, not `#[repr(u32)]`. You can always refer to these types directly, however, and `#[repr(tag = ::std::u32)]` will always mean `#[repr(u32)]`. > -> It's also worth noting that this syntax allows using type aliases with names of other `repr` arguments, like `C`. For example, `type C = u8` in the scope means that `#[repr(discriminant = C)]` means `#[repr(u8)]`, not `#[repr(C)]`. +> It's also worth noting that this syntax allows using type aliases with names of other `repr` arguments, like `C`. For example, `type C = u8` in the scope means that `#[repr(tag = C)]` means `#[repr(u8)]`, not `#[repr(C)]`. You can use any type alias in the `repr` attribute, but it *must* be an alias to an accepted primitive type like `u8` or `i32`, and cannot be a pointer, reference, struct, etc. See the [future possibilities] section for some potential alternatives that may be allowed in the future. @@ -37,18 +37,18 @@ You can use any type alias in the `repr` attribute, but it *must* be an alias to # Reference-level explanation [reference-level-explanation]: #reference-level-explanation -The `repr` attribute now accepts a `discriminant = ...` argument to indicate an arbitrary type instead of a well-known primitive type. It accepts both fully-resolved types (including all necessary generics) as well as macros that expand to types. +The `repr` attribute now accepts a `tag = ...` argument to indicate an arbitrary type instead of a well-known primitive type. It accepts both fully-resolved types (including all necessary generics) as well as macros that expand to types. -If the resulting type resolves to a type alias to a valid primitive type which can be used in the `repr` attribute, that will be used as the actual discriminant representation. Otherwise, the result is an error which states that only primitive types and type aliases to them are allowed. +If the resulting type resolves to a type alias to a valid primitive type which can be used in the `repr` attribute, that will be used as a tag for the discriminant representation. Otherwise, the result is an error which states that only primitive integer types and type aliases to them are allowed. -An automatically-applicable lint should be added that warns a user if a `repr` argument is invalid but references in an-scope type alias without the `discriminant = ` prefix. (For example, `#[repr(MyType)]` becomes `#[repr(discriminant = MyType)]`, but `#[repr(C)]` does not warn even if an alias `C` is in scope.) +An automatically-applicable lint should be added that warns a user if a `repr` argument is invalid but references in an-scope type alias without the `tag = ` prefix. (For example, `#[repr(MyType)]` becomes `#[repr(tag = MyType)]`, but `#[repr(C)]` does not warn even if an alias `C` is in scope.) -To aid macro authors, an allow-by-default lint should be added that warns a user if they use `discriminant = ...` with a relative path instead of an absolute one, since the resolved type depends on the existing scope. For example, `discriminant = u32` or `discriminant = my_type` could potentially be shadowed by another type in the scope, but `discriminant = ::std::u32` and `discriminant = $crate::my_type` will always work as expected. This could potentially be relegated to a `clippy::pedantic` lint instead of a `rustc` lint. +To aid macro authors, an allow-by-default lint should be added that warns a user if they use `tag = ...` with a relative path instead of an absolute one, since the resolved type depends on the existing scope. For example, `tag = u32` or `tag = my_type` could potentially be shadowed by another type in the scope, but `tag = ::std::u32` and `tag = $crate::my_type` will always work as expected. This could potentially be relegated to a `clippy::pedantic` lint instead of a `rustc` lint. # Drawbacks [drawbacks]: #drawbacks -The requirement for `discriminant =` is unfortunate, but it feels like the best way to ensure that adding new representations isn't a breaking change going forward. Even if we were to decide it weren't a "breaking change," it would still break things anyway, being de-facto breaking. Some people have expressed desire for a dedicated syntax instead, which would fix this problem, but this can still be added after the fact. +The requirement for `tag =` is unfortunate, but it feels like the best way to ensure that adding new representations isn't a breaking change going forward. Even if we were to decide it weren't a "breaking change," it would still break things anyway, being de-facto breaking. Some people have expressed desire for a dedicated syntax instead, which would fix this problem, but this can still be added after the fact. Some proc macros like [`zerocopy::FromBytes`] may also have to complicate their logic if they depend on checking the existing `#[repr]` attribute for validity. In particular, since they can no longer exactly know which primitive type is being used for the representation, they would have to instead depend on associated constants on the types like `BITS` to verify which type is being used. Instead of being able to emit an error at macro expansion time, this error will have to be triggered at constant evaluation time or runtime instead, which is unfortunate. @@ -67,9 +67,13 @@ We could always not do this. But more realistically, here are some alternative designs that were rejected. -## `type` +## `type`, `discriminant` -The second proposal for this RFC used `type = ...` instead of `discriminant = ...`. Initially, this decision was chosen because `discriminant` was long to type, but it was ultimately decided that `discriminant` is more clear and that the extra typing is worth it. Additionally, RFC [#3607] proposes using `discriminant` in its proposed syntax, so, this would be further in line with that as well. +The previous proposals for this RFC used `type = ...` and `discriminant = ...` instead of `tag = ...`. Initially, this decision was chosen because `discriminant` was long to type, but it was ultimately decided that `discriminant` is more clear and that the extra typing is worth it. However, the lang team was swayed by the brevity of `tag` and, because this representation is explicitly a tagged discriminant, the shorter `tag` seems most fitting. + +RFC [#3607] is considering `discriminant` but may be moving to `tag` also, so, it's worth considering the outcome of that RFC as well. + +Note that `tag` also works because alternative representations could be considered. For example, an enum of two other enums with disjoint values could have a representation that avoids tags altogether, while still having the same "discriminant type" of the two individual enums. ## `self::` @@ -94,7 +98,7 @@ This was previously suggested in [#1605] and was rejected at the time. The prima # Unresolved questions [unresolved-questions]: #unresolved-questions -None currently. +* Is `tag` really the best name, or should it be `discriminant` instead? # Future possibilities [future-possibilities]: #future-possibilities