@@ -7,10 +7,14 @@ and which parts are still in a "preliminary" state.
7
7
8
8
[ #10 ] : https://github.com/rust-rfcs/unsafe-code-guidelines/issues/10
9
9
10
- ## Background
10
+ ## Categories of enums
11
11
12
- ** C-like enums.** The simplest form of enum is simply a list of
13
- variants:
12
+ ** Empty enums.** Enums with no variants can never be instantiated and
13
+ are equivalent to the ` ! ` type. They do not accept any ` #[repr] `
14
+ annotations.
15
+
16
+ ** Fieldless enums.** The simplest form of enum is one where none of
17
+ the variants have any fields:
14
18
15
19
``` rust
16
20
enum SomeEnum {
@@ -19,13 +23,13 @@ enum SomeEnum {
19
23
Variant3 ,
20
24
```
21
25
22
- Such enums are called "C-like" because they correspond quite closely
23
- with enums in the C language (though there are important differences
24
- as well, covered later). Presuming that they have more than one
25
- variant, these sorts of enums are always represented as a simple integer,
26
- though the size will vary.
26
+ Such enums correspond quite closely with enums in the C language
27
+ (though there are important differences as well). Presuming that they
28
+ have more than one variant, these sorts of enums are always
29
+ represented as a simple integer, though the size will vary.
27
30
28
- C-like enums may also specify the value of their discriminants explicitly:
31
+ Fieldless enums may also specify the value of their discriminants
32
+ explicitly:
29
33
30
34
``` rust
31
35
enum SomeEnum {
@@ -51,17 +55,9 @@ enum Foo {
51
55
}
52
56
```
53
57
54
- ** Option-like enums.** As a special case of data-carrying enums, we
55
- identify "option-like" enums as enums where all of the variants but
56
- one have no fields, and one variant has a single field. The most
57
- common example is ` Option ` itself. In some cases, as described below,
58
- the compiler may apply special optimization rules to the layout of
59
- option-like enums. The ** payload** of an option-like enum is the value
60
- of that single field.
61
-
62
- ## Enums with a specified representation
58
+ ## repr annotations accepted on enums
63
59
64
- Enums may be annotation using the following ` #[repr] ` tags:
60
+ In general, enums may be annotated using the following ` #[repr] ` tags:
65
61
66
62
- A specific integer type (called ` Int ` as a shorthand below):
67
63
- ` #[repr(u8)] `
@@ -79,25 +75,36 @@ Enums may be annotation using the following `#[repr]` tags:
79
75
- ` #[repr(C, u16)] `
80
76
- etc
81
77
82
- We cover each of the categories below. The layout rules for enums with
83
- explicit ` #[repr] ` annotations are specified in [ RFC 2195 ] [ ] .
78
+ Note that manually specifying the alignment using ` #[repr(align)] ` is
79
+ not permitted on an enum.
84
80
85
- [ RFC 2195 ] : https://rust-lang.github.io/rfcs/2195-really-tagged-unions.html
81
+ The set of repr annotations accepted by an enum depends on its category,
82
+ as defined above:
83
+
84
+ - Empty enums: no repr annotations are permitted.
85
+ - Fieldless enums: ` #[repr(Int)] ` -style and ` #[repr(C)] ` annotations are permitted, but ` #[repr(C, Int)] ` annotations are not.
86
+ - Data-carrying enums: all repr annotations are permitted.
86
87
87
- ### Layout of an enum with no variants
88
+ ## Enum layout rules
88
89
89
- An enum with no variants can never be instantiated and is logically
90
- equivalent to the "never type" ` ! ` . Such enums are guaranteed to have
91
- the same layout as ` ! ` (zero size and alignment 1).
90
+ The rules for enum layout vary depending on the category.
92
91
93
- ### Layout of a C-like enum
92
+ ### Layout of an empty enum
94
93
95
- If there is no ` #[repr] ` attached to a C-like enum, it is guaranteed
96
- to be represented as an integer of sufficient size to store the
97
- discriminants for all possible variants. The size is selected by the
98
- compiler but must be at least a ` u8 ` .
94
+ An ** empty enum** is an enum with no variants; empty enums can never
95
+ be instantiated and are logically equivalent to the "never type"
96
+ ` ! ` . ` #[repr] ` annotations are not accepted on empty enums. Empty
97
+ enums are guaranteed to have the same layout as ` ! ` (zero size and
98
+ alignment 1).
99
99
100
- When a ` #[repr(Int)] ` -style annotation is attached to a C-like enum
100
+ ### Layout of a fieldless enum
101
+
102
+ If there is no ` #[repr] ` attached to a fieldless enum, it is
103
+ guaranteed to be represented as an integer of sufficient size to store
104
+ the discriminants for all possible variants. The size is selected by
105
+ the compiler but must be at least a ` u8 ` .
106
+
107
+ When a ` #[repr(Int)] ` -style annotation is attached to a fieldless enum
101
108
(one without any data for its variants), it will cause the enum to be
102
109
represented as a simple integer of the specified size ` Int ` . This must
103
110
be sufficient to store all the required discriminant values.
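Review note: the `#[repr(Int)]` rule for fieldless enums can be checked directly. The following is a minimal sketch, assuming a hypothetical enum named `Level` that does not appear in the guidelines text. It shows the enum occupying exactly one byte under `#[repr(u8)]`, and the discriminants matching the values written (or implied) in the definition.

```rust
use std::mem::size_of;

// Hypothetical enum, used only for illustration.
#[repr(u8)]
#[derive(Clone, Copy)]
enum Level {
    Low = 1,
    Mid = 5,
    High, // no explicit value: one more than the previous discriminant
}

fn main() {
    // `#[repr(u8)]` fixes the representation to a single `u8`.
    assert_eq!(size_of::<Level>(), size_of::<u8>());

    // Casting yields the discriminant values from the definition.
    assert_eq!(Level::Low as u8, 1);
    assert_eq!(Level::Mid as u8, 5);
    assert_eq!(Level::High as u8, 6);
}
```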
@@ -107,7 +114,7 @@ size as the C compiler would use for the given target for an
107
114
equivalent C-enum declaration.
108
115
109
116
Combining a ` C ` and ` Int ` representation (e.g., ` #[repr(C, u8)] ` ) is
110
- not permitted on a C-like enum.
117
+ not permitted on a fieldless enum.
111
118
112
119
The values used for the discriminant will match up with what is
113
120
specified (or automatically assigned) in the enum definition. For
@@ -128,12 +135,19 @@ enum Foo {
128
135
** Unresolved question:** What about platforms where ` -fshort-enums `
129
136
are the default? Do we know/care about that?
130
137
131
- ### Layout for enums that carry data
138
+ ### Layout of data-carrying enums with an explicit repr annotation
132
139
133
- For enums that carry data, the layout differs depending on whether
134
- C-compatibility is requested or not.
140
+ This section concerns data-carrying enums ** with an explicit repr
141
+ annotation of some form** . The memory layout of such cases was
142
+ specified in [RFC 2195][] and is therefore normative.
135
143
136
- #### Non-C-compatible layouts
144
+ [RFC 2195]: https://rust-lang.github.io/rfcs/2195-really-tagged-unions.html
145
+
146
+ The layout of data-carrying enums that do ** not** have an explicit
147
+ repr annotation is generally undefined, but with certain specific
148
+ exceptions: see the next section for details.
149
+
150
+ #### Non-C-compatible representation selected
137
151
138
152
When an enum is tagged with ` #[repr(Int)] ` for some integral type
139
153
` Int ` (e.g., ` #[repr(u8)] ` ), it will be represented as a C-union of a
@@ -176,15 +190,15 @@ Note that the `TwoCasesVariantA` and `TwoCasesVariantB` structs are
176
190
appears at offset 0 in both cases, so that we can read it to determine
177
191
the current variant.
178
192
179
- #### C-compatible layouts.
193
+ #### C-compatible representation selected
180
194
181
195
When the ` #[repr] ` tag includes ` C ` , e.g., ` #[repr(C)] ` or `#[ repr(C,
182
196
u8)] `, the layout of enums is changed to better match C++ enums. In
183
197
this mode, the data is laid out as a tuple of ` (discriminant, union) ` ,
184
198
where ` union ` represents a C union of all the possible variants. The
185
199
type of the discriminant will be the integral type specified (` u8 ` ,
186
200
etc) -- if no type is specified, then the compiler will select one
187
- based on what a size a C-like enum would have with the same number of
201
+ based on what size a fieldless enum would have with the same number of
188
202
variants.
189
203
190
204
This layout, while more compatible and arguably more obvious, is also
@@ -252,27 +266,26 @@ struct MyEnum {
252
266
};
253
267
```
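Review note: the two explicit-repr layouts described in this section can be compared side by side. The sketch below is illustrative only; the names `TagFirst`, `TagFirstRepr`, `VariantA`, `VariantB`, `Tag`, and `TagThenUnion` are hypothetical and not taken from the text. It spells out by hand the union-of-structs layout that RFC 2195 describes for `#[repr(u8)]`, and checks that the enum has the same size and alignment. It then shows that the `#[repr(C, u8)]` form of the same enum is at least as large, because the tag sits in front of a separately aligned union.

```rust
use std::mem::{align_of, size_of};

// Hypothetical data-carrying enum with a non-C-compatible repr.
#[allow(dead_code)]
#[repr(u8)]
enum TagFirst {
    A(u8, u16),
    B(u16),
}

// Hand-written equivalent: a union of `#[repr(C)]` structs,
// each of which starts with the tag.
#[allow(dead_code)]
#[repr(u8)]
#[derive(Clone, Copy)]
enum Tag {
    A,
    B,
}

#[allow(dead_code)]
#[repr(C)]
#[derive(Clone, Copy)]
struct VariantA(Tag, u8, u16);

#[allow(dead_code)]
#[repr(C)]
#[derive(Clone, Copy)]
struct VariantB(Tag, u16);

#[allow(dead_code)]
#[repr(C)]
union TagFirstRepr {
    a: VariantA,
    b: VariantB,
}

// Same variants, C-compatible repr: `(discriminant, union)`.
#[allow(dead_code)]
#[repr(C, u8)]
enum TagThenUnion {
    A(u8, u16),
    B(u16),
}

fn main() {
    // The enum and its hand-written equivalent agree on size and alignment.
    assert_eq!(size_of::<TagFirst>(), size_of::<TagFirstRepr>());
    assert_eq!(align_of::<TagFirst>(), align_of::<TagFirstRepr>());

    // The C-compatible form cannot pack the tag next to the first field.
    assert!(size_of::<TagThenUnion>() >= size_of::<TagFirst>());

    println!("repr(u8):    {} bytes", size_of::<TagFirst>());
    println!("repr(C, u8): {} bytes", size_of::<TagThenUnion>());
}
```

On targets where `u16` has alignment 2, the first form should occupy 4 bytes and the second 6, since the `u8` payload field in `A` can share the padding after the tag only in the `#[repr(u8)]` layout.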
254
268
255
- ## Enums without a specified representation
269
+ ### Layout of data-carrying enums without a repr annotation
270
+
271
+ If no explicit `#[repr]` attribute is used, then the layout of a
272
+ data-carrying enum is typically **not specified**. However, in certain
273
+ select cases, there are **guaranteed layout optimizations** that may
274
+ apply, as described below.
256
275
257
- If no explicit `#[repr]` attribute is used, then the layout of most
258
- enums is not specified, with one crucial exception: option-like enums
259
- may in some cases use a compact layout that is identical to their
260
- payload.
276
+ #### Discriminant elision on Option-like enums
261
277
262
278
(Meta-note: The content in this section is not described by any RFC
263
279
and is therefore "non-normative".)
264
280
265
- ### Discriminant elision on Option-like enums
281
+ **Definition.** An **option-like enum** is a 2-variant enum where:
266
282
267
- **Definition.** An **option-like enum** is an enum which has:
268
-
269
- - one variant with a single field,
270
- - other variants with no fields ("unit" variants).
283
+ - one variant has a single field, and
284
+ - the other variant has no fields (the "unit variant").
271
285
272
286
The simplest example is `Option<T>` itself, where the `Some` variant
273
287
has a single field (of type `T`), and the `None` variant has no
274
- fields. But other enums that fit that same template (and even enums
275
- that include multiple `None`-like fields) fit.
288
+ fields. Other enums that fit the same template also qualify.
276
289
277
290
**Definition.** The **payload** of an option-like enum is the single
278
291
field which it contains; in the case of `Option<T>`, the payload has
@@ -284,15 +297,17 @@ may never be NULL, and hence defines a niche consisting of the
284
297
bitstring `0`. Similarly, the standard library types [`NonZeroU8`]
285
298
and friends may never be zero, and hence also define the value of `0`
286
299
as a niche. (Types that define niche values will say so as part of the
287
- description of their representation invariant.)
300
+ description of their representation invariant, which -- as of this
301
+ writing -- is the next topic up for discussion in the unsafe code
302
+ guidelines process.)
288
303
289
304
[`NonZeroU8`]: https://doc.rust-lang.org/std/num/struct.NonZeroU8.html
290
305
291
- **Option-like enums where the payload defines an adequate number of
292
- niche values are guaranteed to be represented without using any
293
- discriminant at all .** This is called **discriminant elision**. If
294
- discriminant elision is in effect, then the layout of the enum is
295
- equal to the layout of its payload .
306
+ **Option-like enums where the payload defines at least one niche value
307
+ are guaranteed to be represented using the same memory layout as their
308
+ payload .** This is called **discriminant elision**, as there is no
309
+ explicit discriminant value stored anywhere. Instead, niche values are
310
+ used to represent the unit variant .
296
311
297
312
The most common example is that `Option<&u8>` can be represented as a
298
313
nullable `&u8` reference -- the `None` variant is then represented
@@ -313,64 +328,13 @@ a nullable pointer. FFI interop often depends on this property.
313
328
pointer (which is therefore equivalent to a C function pointer) . FFI
314
329
interop often depends on this property.
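Review note: discriminant elision is easy to observe with `size_of`. The sketch below uses only standard library types. It checks that `Option` around a reference, a `NonZeroU8`, and an `extern "C"` function pointer is no larger than the payload, and that `None` for `Option<&u8>` is the null pointer. The `u8` case is included as a contrast, since `u8` has no niche.

```rust
use std::mem::{size_of, transmute};
use std::num::NonZeroU8;

fn main() {
    // Payloads with a niche: the Option has the same size as the payload.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
    assert_eq!(size_of::<Option<NonZeroU8>>(), size_of::<NonZeroU8>());
    assert_eq!(
        size_of::<Option<extern "C" fn()>>(),
        size_of::<extern "C" fn()>()
    );

    // Every bit pattern of `u8` is valid, so a discriminant must be stored.
    assert!(size_of::<Option<u8>>() > size_of::<u8>());

    // `Some(&x)` is stored as the pointer itself, `None` as null.
    let x = 7u8;
    let some: Option<&u8> = Some(&x);
    let raw: *const u8 = unsafe { transmute(some) };
    assert_eq!(raw, &x as *const u8);

    let none: Option<&u8> = None;
    let raw: *const u8 = unsafe { transmute(none) };
    assert!(raw.is_null());
}
```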
315
330
316
- **Example.** Consider the following enum definitions:
331
+ **Example.** The following enum definition is **not** option-like,
332
+ as it has two unit variants:
317
333
318
334
```rust
319
335
enum Enum1<T> {
320
336
Present(T),
321
337
Absent1,
322
338
Absent2,
323
339
}
324
-
325
- enum Enum2 {
326
- A, B, C
327
- }
328
340
```
329
-
330
- ` Enum1<&u8> ` is not eligible for discriminant elision, since ` &u8 `
331
- defines a single niche value, but ` Enum1 ` has two unit
332
- variants. However, ` Enum2 ` has only three legal values (0 for ` A ` , 1
333
- for ` B ` , and 2 for ` C ` ), and hence defines a plethora of niche values[ ^ caveat ] .
334
- Therefore, ` Enum1<Enum2> ` is guaranteed to be laid out the same as
335
- ` Enum2 ` ([ consider the results of applying
336
- ` size_of ` ] ( https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=eadff247f2c5713b8f3b6c9cda297711 ) ).
337
-
338
- [^caveat]: Strictly speaking, niche values are considered part of the " representation invariant" for an enum and not its type. Therefore, this section is added only as a preview for future unsafe-code-guidelines discussion.
339
-
340
- ### Other optimizations
341
-
342
- The previous section specified a relatively narrow set of layout
343
- optimizations that are **guaranteed** by the compiler. However, the
344
- compiler is always free to perform **more** optimizations than this
345
- minimal set. For example, the compiler presently treats `Result<T,
346
- ()>` and `Option<T>` as equivalent, but this behavior is not
347
- guaranteed to continue as `Result<T, ()>` is not considered
348
- " option-like" .
349
-
350
- As of this writing, the compiler' s current behavior is to attempt to
351
- elide discriminants whenever possible. Furthermore, a variant whose
352
- only fields are of zero-size is considered a unit variant for this
353
- purpose. If eliding discriminants is not possible (e.g., because the
354
- payload does not define sufficient niche values), then the compiler
355
- will select an appropriate discriminant size `N` and use a
356
- representation roughly equivalent to `#[repr(N)]`, though without the
357
- strict `#[repr(C)]` guarantees on each struct. However, this behavior
358
- is not guaranteed to remain the same in future versions of the
359
- compiler and should not be relied upon. (While it is not expected that
360
- existing layout optimizations will be removed, it is possible -- it is
361
- also possible for the compiler to introduce new sorts of
362
- optimizations.)
363
-
364
- ## Niche values
365
-
366
- C-like enums with N variants and no specified representation are
367
- guaranteed to supply niche values corresponding to 256 - N (presuming
368
- that is a positive number). This is because a C-like enum must be
369
- represented using an integer and that integer must correspond to a
370
- valid variant: the precise size of C-like enums is not specified but
371
- it must be at least one byte, which means that there are at least 256
372
- possible bitstrings (only N of which are valid).
373
-
374
- Other enums -- or enums with a specified representation -- may supply
375
- niches if their representation invariant permits it, but that is not
376
- **guaranteed**.