Commit 3600533

Merge pull request #153 from RalfJung/uninit
update uninit section to MaybeUninit
2 parents 0f469dc + 9fae750 commit 3600533

1 file changed: +120 -62 lines changed
src/unchecked-uninit.md

Lines changed: 120 additions & 62 deletions
@@ -8,25 +8,89 @@ Unfortunately this is pretty rigid, especially if you need to initialize your
 array in a more incremental or dynamic way.
 
 Unsafe Rust gives us a powerful tool to handle this problem:
-[`mem::uninitialized`][uninitialized]. This function pretends to return a value
-when really it does nothing at all. Using it, we can convince Rust that we have
-initialized a variable, allowing us to do trickier things with conditional and
-incremental initialization.
-
-Unfortunately, this opens us up to all kinds of problems. Assignment has a
-different meaning to Rust based on whether it believes that a variable is
-initialized or not. If it's believed uninitialized, then Rust will semantically
-just memcopy the bits over the uninitialized ones, and do nothing else. However
-if Rust believes a value to be initialized, it will try to `Drop` the old value!
-Since we've tricked Rust into believing that the value is initialized, we can no
-longer safely use normal assignment.
-
-This is also a problem if you're working with a raw system allocator, which
-returns a pointer to uninitialized memory.
-
-To handle this, we must use the [`ptr`] module. In particular, it provides
-three functions that allow us to assign bytes to a location in memory without
-dropping the old value: [`write`], [`copy`], and [`copy_nonoverlapping`].
+[`MaybeUninit`]. This type can be used to handle memory that has not been fully
+initialized yet.
+
+With `MaybeUninit`, we can initialize an array element-for-element as follows:
+
+```rust
+use std::mem::{self, MaybeUninit};
+
+// Size of the array is hard-coded but easy to change (meaning, changing just
+// the constant is sufficient). This means we can't use [a, b, c] syntax to
+// initialize the array, though, as we would have to keep that in sync
+// with `SIZE`!
+const SIZE: usize = 10;
+
+let x = {
+    // Create an uninitialized array of `MaybeUninit`. The `assume_init` is
+    // safe because the type we are claiming to have initialized here is a
+    // bunch of `MaybeUninit`s, which do not require initialization.
+    let mut x: [MaybeUninit<Box<u32>>; SIZE] = unsafe {
+        MaybeUninit::uninit().assume_init()
+    };
+
+    // Dropping a `MaybeUninit` does nothing. Thus using raw pointer
+    // assignment instead of `ptr::write` does not cause the old
+    // uninitialized value to be dropped.
+    // Exception safety is not a concern because Box can't panic
+    for i in 0..SIZE {
+        x[i] = MaybeUninit::new(Box::new(i as u32));
+    }
+
+    // Everything is initialized. Transmute the array to the
+    // initialized type.
+    unsafe { mem::transmute::<_, [Box<u32>; SIZE]>(x) }
+};
+
+dbg!(x);
+```
+
+This code proceeds in three steps:
+
+1. Create an array of `MaybeUninit<T>`. With current stable Rust, we have to use
+   unsafe code for this: we take some uninitialized piece of memory
+   (`MaybeUninit::uninit()`) and claim we have fully initialized it
+   ([`assume_init()`][assume_init]). This seems ridiculous, because we didn't!
+   The reason this is correct is that the array itself consists entirely of
+   `MaybeUninit`s, which do not actually require initialization. For most other
+   types, doing `MaybeUninit::uninit().assume_init()` produces an invalid
+   instance of said type, so you got yourself some Undefined Behavior.
+
+2. Initialize the array. The subtle aspect of this is that usually, when we use
+   `=` to assign to a value that the Rust type checker considers to already be
+   initialized (like `x[i]`), the old value stored on the left-hand side gets
+   dropped. This would be a disaster. However, in this case, the type of the
+   left-hand side is `MaybeUninit<Box<u32>>`, and dropping that does not do
+   anything! See below for some more discussion of this `drop` issue.
+
+3. Finally, we have to change the type of our array to remove the
+   `MaybeUninit`. With current stable Rust, this requires a `transmute`.
+   This transmute is legal because in memory, `MaybeUninit<T>` looks the same as `T`.
+
+   However, note that in general, `Container<MaybeUninit<T>>` does *not* look
+   the same as `Container<T>`! Imagine that `Container` is `Option` and `T` is
+   `bool`: then `Option<bool>` exploits that `bool` only has two valid values,
+   but `Option<MaybeUninit<bool>>` cannot do that because the `bool` does not
+   have to be initialized.
+
+   So, it depends on `Container` whether transmuting away the `MaybeUninit` is
+   allowed. For arrays, it is (and eventually the standard library will
+   acknowledge that by providing appropriate methods).
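
The layout point in step 3 can be checked with `size_of`. The following is a minimal sketch, not part of the commit; `option_sizes` and `array_sizes` are hypothetical helper names introduced only for illustration. With current rustc the `Option` sizes come out as 1 and 2 bytes, but only the equality of `MaybeUninit<T>` and `T` is a documented guarantee.

```rust
use std::mem::{size_of, MaybeUninit};

/// Hypothetical helper: byte sizes of `Option<bool>` and
/// `Option<MaybeUninit<bool>>`.
fn option_sizes() -> (usize, usize) {
    (
        size_of::<Option<bool>>(),
        size_of::<Option<MaybeUninit<bool>>>(),
    )
}

/// Hypothetical helper: byte sizes of the array type with and without
/// the `MaybeUninit` wrapper around each element.
fn array_sizes() -> (usize, usize) {
    (
        size_of::<[Box<u32>; 10]>(),
        size_of::<[MaybeUninit<Box<u32>>; 10]>(),
    )
}

fn main() {
    // `Option<bool>` can reuse an invalid `bool` bit pattern for `None`;
    // `MaybeUninit<bool>` has no invalid bit patterns to reuse.
    let (plain, wrapped) = option_sizes();
    println!("Option<bool>: {plain}, Option<MaybeUninit<bool>>: {wrapped}");

    // For arrays the two layouts agree, which is what the transmute relies on.
    let (init, uninit) = array_sizes();
    println!("[Box<u32>; 10]: {init}, [MaybeUninit<Box<u32>>; 10]: {uninit}");
}
```

The array case matches because `MaybeUninit<T>` has the same size and alignment as `T` and an array adds nothing around its elements.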
+
+It's worth spending a bit more time on the loop in the middle, and in particular
+the assignment operator and its interaction with `drop`. If we had
+written something like
+```rust,ignore
+*x[i].as_mut_ptr() = Box::new(i as u32); // WRONG!
+```
+we would actually overwrite a `Box<u32>`, leading to `drop` of uninitialized
+data, which will cause much sadness and pain.
+
+The correct alternative, if for some reason we cannot use `MaybeUninit::new`, is
+to use the [`ptr`] module. In particular, it provides three functions that allow
+us to assign bytes to a location in memory without dropping the old value:
+[`write`], [`copy`], and [`copy_nonoverlapping`].
 
 * `ptr::write(ptr, val)` takes a `val` and moves it into the address pointed
   to by `ptr`.
@@ -40,59 +104,53 @@ dropping the old value: [`write`], [`copy`], and [`copy_nonoverlapping`].
 It should go without saying that these functions, if misused, will cause serious
 havoc or just straight up Undefined Behavior. The only things that these
 functions *themselves* require is that the locations you want to read and write
-are allocated. However the ways writing arbitrary bits to arbitrary
-locations of memory can break things are basically uncountable!
-
-Putting this all together, we get the following:
-
-```rust
-use std::mem;
-use std::ptr;
-
-// size of the array is hard-coded but easy to change. This means we can't
-// use [a, b, c] syntax to initialize the array, though!
-const SIZE: usize = 10;
-
-let mut x: [Box<u32>; SIZE];
-
-unsafe {
-    // convince Rust that x is Totally Initialized
-    x = mem::uninitialized();
-    for i in 0..SIZE {
-        // very carefully overwrite each index without reading it
-        // NOTE: exception safety is not a concern; Box can't panic
-        ptr::write(&mut x[i], Box::new(i as u32));
-    }
-}
-
-println!("{:?}", x);
-```
+are allocated and properly aligned. However, the ways writing arbitrary bits to
+arbitrary locations of memory can break things are basically uncountable!
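
The `ptr::write` route described above can be sketched as follows. This is an illustration rather than code from the commit: `boxed_indices` is a hypothetical helper name, and the array setup and final transmute simply reuse the pattern from the `MaybeUninit` example earlier in this diff.

```rust
use std::mem::{self, MaybeUninit};
use std::ptr;

const SIZE: usize = 4;

/// Hypothetical helper: builds `[Box<u32>; SIZE]` by writing each element
/// through a raw pointer instead of assigning a `MaybeUninit::new(..)`.
fn boxed_indices() -> [Box<u32>; SIZE] {
    // An array of `MaybeUninit` needs no initialization itself.
    let mut x: [MaybeUninit<Box<u32>>; SIZE] =
        unsafe { MaybeUninit::uninit().assume_init() };

    for (i, slot) in x.iter_mut().enumerate() {
        // `as_mut_ptr` yields a `*mut Box<u32>`. `ptr::write` moves the new
        // box into that location without reading or dropping the old bytes.
        unsafe { ptr::write(slot.as_mut_ptr(), Box::new(i as u32)) };
    }

    // Every slot has been written, so the array can change type.
    unsafe { mem::transmute::<_, [Box<u32>; SIZE]>(x) }
}

fn main() {
    let x = boxed_indices();
    println!("{:?}", x);
}
```

The pointer comes from `as_mut_ptr` on the `MaybeUninit` slot, so no reference to an uninitialized `Box<u32>` is ever created.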
 
 It's worth noting that you don't need to worry about `ptr::write`-style
 shenanigans with types which don't implement `Drop` or contain `Drop` types,
-because Rust knows not to try to drop them. Similarly you should be able to
-assign to fields of partially initialized structs directly if those fields don't
-contain any `Drop` types.
+because Rust knows not to try to drop them. This is what we relied on in the
+above example.
 
 However when working with uninitialized memory you need to be ever-vigilant for
 Rust trying to drop values you make like this before they're fully initialized.
 Every control path through that variable's scope must initialize the value
 before it ends, if it has a destructor.
-*[This includes code panicking](unwinding.html)*.
-
-Not being careful about uninitialized memory often leads to bugs and it has been
-decided the [`mem::uninitialized`][uninitialized] function should be deprecated.
-The [`MaybeUninit`] type is supposed to replace it as its API wraps many common
-operations needed to be done around initialized memory. This is nightly only for
-now.
+*[This includes code panicking](unwinding.html)*. `MaybeUninit` helps a bit
+here, because it does not implicitly drop its content - but all this really
+means in case of a panic is that instead of a double-free of the not yet
+initialized parts, you end up with a memory leak of the already initialized
+parts.
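
The claim that `MaybeUninit` does not implicitly drop its content can be observed directly. Below is a small sketch under stated assumptions: `Noisy` and `drop_counts` are hypothetical names made up for this illustration, and the counter is a plain `Cell` so the example stays single-threaded.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;

/// Hypothetical type whose destructor bumps a shared counter.
struct Noisy<'a>(&'a Cell<usize>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Hypothetical helper: returns the destructor count after the wrapper goes
/// out of scope, and again after an explicitly unwrapped value is dropped.
fn drop_counts() -> (usize, usize) {
    let drops = Cell::new(0);

    {
        // The wrapper goes out of scope here, but `Noisy::drop` is not run:
        // the value is leaked rather than dropped.
        let _slot = MaybeUninit::new(Noisy(&drops));
    }
    let after_wrapper_drop = drops.get();

    // Taking the value out with `assume_init` hands responsibility back to
    // normal Rust ownership, so dropping it runs the destructor.
    let slot = MaybeUninit::new(Noisy(&drops));
    let value = unsafe { slot.assume_init() };
    drop(value);
    let after_explicit_drop = drops.get();

    (after_wrapper_drop, after_explicit_drop)
}

fn main() {
    let (after_wrapper, after_explicit) = drop_counts();
    println!("after wrapper drop: {after_wrapper}, after explicit drop: {after_explicit}");
}
```

The first count staying at zero is the leak the paragraph describes: a panic halfway through initialization leaves the finished elements undropped unless the code cleans them up itself.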
+
+Note that, to use the `ptr` methods, you need to first obtain a *raw pointer* to
+the data you want to initialize. It is illegal to construct a *reference* to
+uninitialized data, which implies that you have to be careful when obtaining
+said raw pointer:
+* For an array of `T`, you can use `base_ptr.add(idx)` where `base_ptr: *mut T`
+  to compute the address of array index `idx`. This relies on
+  how arrays are laid out in memory.
+* For a struct, however, in general we do not know how it is laid out, and we
+  also cannot use `&mut base_ptr.field` as that would be creating a
+  reference. Thus, it is currently not possible to create a raw pointer to a field
+  of a partially initialized struct, and also not possible to initialize a single
+  field of a partially initialized struct. (A
+  [solution to this problem](https://github.com/rust-lang/rfcs/pull/2582) is being
+  worked on.)
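
The array case in the first bullet can be sketched like this. It is a minimal illustration, not code from the commit; `init_in_place` and `LEN` are hypothetical names. Here the whole array sits inside a single `MaybeUninit`, so each element address has to be computed from a raw base pointer.

```rust
use std::mem::MaybeUninit;

const LEN: usize = 3;

/// Hypothetical helper: initializes `[String; LEN]` element by element
/// through a raw pointer, without creating a reference to a `String` that
/// has not been written yet.
fn init_in_place() -> [String; LEN] {
    let mut arr: MaybeUninit<[String; LEN]> = MaybeUninit::uninit();

    // `as_mut_ptr` gives `*mut [String; LEN]`; casting it to `*mut String`
    // gives the address of element 0.
    let base_ptr = arr.as_mut_ptr() as *mut String;

    for idx in 0..LEN {
        // `add(idx)` steps over `idx` elements; `write` moves the value in
        // without dropping whatever bytes were there before.
        unsafe { base_ptr.add(idx).write(format!("item {idx}")) };
    }

    // All `LEN` elements are initialized at this point.
    unsafe { arr.assume_init() }
}

fn main() {
    let arr = init_in_place();
    println!("{:?}", arr);
}
```

If the loop body panicked partway through, the strings already written would be leaked rather than dropped, since `MaybeUninit` never runs their destructors.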
+
+One last remark: when reading old Rust code, you might stumble upon the
+deprecated `mem::uninitialized` function. That function used to be the only way
+to deal with uninitialized memory on the stack, but it turned out to be
+impossible to properly integrate with the rest of the language. Always use
+`MaybeUninit` instead in new code, and port old code over when you get the
+opportunity.
 
 And that's about it for working with uninitialized memory! Basically nothing
 anywhere expects to be handed uninitialized memory, so if you're going to pass
 it around at all, be sure to be *really* careful.
 
-[uninitialized]: ../std/mem/fn.uninitialized.html
-[`ptr`]: ../std/ptr/index.html
-[`write`]: ../std/ptr/fn.write.html
-[`copy`]: ../std/ptr/fn.copy.html
-[`copy_nonoverlapping`]: ../std/ptr/fn.copy_nonoverlapping.html
-[`MaybeUninit`]: ../std/mem/union.MaybeUninit.html
+[`MaybeUninit`]: ../core/mem/union.MaybeUninit.html
+[assume_init]: ../core/mem/union.MaybeUninit.html#method.assume_init
+[`ptr`]: ../core/ptr/index.html
+[`write`]: ../core/ptr/fn.write.html
+[`copy`]: ../core/ptr/fn.copy.html
+[`copy_nonoverlapping`]: ../core/ptr/fn.copy_nonoverlapping.html
