Commit 4d52d7c

Auto merge of #27032 - Gankro:tarpl, r=aturon,acrichto,arielb,pnkfelix,nrc,nmatsakis,huonw
I've been baking this out of tree for long enough. This is currently about ~2/5ths the size of TRPL. Time to get it in tree so it can be more widely maintained and scrutinized. I've preserved the whole gruesome history, including various rewrites. I can definitely squash these a fair amount if desired. Some random people submitted minor fixes, though, so they're mixed in.

Edit: forgot to link to the rendered version: http://cglab.ca/~abeinges/blah/turpl/_book/

Edit2: To streamline the review process, I'm going to break this into sections that need official "domain expert" approval:

# Summary

* [ ] references.md -- very important, needs work
* [x] Meet Safe and Unsafe: reviewed by @aturon
* [x] Data Layout: reviewed by @arielb1
* [x] Ownership: reviewed by @aturon (and sorta @nikomatsakis) -- significantly updated, may need re-r
* [x] Conversions: reviewed by @nrc
* [x] Uninitialized Memory: reviewed by @pnkfelix
* [x] Ownership-Oriented Resource Management: reviewed by @aturon
* [x] Unwinding: reviewed by @alexcrichton
* [x] Concurrency: reviewed by @aturon
* [x] Implementing Vec: r? @huonw
2 parents 1867078 + ddb0290 commit 4d52d7c

55 files changed: +5442 −2 lines changed

mk/docs.mk

Lines changed: 8 additions & 1 deletion
@@ -77,7 +77,7 @@ ERR_IDX_GEN = $(RPATH_VAR2_T_$(CFG_BUILD)_H_$(CFG_BUILD)) $(ERR_IDX_GEN_EXE)
 
 D := $(S)src/doc
 
-DOC_TARGETS := trpl style error-index
+DOC_TARGETS := trpl tarpl style error-index
 COMPILER_DOC_TARGETS :=
 DOC_L10N_TARGETS :=
 
@@ -287,6 +287,13 @@ doc/book/index.html: $(RUSTBOOK_EXE) $(wildcard $(S)/src/doc/trpl/*.md) | doc/
	$(Q)rm -rf doc/book
	$(Q)$(RUSTBOOK) build $(S)src/doc/trpl doc/book
 
+tarpl: doc/adv-book/index.html
+
+doc/adv-book/index.html: $(RUSTBOOK_EXE) $(wildcard $(S)/src/doc/tarpl/*.md) | doc/
+	@$(call E, rustbook: $@)
+	$(Q)rm -rf doc/adv-book
+	$(Q)$(RUSTBOOK) build $(S)src/doc/tarpl doc/adv-book
+
 style: doc/style/index.html
 
 doc/style/index.html: $(RUSTBOOK_EXE) $(wildcard $(S)/src/doc/style/*.md) | doc/

mk/tests.mk

Lines changed: 2 additions & 1 deletion
@@ -162,7 +162,8 @@ $(foreach doc,$(DOCS), \
	$(eval $(call DOCTEST,md-$(doc),$(S)src/doc/$(doc).md)))
 $(foreach file,$(wildcard $(S)src/doc/trpl/*.md), \
	$(eval $(call DOCTEST,$(file:$(S)src/doc/trpl/%.md=trpl-%),$(file))))
-
+$(foreach file,$(wildcard $(S)src/doc/tarpl/*.md), \
+	$(eval $(call DOCTEST,$(file:$(S)src/doc/tarpl/%.md=tarpl-%),$(file))))
 ######################################################################
 # Main test targets
 ######################################################################

src/doc/tarpl/README.md

Lines changed: 39 additions & 0 deletions
% The Advanced Rust Programming Language

# NOTE: This is a draft document, and may contain serious errors

So you've played around with Rust a bit. You've written a few simple programs
and you think you grok the basics. Maybe you've even read through *[The Rust
Programming Language][trpl]*. Now you want to get neck-deep in all the
nitty-gritty details of the language. You want to know those weird corner
cases. You want to know what the heck `unsafe` really means, and how to
properly use it. This is the book for you.

To be clear, this book goes into *serious* detail. We're going to dig into
exception-safety and pointer aliasing. We're going to talk about memory models.
We're even going to do some type theory. This is stuff that you absolutely
*don't* need to know to write fast and safe Rust programs. You could probably
close this book *right now* and still have a productive and happy career in
Rust.

However, if you intend to write unsafe code -- or just *really* want to dig
into the guts of the language -- this book contains *invaluable* information.

Unlike *The Rust Programming Language*, we *will* be assuming considerable
prior knowledge. In particular, you should be comfortable with:

* Basic systems programming:
    * Pointers
    * [The stack and heap][]
    * The memory hierarchy (caches)
    * Threads

* [Basic Rust][]

Due to the nature of advanced Rust programming, we will be spending a lot of
time talking about *safety* and *guarantees*. In particular, a significant
portion of the book will be dedicated to correctly writing and understanding
Unsafe Rust.

[trpl]: ../book/
[The stack and heap]: ../book/the-stack-and-the-heap.html
[Basic Rust]: ../book/syntax-and-semantics.html

src/doc/tarpl/SUMMARY.md

Lines changed: 53 additions & 0 deletions
# Summary

* [Meet Safe and Unsafe](meet-safe-and-unsafe.md)
    * [How Safe and Unsafe Interact](safe-unsafe-meaning.md)
    * [Working with Unsafe](working-with-unsafe.md)
* [Data Layout](data.md)
    * [repr(Rust)](repr-rust.md)
    * [Exotically Sized Types](exotic-sizes.md)
    * [Other reprs](other-reprs.md)
* [Ownership](ownership.md)
    * [References](references.md)
    * [Lifetimes](lifetimes.md)
    * [Limits of lifetimes](lifetime-mismatch.md)
    * [Lifetime Elision](lifetime-elision.md)
    * [Unbounded Lifetimes](unbounded-lifetimes.md)
    * [Higher-Rank Trait Bounds](hrtb.md)
    * [Subtyping and Variance](subtyping.md)
    * [Drop Check](dropck.md)
    * [PhantomData](phantom-data.md)
    * [Splitting Borrows](borrow-splitting.md)
* [Type Conversions](conversions.md)
    * [Coercions](coercions.md)
    * [The Dot Operator](dot-operator.md)
    * [Casts](casts.md)
    * [Transmutes](transmutes.md)
* [Uninitialized Memory](uninitialized.md)
    * [Checked](checked-uninit.md)
    * [Drop Flags](drop-flags.md)
    * [Unchecked](unchecked-uninit.md)
* [Ownership Based Resource Management](obrm.md)
    * [Constructors](constructors.md)
    * [Destructors](destructors.md)
    * [Leaking](leaking.md)
* [Unwinding](unwinding.md)
    * [Exception Safety](exception-safety.md)
    * [Poisoning](poisoning.md)
* [Concurrency](concurrency.md)
    * [Races](races.md)
    * [Send and Sync](send-and-sync.md)
    * [Atomics](atomics.md)
* [Implementing Vec](vec.md)
    * [Layout](vec-layout.md)
    * [Allocating](vec-alloc.md)
    * [Push and Pop](vec-push-pop.md)
    * [Deallocating](vec-dealloc.md)
    * [Deref](vec-deref.md)
    * [Insert and Remove](vec-insert-remove.md)
    * [IntoIter](vec-into-iter.md)
    * [RawVec](vec-raw.md)
    * [Drain](vec-drain.md)
    * [Handling Zero-Sized Types](vec-zsts.md)
    * [Final Code](vec-final.md)
* [Implementing Arc and Mutex](arc-and-mutex.md)

src/doc/tarpl/arc-and-mutex.md

Lines changed: 7 additions & 0 deletions
% Implementing Arc and Mutex

Knowing the theory is all fine and good, but the *best* way to understand
something is to use it. To better understand atomics and interior mutability,
we'll be implementing versions of the standard library's Arc and Mutex types.

TODO: ALL OF THIS OMG

src/doc/tarpl/atomics.md

Lines changed: 250 additions & 0 deletions
% Atomics

Rust pretty blatantly just inherits C11's memory model for atomics. This is not
due to this model being particularly excellent or easy to understand. Indeed,
this model is quite complex and known to have [several flaws][C11-busted].
Rather, it is a pragmatic concession to the fact that *everyone* is pretty bad
at modeling atomics. At the very least, we can benefit from existing tooling
and research around C.

Trying to fully explain the model in this book is fairly hopeless. It's defined
in terms of madness-inducing causality graphs that require a full book to
properly understand in a practical way. If you want all the nitty-gritty
details, you should check out [C's specification (Section 7.17)][C11-model].
Still, we'll try to cover the basics and some of the problems Rust developers
face.

The C11 memory model is fundamentally about trying to bridge the gap between
the semantics we want, the optimizations compilers want, and the inconsistent
chaos our hardware wants. *We* would like to just write programs and have them
do exactly what we said but, you know, *fast*. Wouldn't that be great?
# Compiler Reordering

Compilers fundamentally want to be able to do all sorts of crazy
transformations to reduce data dependencies and eliminate dead code. In
particular, they may radically change the actual order of events, or make
events never occur! If we write something like

```rust,ignore
x = 1;
y = 3;
x = 2;
```

the compiler may conclude that it would *really* be best if your program did

```rust,ignore
x = 2;
y = 3;
```

This has inverted the order of events *and* completely eliminated one event.
From a single-threaded perspective this is completely unobservable: after all
the statements have executed we are in exactly the same state. But if our
program is multi-threaded, we may have been relying on `x` *actually* being
assigned to 1 before `y` was assigned. We would *really* like the compiler to
be able to make these kinds of optimizations, because they can seriously
improve performance. On the other hand, we'd really like to be able to depend
on our program *doing the thing we said*.
# Hardware Reordering

On the other hand, even if the compiler totally understood what we wanted and
respected our wishes, our *hardware* might instead get us in trouble. Trouble
comes from CPUs in the form of memory hierarchies. There is indeed a global
shared memory space somewhere in your hardware, but from the perspective of
each CPU core it is *so very far away* and *so very slow*. Each CPU would
rather work with its local cache of the data, and only go through all the
*anguish* of talking to shared memory when it doesn't actually have that
memory in cache.

After all, that's the whole *point* of the cache, right? If every read from the
cache had to run back to shared memory to double check that it hadn't changed,
what would the point be? The end result is that the hardware doesn't guarantee
that events that occur in some order on *one* thread occur in the same order on
*another* thread. To guarantee this, we must issue special instructions to the
CPU telling it to be a bit less smart.

For instance, say we convince the compiler to emit this logic:

```text
initial state: x = 0, y = 1

THREAD 1        THREAD 2
y = 3;          if x == 1 {
x = 1;              y *= 2;
                }
```

Ideally this program has 2 possible final states:

* `y = 3`: (thread 2 did the check before thread 1 completed)
* `y = 6`: (thread 2 did the check after thread 1 completed)

However there's a third potential state that the hardware enables:

* `y = 2`: (thread 2 saw `x = 1`, but not `y = 3`, and then overwrote `y = 3`)

It's worth noting that different kinds of CPU provide different guarantees. It
is common to separate hardware into two categories: strongly-ordered and
weakly-ordered. Most notably, x86/64 provides strong ordering guarantees, while
ARM provides weak ordering guarantees. This has two consequences for concurrent
programming:

* Asking for stronger guarantees on strongly-ordered hardware may be cheap or
  even *free* because they already provide strong guarantees unconditionally.
  Weaker guarantees may only yield performance wins on weakly-ordered hardware.

* Asking for guarantees that are *too* weak on strongly-ordered hardware is
  more likely to *happen* to work, even though your program is strictly
  incorrect. If possible, concurrent algorithms should be tested on
  weakly-ordered hardware.
# Data Accesses

The C11 memory model attempts to bridge the gap by allowing us to talk about
the *causality* of our program. Generally, this is done by establishing
*happens-before* relationships between parts of the program and the threads
that are running them. This gives the hardware and compiler room to optimize
the program more aggressively where a strict happens-before relationship isn't
established, but forces them to be more careful where one *is* established. The
way we communicate these relationships is through *data accesses* and *atomic
accesses*.

Data accesses are the bread-and-butter of the programming world. They are
fundamentally unsynchronized, and compilers are free to aggressively optimize
them. In particular, data accesses are free to be reordered by the compiler on
the assumption that the program is single-threaded. The hardware is also free
to propagate the changes made in data accesses to other threads as lazily and
inconsistently as it wants. Most critically, data accesses are how data races
happen. Data accesses are very friendly to the hardware and compiler, but as
we've seen they offer *awful* semantics to try to write synchronized code with.
Actually, that's too weak. *It is literally impossible to write correct
synchronized code using only data accesses.*

Atomic accesses are how we tell the hardware and compiler that our program is
multi-threaded. Each atomic access can be marked with an *ordering* that
specifies what kind of relationship it establishes with other accesses. In
practice, this boils down to telling the compiler and hardware certain things
they *can't* do. For the compiler, this largely revolves around re-ordering of
instructions. For the hardware, this largely revolves around how writes are
propagated to other threads. The set of orderings Rust exposes are:

* Sequentially Consistent (SeqCst)
* Release
* Acquire
* Relaxed

(Note: We explicitly do not expose the C11 *consume* ordering)

TODO: negative reasoning vs positive reasoning? TODO: "can't forget to
synchronize"
# Sequentially Consistent

Sequentially Consistent is the most powerful of all, implying the restrictions
of all other orderings. Intuitively, a sequentially consistent operation
*cannot* be reordered: all accesses on one thread that happen before and after
a SeqCst access *stay* before and after it. A data-race-free program that uses
only sequentially consistent atomics and data accesses has the very nice
property that there is a single global execution of the program's instructions
that all threads agree on. This execution is also particularly nice to reason
about: it's just an interleaving of each thread's individual executions. This
*does not* hold if you start using the weaker atomic orderings.

The relative developer-friendliness of sequential consistency doesn't come for
free. Even on strongly-ordered platforms, sequential consistency involves
emitting memory fences.

In practice, sequential consistency is rarely necessary for program
correctness. However, sequential consistency is definitely the right choice if
you're not confident about the other memory orders. Having your program run a
bit slower than it needs to is certainly better than it running incorrectly!
It's also *mechanically* trivial to downgrade atomic operations to have a
weaker consistency later on. Just change `SeqCst` to e.g. `Relaxed` and you're
done! Of course, proving that this transformation is *correct* is a whole
other matter.
# Acquire-Release

Acquire and Release are largely intended to be paired. Their names hint at
their use case: they're perfectly suited for acquiring and releasing locks,
and ensuring that critical sections don't overlap.

Intuitively, an acquire access ensures that every access after it *stays*
after it. However, operations that occur before an acquire are free to be
reordered to occur after it. Similarly, a release access ensures that every
access before it *stays* before it. However, operations that occur after a
release are free to be reordered to occur before it.

When thread A releases a location in memory and then thread B subsequently
acquires *the same* location in memory, causality is established. Every write
that happened *before* A's release will be observed by B *after* its acquire.
However, no causality is established with any other threads. Similarly, no
causality is established if A and B access *different* locations in memory.

Basic use of release-acquire is therefore simple: you acquire a location of
memory to begin the critical section, and then release that location to end
it. For instance, a simple spinlock might look like:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

fn main() {
    let lock = Arc::new(AtomicBool::new(false)); // value answers "am I locked?"

    // ... distribute lock to threads somehow ...

    // Try to acquire the lock by setting it to true
    while lock.compare_and_swap(false, true, Ordering::Acquire) { }
    // broke out of the loop, so we successfully acquired the lock!

    // ... scary data accesses ...

    // ok we're done, release the lock
    lock.store(false, Ordering::Release);
}
```

On strongly-ordered platforms, most accesses have release or acquire
semantics, making release and acquire often totally free. This is not the case
on weakly-ordered platforms.
# Relaxed

Relaxed accesses are the absolute weakest. They can be freely re-ordered and
provide no happens-before relationship. Still, relaxed operations *are*
atomic. That is, they don't count as data accesses, and any read-modify-write
operations done to them occur atomically. Relaxed operations are appropriate
for things that you definitely want to happen, but don't particularly
otherwise care about. For instance, incrementing a counter can be safely done
by multiple threads using a relaxed `fetch_add` if you're not using the
counter to synchronize any other accesses.

There's rarely a benefit in making an operation relaxed on strongly-ordered
platforms, since they usually provide release-acquire semantics anyway.
However, relaxed operations can be cheaper on weakly-ordered platforms.
[C11-busted]: http://plv.mpi-sws.org/c11comp/popl15.pdf
[C11-model]: http://www.open-std.org/jtc1/sc22/wg14/www/standards.html#9899
