The Advanced Rust Programming Language #27032


Merged: 112 commits, Jul 30, 2015
Commits
* `89df25f` first commit (Gankra, Jun 8, 2015)
* `cc5b4d3` add md files (Gankra, Jun 8, 2015)
* `f2a37fc` blurp (Gankra, Jun 8, 2015)
* `39c0473` progress (Gankra, Jun 10, 2015)
* `3318e3d` progress (Gankra, Jun 10, 2015)
* `f8be073` progress (Gankra, Jun 19, 2015)
* `3be2643` stub out more stuff (Gankra, Jun 19, 2015)
* `0213446` progress (Gankra, Jun 19, 2015)
* `74b398b` progress on lifetimes (Gankra, Jun 19, 2015)
* `c3c9d91` progress (Gankra, Jun 19, 2015)
* `fabbd4e` progress (Gankra, Jun 19, 2015)
* `cb4f081` remove ffi and no_std, TRPL's got it (Gankra, Jun 19, 2015)
* `7d41c95` fix data headers (Gankra, Jun 19, 2015)
* `edb29ec` futz with headers more (Gankra, Jun 19, 2015)
* `8f531d9` progress (Gankra, Jun 20, 2015)
* `069681a` vec exmaple maybe (Gankra, Jun 21, 2015)
* `9c87b1f` fix double "however" (Manishearth, Jun 21, 2015)
* `709641b` Tiny typo of "positive" (ben0x539, Jun 21, 2015)
* `7466590` Fix description of integer conversions (petrochenkov, Jun 21, 2015)
* `2bcd58e` Merge pull request #3 from petrochenkov/patch-1 (Gankra, Jun 21, 2015)
* `2d5c1bb` Merge pull request #2 from ben0x539/patch-1 (Gankra, Jun 21, 2015)
* `9997a6e` Merge pull request #1 from Manishearth/patch-1 (Gankra, Jun 21, 2015)
* `8e75c50` community fixups (Gankra, Jun 21, 2015)
* `bb32315` fix paper link (Gankra, Jun 21, 2015)
* `e5d39f7` extra whitespace to render *-list as list (ben0x539, Jun 23, 2015)
* `75621b8` Merge pull request #4 from ben0x539/patch-2 (Gankra, Jun 23, 2015)
* `3287372` so much Vec (Gankra, Jun 24, 2015)
* `af2fd1d` rustbook support (Gankra, Jun 24, 2015)
* `fa6fff5` vec 1.0 (Gankra, Jun 24, 2015)
* `a9143a8` fix vec size_hint (Gankra, Jun 25, 2015)
* `8b60fe9` The Unsafe English Language demands tribute (Manishearth, Jun 25, 2015)
* `3f66928` lowercase level Virtual Machine (Manishearth, Jun 25, 2015)
* `414f730` If you prick a code block, does it not bleed? (Manishearth, Jun 25, 2015)
* `0f92455` Fix Typo (killercup, Jun 25, 2015)
* `77e5f70` Merge pull request #5 from Manishearth/patch-1 (Gankra, Jun 25, 2015)
* `14d11b3` Merge pull request #6 from killercup/patch-1 (Gankra, Jun 25, 2015)
* `7588957` rewrap uninit (Gankra, Jun 25, 2015)
* `d2e802b` tweak usize::MAX (Gankra, Jun 26, 2015)
* `a1f6bbc` niko fixes (Gankra, Jun 26, 2015)
* `65a0c17` poke at data and conversions more (Gankra, Jun 29, 2015)
* `9c6a46b` fiddlin' (Gankra, Jun 29, 2015)
* `10af239` unwinding start (Gankra, Jun 30, 2015)
* `5d4f854` so much unwinding (Gankra, Jun 30, 2015)
* `b26958c` add unwinding section to index (Gankra, Jun 30, 2015)
* `108a697` conversion corrections (Gankra, Jun 30, 2015)
* `18067e4` start on proper atomics (Gankra, Jul 1, 2015)
* `ccb08a5` TODO (Gankra, Jul 1, 2015)
* `e4f718a` uninit cleanup (Gankra, Jul 1, 2015)
* `dd98edd` lifetiiiiimes (Gankra, Jul 3, 2015)
* `a42a415` rework unsafe intro to be 1000% more adorable (Gankra, Jul 3, 2015)
* `31adad6` SHARD ALL THE CHAPTERS (Gankra, Jul 7, 2015)
* `59ff3a3` expand on ctors (Gankra, Jul 7, 2015)
* `adcd30c` mdinger fix (Gankra, Jul 7, 2015)
* `5ec12b1` cleanup (Gankra, Jul 7, 2015)
* `35b8001` cleanup (Gankra, Jul 7, 2015)
* `987a868` split out and rework drop flags section (Gankra, Jul 7, 2015)
* `50d8656` void types (Gankra, Jul 7, 2015)
* `2e653f3` shard out concurrency (Gankra, Jul 7, 2015)
* `d8f460c` improve joke (Gankra, Jul 7, 2015)
* `668bdd3` flesh out atomics (Gankra, Jul 8, 2015)
* `778a4fa` new chapter (Gankra, Jul 8, 2015)
* `498e44d` new chapter for reals (Gankra, Jul 8, 2015)
* `fcf4a7e` oops (Gankra, Jul 8, 2015)
* `29e71b9` niko discussion affects (Gankra, Jul 8, 2015)
* `e167ee8` typos (mdinger, Jul 8, 2015)
* `4f6b141` Merge pull request #10 from mdinger/patch-2 (Gankra, Jul 8, 2015)
* `acd3c59` fix typo (Gankra, Jul 8, 2015)
* `62e827c` rewrite intro (Gankra, Jul 13, 2015)
* `cbc6408` fix (Gankra, Jul 13, 2015)
* `d66c67b` clarify atomics (Gankra, Jul 14, 2015)
* `bdc62e0` fix definition (Gankra, Jul 14, 2015)
* `d4268f9` shard out and clean up unwinding (Gankra, Jul 14, 2015)
* `d96a518` several fixups (Gankra, Jul 14, 2015)
* `667afb8` remove salsh (Gankra, Jul 14, 2015)
* `c7919f2` remove chaff (Gankra, Jul 14, 2015)
* `a54e64b` move everything to tarpl (Gankra, Jul 14, 2015)
* `e2b5f4f` move everything into the Rust tree (Gankra, Jul 14, 2015)
* `04578f6` update build to make tarpl (Gankra, Jul 14, 2015)
* `dba548d` fix via mdinger (Gankra, Jul 14, 2015)
* `58f6f2d` nits and realigning (Gankra, Jul 14, 2015)
* `7aee844` fix all the doc tests (Gankra, Jul 14, 2015)
* `700895f` split out vec-zsts correctly (Gankra, Jul 14, 2015)
* `c5a1b87` properly remove moved text (Gankra, Jul 15, 2015)
* `d1b899e` update subtyping to be a bit clearer about reference variance (Gankra, Jul 17, 2015)
* `eba459a` shard out misc section on lifetimes properly (Gankra, Jul 18, 2015)
* `fc2d294` no really I deleted you (Gankra, Jul 18, 2015)
* `b79d279` fix typo (Gankra, Jul 18, 2015)
* `c97673c` fix up lifetimes (Gankra, Jul 18, 2015)
* `13b2605` fixup and cool example for checked-uninit (Gankra, Jul 20, 2015)
* `94a89e5` some conversions cleanup (Gankra, Jul 20, 2015)
* `42c2f10` flesh out void types (Gankra, Jul 20, 2015)
* `5f6e0ab` clean up vec chapter of tarpl (Gankra, Jul 20, 2015)
* `99043dd` mention void pointers (Gankra, Jul 20, 2015)
* `7a47ffc` UB is src bzns (Gankra, Jul 20, 2015)
* `0a36ea7` get into the weeds over GEP and allocations (Gankra, Jul 20, 2015)
* `06ded9c` explain phantom (Gankra, Jul 20, 2015)
* `14bc454` remove redundant explanation (Gankra, Jul 20, 2015)
* `5f02de3` clarify casts are checked at compile time (Gankra, Jul 20, 2015)
* `3f8e029` remove subtyping from coercions, it's something else (Gankra, Jul 20, 2015)
* `f54c5ad` fix accident (Gankra, Jul 24, 2015)
* `36a8b94` expand lifetime splitting to show IterMut is totally safe (Gankra, Jul 27, 2015)
* `8c7111d` fixup atomics (Gankra, Jul 27, 2015)
* `b53406f` fixups for aturon (Gankra, Jul 27, 2015)
* `fd13bdf` vec fixes for huonw (Gankra, Jul 27, 2015)
* `05bb1db` OBRM for aturon (Gankra, Jul 27, 2015)
* `5789106` many many pnkfelix fixes (Gankra, Jul 28, 2015)
* `0d37e78` lots more felix fixes (Gankra, Jul 28, 2015)
* `b93438f` fix incorrect name (Gankra, Jul 28, 2015)
* `9123bb0` fix borrow-splitting (Gankra, Jul 28, 2015)
* `b539906` clarify subtyping (Gankra, Jul 28, 2015)
* `4c48ffa` add warning about reference section (Gankra, Jul 29, 2015)
* `ddb0290` fix example code (Gankra, Jul 30, 2015)
9 changes: 8 additions & 1 deletion mk/docs.mk
```diff
@@ -77,7 +77,7 @@ ERR_IDX_GEN = $(RPATH_VAR2_T_$(CFG_BUILD)_H_$(CFG_BUILD)) $(ERR_IDX_GEN_EXE)
 
 D := $(S)src/doc
 
-DOC_TARGETS := trpl style error-index
+DOC_TARGETS := trpl tarpl style error-index
 COMPILER_DOC_TARGETS :=
 DOC_L10N_TARGETS :=
 
@@ -287,6 +287,13 @@ doc/book/index.html: $(RUSTBOOK_EXE) $(wildcard $(S)/src/doc/trpl/*.md) | doc/
 	$(Q)rm -rf doc/book
 	$(Q)$(RUSTBOOK) build $(S)src/doc/trpl doc/book
 
+tarpl: doc/adv-book/index.html
+
+doc/adv-book/index.html: $(RUSTBOOK_EXE) $(wildcard $(S)/src/doc/tarpl/*.md) | doc/
+	@$(call E, rustbook: $@)
+	$(Q)rm -rf doc/adv-book
+	$(Q)$(RUSTBOOK) build $(S)src/doc/tarpl doc/adv-book
+
 style: doc/style/index.html
 
 doc/style/index.html: $(RUSTBOOK_EXE) $(wildcard $(S)/src/doc/style/*.md) | doc/
```
3 changes: 2 additions & 1 deletion mk/tests.mk
```diff
@@ -162,7 +162,8 @@ $(foreach doc,$(DOCS), \
 	$(eval $(call DOCTEST,md-$(doc),$(S)src/doc/$(doc).md)))
 $(foreach file,$(wildcard $(S)src/doc/trpl/*.md), \
 	$(eval $(call DOCTEST,$(file:$(S)src/doc/trpl/%.md=trpl-%),$(file))))
-
+$(foreach file,$(wildcard $(S)src/doc/tarpl/*.md), \
+	$(eval $(call DOCTEST,$(file:$(S)src/doc/tarpl/%.md=tarpl-%),$(file))))
 ######################################################################
 # Main test targets
 ######################################################################
```
39 changes: 39 additions & 0 deletions src/doc/tarpl/README.md
% The Advanced Rust Programming Language

# NOTE: This is a draft document, and may contain serious errors

So you've played around with Rust a bit. You've written a few simple programs and
you think you grok the basics. Maybe you've even read through
*[The Rust Programming Language][trpl]*. Now you want to get neck-deep in all the
nitty-gritty details of the language. You want to know those weird corner-cases.
You want to know what the heck `unsafe` really means, and how to properly use it.
This is the book for you.

To be clear, this book goes into *serious* detail. We're going to dig into
exception-safety and pointer aliasing. We're going to talk about memory
models. We're even going to do some type-theory. This is stuff that you
absolutely *don't* need to know to write fast and safe Rust programs.
You could probably close this book *right now* and still have a productive
and happy career in Rust.

However if you intend to write unsafe code -- or just *really* want to dig into
the guts of the language -- this book contains *invaluable* information.

Unlike *The Rust Programming Language*, we *will* be assuming considerable prior
knowledge. In particular, you should be comfortable with:

* Basic Systems Programming:
    * Pointers
    * [The stack and heap][]
    * The memory hierarchy (caches)
    * Threads

* [Basic Rust][]

Due to the nature of advanced Rust programming, we will be spending a lot of time
talking about *safety* and *guarantees*. In particular, a significant portion of
the book will be dedicated to correctly writing and understanding Unsafe Rust.

[trpl]: ../book/
[The stack and heap]: ../book/the-stack-and-the-heap.html
[Basic Rust]: ../book/syntax-and-semantics.html
53 changes: 53 additions & 0 deletions src/doc/tarpl/SUMMARY.md
# Summary

* [Meet Safe and Unsafe](meet-safe-and-unsafe.md)
    * [How Safe and Unsafe Interact](safe-unsafe-meaning.md)
    * [Working with Unsafe](working-with-unsafe.md)
* [Data Layout](data.md)
    * [repr(Rust)](repr-rust.md)
    * [Exotically Sized Types](exotic-sizes.md)
    * [Other reprs](other-reprs.md)
* [Ownership](ownership.md)
    * [References](references.md)
    * [Lifetimes](lifetimes.md)
    * [Limits of lifetimes](lifetime-mismatch.md)
    * [Lifetime Elision](lifetime-elision.md)
    * [Unbounded Lifetimes](unbounded-lifetimes.md)
    * [Higher-Rank Trait Bounds](hrtb.md)
    * [Subtyping and Variance](subtyping.md)
    * [Drop Check](dropck.md)
    * [PhantomData](phantom-data.md)
    * [Splitting Borrows](borrow-splitting.md)
* [Type Conversions](conversions.md)
    * [Coercions](coercions.md)
    * [The Dot Operator](dot-operator.md)
    * [Casts](casts.md)
    * [Transmutes](transmutes.md)
* [Uninitialized Memory](uninitialized.md)
    * [Checked](checked-uninit.md)
    * [Drop Flags](drop-flags.md)
    * [Unchecked](unchecked-uninit.md)
* [Ownership Based Resource Management](obrm.md)
    * [Constructors](constructors.md)
    * [Destructors](destructors.md)
    * [Leaking](leaking.md)
* [Unwinding](unwinding.md)
    * [Exception Safety](exception-safety.md)
    * [Poisoning](poisoning.md)
* [Concurrency](concurrency.md)
    * [Races](races.md)
    * [Send and Sync](send-and-sync.md)
    * [Atomics](atomics.md)
* [Implementing Vec](vec.md)
    * [Layout](vec-layout.md)
    * [Allocating](vec-alloc.md)
    * [Push and Pop](vec-push-pop.md)
    * [Deallocating](vec-dealloc.md)
    * [Deref](vec-deref.md)
    * [Insert and Remove](vec-insert-remove.md)
    * [IntoIter](vec-into-iter.md)
    * [RawVec](vec-raw.md)
    * [Drain](vec-drain.md)
    * [Handling Zero-Sized Types](vec-zsts.md)
    * [Final Code](vec-final.md)
* [Implementing Arc and Mutex](arc-and-mutex.md)
7 changes: 7 additions & 0 deletions src/doc/tarpl/arc-and-mutex.md
% Implementing Arc and Mutex

Knowing the theory is all fine and good, but the *best* way to understand
something is to use it. To better understand atomics and interior mutability,
we'll be implementing versions of the standard library's Arc and Mutex types.

TODO: ALL OF THIS OMG
Review comment (Member) on `TODO: ALL OF THIS OMG`: Replace with "Coming Soon"?

Reply (Member): NO OMG THE HYPE IS REAL

250 changes: 250 additions & 0 deletions src/doc/tarpl/atomics.md
% Atomics

Rust pretty blatantly just inherits C11's memory model for atomics. This is not
due to this model being particularly excellent or easy to understand. Indeed,
this model is quite complex and known to have [several flaws][C11-busted].
Rather, it is a pragmatic concession to the fact that *everyone* is pretty bad
at modeling atomics. At the very least, we can benefit from existing tooling
and research around C.

Trying to fully explain the model in this book is fairly hopeless. It's defined
in terms of madness-inducing causality graphs that require a full book to
properly understand in a practical way. If you want all the nitty-gritty
details, you should check out [C's specification (Section 7.17)][C11-model].
Still, we'll try to cover the basics and some of the problems Rust developers
face.

The C11 memory model is fundamentally about trying to bridge the gap between the
semantics we want, the optimizations compilers want, and the inconsistent chaos
our hardware wants. *We* would like to just write programs and have them do
exactly what we said but, you know, *fast*. Wouldn't that be great?




# Compiler Reordering

Compilers fundamentally want to be able to do all sorts of crazy transformations
to reduce data dependencies and eliminate dead code. In particular, they may
radically change the actual order of events, or make events never occur! If we
write something like

```rust,ignore
x = 1;
y = 3;
x = 2;
```

The compiler may conclude that it would *really* be best if your program did

```rust,ignore
x = 2;
y = 3;
```

This has inverted the order of events *and* completely eliminated one event.
From a single-threaded perspective this is completely unobservable: after all
the statements have executed we are in exactly the same state. But if our
program is multi-threaded, we may have been relying on `x` to *actually* be
assigned to 1 before `y` was assigned. We would *really* like the compiler to be
able to make these kinds of optimizations, because they can seriously improve
performance. On the other hand, we'd really like to be able to depend on our
program *doing the thing we said*.




# Hardware Reordering

On the other hand, even if the compiler totally understood what we wanted and
respected our wishes, our *hardware* might instead get us in trouble. Trouble
comes from CPUs in the form of memory hierarchies. There is indeed a global
shared memory space somewhere in your hardware, but from the perspective of each
CPU core it is *so very far away* and *so very slow*. Each CPU would rather work
with its local cache of the data and only go through all the *anguish* of
talking to shared memory when it doesn't actually have that memory in
cache.

After all, that's the whole *point* of the cache, right? If every read from the
cache had to run back to shared memory to double check that it hadn't changed,
what would the point be? The end result is that the hardware doesn't guarantee
that events that occur in the same order on *one* thread, occur in the same
order on *another* thread. To guarantee this, we must issue special instructions
to the CPU telling it to be a bit less smart.

For instance, say we convince the compiler to emit this logic:

```text
initial state: x = 0, y = 1

THREAD 1        THREAD 2
y = 3;          if x == 1 {
x = 1;              y *= 2;
                }
```

Ideally this program has 2 possible final states:

* `y = 3`: (thread 2 did the check before thread 1 completed)
* `y = 6`: (thread 2 did the check after thread 1 completed)

However there's a third potential state that the hardware enables:

* `y = 2`: (thread 2 saw `x = 1`, but not `y = 3`, and then overwrote `y = 3`)
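For concreteness, here is a sketch of that program in Rust (our construction,
not part of the original text), using `Relaxed` atomics so the cross-thread
accesses are legal at all. Any of the three outcomes above is permitted:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Runs the two-thread program once and returns the final value of y.
fn run_demo() -> usize {
    let x = Arc::new(AtomicUsize::new(0));
    let y = Arc::new(AtomicUsize::new(1));

    let t1 = {
        let (x, y) = (x.clone(), y.clone());
        thread::spawn(move || {
            y.store(3, Ordering::Relaxed);
            x.store(1, Ordering::Relaxed);
        })
    };
    let t2 = {
        let (x, y) = (x.clone(), y.clone());
        thread::spawn(move || {
            if x.load(Ordering::Relaxed) == 1 {
                let old = y.load(Ordering::Relaxed);
                y.store(old * 2, Ordering::Relaxed);
            }
        })
    };
    t1.join().unwrap();
    t2.join().unwrap();

    y.load(Ordering::Relaxed)
}

fn main() {
    let final_y = run_demo();
    // 3 and 6 are the "intuitive" outcomes; 2 is also legal under
    // weak ordering, as described above.
    assert!(final_y == 2 || final_y == 3 || final_y == 6);
}
```

In practice you will almost never observe `y = 2` on strongly-ordered hardware
like x86, which is exactly the testing hazard described below.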

It's worth noting that different kinds of CPU provide different guarantees. It
is common to separate hardware into two categories: strongly-ordered and
weakly-ordered. Most notably x86/64 provides strong ordering guarantees, while
ARM provides weak ordering guarantees. This has two consequences for concurrent
programming:

* Asking for stronger guarantees on strongly-ordered hardware may be cheap or
even *free* because they already provide strong guarantees unconditionally.
Weaker guarantees may only yield performance wins on weakly-ordered hardware.

* Asking for guarantees that are *too* weak on strongly-ordered hardware is
more likely to *happen* to work, even though your program is strictly
incorrect. If possible, concurrent algorithms should be tested on
weakly-ordered hardware.





# Data Accesses

The C11 memory model attempts to bridge the gap by allowing us to talk about the
*causality* of our program. Generally, this is done by establishing
*happens-before* relationships between parts of the program and the threads
that are running them. This gives the hardware and compiler room to optimize
the program more aggressively where a strict happens-before relationship isn't
established, but forces them to be more careful where one *is* established. The
way we communicate these relationships is through *data accesses* and *atomic
accesses*.

Data accesses are the bread-and-butter of the programming world. They are
fundamentally unsynchronized and compilers are free to aggressively optimize
them. In particular, data accesses are free to be reordered by the compiler on
the assumption that the program is single-threaded. The hardware is also free to
propagate the changes made in data accesses to other threads as lazily and
inconsistently as it wants. Mostly critically, data accesses are how data races
happen. Data accesses are very friendly to the hardware and compiler, but as
we've seen they offer *awful* semantics to try to write synchronized code with.
Actually, that's too weak. *It is literally impossible to write correct
synchronized code using only data accesses*.

Atomic accesses are how we tell the hardware and compiler that our program is
multi-threaded. Each atomic access can be marked with an *ordering* that
specifies what kind of relationship it establishes with other accesses. In
practice, this boils down to telling the compiler and hardware certain things
they *can't* do. For the compiler, this largely revolves around re-ordering of
instructions. For the hardware, this largely revolves around how writes are
propagated to other threads. The set of orderings Rust exposes is:

* Sequentially Consistent (SeqCst)
* Release
* Acquire
* Relaxed

(Note: We explicitly do not expose the C11 *consume* ordering)

TODO: negative reasoning vs positive reasoning? TODO: "can't forget to
synchronize"
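Concretely, every operation on an atomic type takes one of these orderings as
an explicit argument. A minimal illustration of the API shape (single-threaded,
so the particular orderings don't matter here):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

fn demo() -> usize {
    let a = AtomicUsize::new(0);

    // Plain stores and loads each carry an ordering.
    a.store(1, Ordering::Release);
    let v = a.load(Ordering::Acquire);
    assert_eq!(v, 1);

    // Read-modify-write operations like fetch_add happen as a single
    // atomic step, and return the previous value.
    let old = a.fetch_add(10, Ordering::SeqCst);
    assert_eq!(old, 1);

    a.load(Ordering::Relaxed)
}

fn main() {
    assert_eq!(demo(), 11);
}
```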



# Sequentially Consistent

Sequentially Consistent is the most powerful of all, implying the restrictions
of all other orderings. Intuitively, a sequentially consistent operation
*cannot* be reordered: all accesses on one thread that happen before and after a
SeqCst access *stay* before and after it. A data-race-free program that uses
only sequentially consistent atomics and data accesses has the very nice
property that there is a single global execution of the program's instructions
that all threads agree on. This execution is also particularly nice to reason
about: it's just an interleaving of each thread's individual executions. This
*does not* hold if you start using the weaker atomic orderings.

The relative developer-friendliness of sequential consistency doesn't come for
free. Even on strongly-ordered platforms sequential consistency involves
emitting memory fences.

In practice, sequential consistency is rarely necessary for program correctness.
However sequential consistency is definitely the right choice if you're not
confident about the other memory orders. Having your program run a bit slower
than it needs to is certainly better than it running incorrectly! It's also
*mechanically* trivial to downgrade atomic operations to have a weaker
consistency later on. Just change `SeqCst` to e.g. `Relaxed` and you're done! Of
course, proving that this transformation is *correct* is a whole other matter.
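The classic "store buffering" litmus test shows what sequential consistency
buys you: two threads each store to their own flag and then load the other's.
Under SeqCst there is a single global order of the four operations, so at least
one thread must observe the other's store; with weaker orderings both loads
could legally return 0. A sketch (our example, not from the original text):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Returns what each thread loaded from the other thread's flag.
fn run_once() -> (usize, usize) {
    let x = Arc::new(AtomicUsize::new(0));
    let y = Arc::new(AtomicUsize::new(0));

    let t1 = {
        let (x, y) = (x.clone(), y.clone());
        thread::spawn(move || {
            x.store(1, Ordering::SeqCst);
            y.load(Ordering::SeqCst)
        })
    };
    let t2 = {
        let (x, y) = (x.clone(), y.clone());
        thread::spawn(move || {
            y.store(1, Ordering::SeqCst);
            x.load(Ordering::SeqCst)
        })
    };
    (t1.join().unwrap(), t2.join().unwrap())
}

fn main() {
    for _ in 0..100 {
        let (saw_y, saw_x) = run_once();
        // Whichever store comes first in the single global order is
        // visible to the other thread's load: both can't read 0.
        assert!(saw_y == 1 || saw_x == 1);
    }
}
```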




# Acquire-Release

Acquire and Release are largely intended to be paired. Their names hint at their
use case: they're perfectly suited for acquiring and releasing locks, and
ensuring that critical sections don't overlap.

Intuitively, an acquire access ensures that every access after it *stays* after
it. However operations that occur before an acquire are free to be reordered to
occur after it. Similarly, a release access ensures that every access before it
*stays* before it. However operations that occur after a release are free to be
reordered to occur before it.

When thread A releases a location in memory and then thread B subsequently
acquires *the same* location in memory, causality is established. Every write
that happened *before* A's release will be observed by B *after* its acquire.
However no causality is established with any other threads. Similarly, no
causality is established if A and B access *different* locations in memory.

Basic use of release-acquire is therefore simple: you acquire a location of
memory to begin the critical section, and then release that location to end it.
For instance, a simple spinlock might look like:

```rust
use std::sync::Arc;
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

fn main() {
    let lock = Arc::new(AtomicBool::new(true)); // true means "unlocked"

    // ... distribute lock to threads somehow ...

    // Try to acquire the lock by swapping in `false` ("locked")
    while !lock.compare_and_swap(true, false, Ordering::Acquire) { }
    // broke out of the loop, so we successfully acquired the lock!

    // ... scary data accesses ...

    // ok we're done, release the lock
    lock.store(true, Ordering::Release);
}
```

On strongly-ordered platforms most accesses have release or acquire semantics,
making release and acquire often totally free. This is not the case on
weakly-ordered platforms.
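To actually protect non-atomic data with that spinlock protocol you need
`UnsafeCell` and an `unsafe impl Sync`, which is exactly the kind of thing this
book is about. A sketch of a complete use (the `SpinCounter` type and its
details are ours; we also use `compare_exchange`, the non-deprecated form of
the `compare_and_swap` shown above):

```rust
use std::cell::UnsafeCell;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

// A counter protected by the spinlock protocol above.
// `true` in `lock` means "unlocked".
struct SpinCounter {
    lock: AtomicBool,
    count: UnsafeCell<usize>,
}

// Sound only because every access to `count` happens between an
// Acquire of the lock and the matching Release.
unsafe impl Sync for SpinCounter {}

fn locked_count(threads: usize, iters: usize) -> usize {
    let shared = Arc::new(SpinCounter {
        lock: AtomicBool::new(true),
        count: UnsafeCell::new(0),
    });

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let shared = shared.clone();
            thread::spawn(move || {
                for _ in 0..iters {
                    // Spin until we swap true (unlocked) for false (locked).
                    while shared
                        .lock
                        .compare_exchange(true, false, Ordering::Acquire, Ordering::Relaxed)
                        .is_err()
                    {}
                    // Critical section: we hold the lock.
                    unsafe { *shared.count.get() += 1; }
                    shared.lock.store(true, Ordering::Release);
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    unsafe { *shared.count.get() }
}

fn main() {
    // No increments are lost: the lock serializes the critical sections.
    assert_eq!(locked_count(4, 1000), 4000);
}
```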




# Relaxed

Relaxed accesses are the absolute weakest. They can be freely re-ordered and
provide no happens-before relationship. Still, relaxed operations are
atomic. That is, they don't count as data accesses and any read-modify-write
operations done to them occur atomically. Relaxed operations are appropriate for
things that you definitely want to happen, but don't particularly otherwise care
about. For instance, incrementing a counter can be safely done by multiple
threads using a relaxed `fetch_add` if you're not using the counter to
synchronize any other accesses.

There's rarely a benefit in making an operation relaxed on strongly-ordered
platforms, since they usually provide release-acquire semantics anyway. However
relaxed operations can be cheaper on weakly-ordered platforms.
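The counter described above might look like this. Even with `Relaxed`, no
increments are lost, because each `fetch_add` is a single atomic
read-modify-write:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

fn relaxed_count(threads: usize, iters: usize) -> usize {
    let counter = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = counter.clone();
            thread::spawn(move || {
                for _ in 0..iters {
                    // Relaxed: the increment itself must be atomic,
                    // but it doesn't synchronize any other memory.
                    counter.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    counter.load(Ordering::Relaxed)
}

fn main() {
    // Every increment is observed exactly once.
    assert_eq!(relaxed_count(8, 1000), 8000);
}
```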





[C11-busted]: http://plv.mpi-sws.org/c11comp/popl15.pdf
[C11-model]: http://www.open-std.org/jtc1/sc22/wg14/www/standards.html#9899