Skip to content

Commit 06aadb4

Browse files
committed
a sound {bare_metal,cortex_m,etc}::Mutex
1 parent aad69b4 commit 06aadb4

File tree

1 file changed

+281
-0
lines changed

1 file changed

+281
-0
lines changed

rfcs/0000-sound-mutex.md

Lines changed: 281 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,281 @@
1+
- Affected components: `bare-metal`; `cortex-m` and similar
2+
- Feature Name: sound_mutex
3+
- Start Date: 2019-10-23
4+
- RFC PR: (leave this empty)
5+
- Rust Issue: (leave this empty)
6+
7+
# Summary
8+
9+
`bare_metal::Mutex`, re-exported in the `cortex-m` and other crates, is a mutex
10+
based on critical sections that temporarily disable all interrupts. As provided
11+
today this abstraction is unsound in multi-core context because `Mutex` can be
12+
stored in `static` variables, which are visible to all cores, but interrupt
13+
masking is *not* sufficient to synchronize (potentially) parallel access (i.e.
14+
access from different cores) to memory.
15+
16+
This document proposes that we deprecate the existing `Mutex` abstraction in
17+
favor of a mutex that properly expresses the idea that mutexes based on
18+
interrupt-masking is only "Sync" in single-core context.
19+
20+
# Motivation
21+
22+
Today it's possible to write multi-core applications for homogeneous multi-core
23+
devices on stable Rust, as evidenced in the [lpcxpresso55S69] repository. In
24+
fact, it has been possible to write multi-core Cortex-M applications since Rust
25+
1.30.0, the release in which it became possible to write single-core Cortex-M
26+
applications on stable.
27+
28+
[lpcxpresso55S69]: https://github.com/japaric/lpcxpresso55S69
29+
30+
Writing homogeneous multi-core applications requires no special build steps and
31+
these programs make use of the same compilation targets used for single-core
32+
applications (e.g. `thumbv7m-none-eabi`). For this reason, it is not possible to
33+
restrict the parts of crates like `cortex-m` that these multi-core applications
34+
can use. Therefore, these crates can *not* assume that they will only be used in
35+
single-core contexts as that may to lead to abstractions that are single-core
36+
sound but multi-core unsound.
37+
38+
`cortex_m::Mutex` is one abstraction that assumes a single-core context. The
39+
following example demonstrates how this "safe" abstraction leads to a data race
40+
in a multi-core application.
41+
42+
``` rust
43+
use core::cell::Cell;
44+
use cortex_m::{Mutex, interrupt};
45+
46+
static X: Mutex<Cell<u64>> = Mutex::new(Cell::new(0));
47+
48+
#[entry(0)] // entry point of core #0
49+
fn main0() {
50+
interrupt::free(|cs| loop {
51+
let x = X.borrow(cs);
52+
x.set(x.get() + 1); // (A)
53+
});
54+
}
55+
56+
#[entry(1)]
57+
fn main1() {
58+
interrupt::free(|cs| loop {
59+
let x = X.borrow(cs);
60+
x.set(!x.get()); // (B)
61+
});
62+
}
63+
```
64+
65+
Here the statements A and B run in parallel because disabling the interrupts on
66+
one core has no effect on other cores.
67+
68+
Proliferation of this unsound `Mutex` API, specially in libraries, will
69+
inevitably lead to hard to debug soundness issues in multi-core programs (the
70+
root of the problem could be in a crate deep in the dependency graph). It is
71+
important to promptly remove this unsound API before multi-core applications
72+
become more widespread.
73+
74+
# Detailed design
75+
76+
## `SingleCore*`
77+
78+
This document proposes adopting a new set of `Send` / `Sync` traits that
79+
are "only needs to be sound in single-core context" variants of the ones
80+
provided by the `core` crate. The `Send` and `Sync` traits in `core` will
81+
continue to mean "must be sound in single-core AND multi-core context". These
82+
traits will go into a separate crate so they can be used by other libraries and
83+
frameworks. The full implementation of these traits is shown below -- all names
84+
are up for debate:
85+
86+
``` rust
87+
// crate: single-core-send-sync
88+
89+
pub auto trait SingleCoreSend {}
90+
91+
pub auto trait SingleCoreSync {}
92+
93+
// replicate all the `Send` / `Sync` impls in core
94+
impl<T> !SingleCoreSync for *mut T {}
95+
impl<T> !SingleCoreSync for *const T {}
96+
unsafe impl<'a, T> SingleCoreSend for &'a T where T: SingleCoreSync {}
97+
unsafe impl<'a, T> SingleCoreSend for &'a mut T where T: SingleCoreSync {}
98+
// ..
99+
100+
// all multi-core Send types are also single-core Send
101+
unsafe impl<T> SingleCoreSend for T where T: Send {}
102+
103+
// all multi-core Sync types are also single-core Sync
104+
unsafe impl<T> SingleCoreSync for T where T: Sync {}
105+
```
106+
107+
With these traits in place the existing `Mutex` can be replaced with one that
108+
implements `SingleCoreSync`, but not `Sync`.
109+
110+
``` rust
111+
// crate: bare-metal
112+
113+
pub struct SingleCoreMutex<T> { /* .. */ }
114+
115+
unsafe impl<T> SingleCoreSync for SingleCoreMutex<T> {}
116+
117+
impl<T> SingleCoreMutex<T> {
118+
pub const fn new(data: T) -> Self {
119+
// ..
120+
}
121+
122+
// .. and the rest of the `Mutex` API (e.g. `borrow`) ..
123+
}
124+
```
125+
126+
### Usage
127+
128+
It is not possible to instantiate the `SingleCoreMutex` type in a static
129+
variable because it does not implement the `Sync` trait -- this is a soundness
130+
requirement and not a shortcoming; but existing and upcoming concurrency
131+
frameworks / libraries can loosen their `Send` bounds into `SingleCoreSend`
132+
bounds to accept `SingleCoreMutex` values.
133+
134+
The code snippet below shows an hypothetical API to dynamically register
135+
interrupt handlers. This API uses the `SingleCoreSend` trait instead of the
136+
`Send` bound because it doesn't enable multi-core concurrency (the registered
137+
interrupt handlers will run on core that registered the handler).
138+
139+
``` rust
140+
/// registers an interrupt handler in the vector table of the _calling_ core
141+
fn register(interrupt: Interrupt, handler: F)
142+
where
143+
F: FnMut() + SingleCoreSend + 'static,
144+
// ^^^^^^^^^^^^^^
145+
{
146+
// ..
147+
}
148+
149+
#[entry]
150+
fn main() {
151+
// remember: there's a source code level transformation here; `X` and `Y`
152+
// have type `&'static mut _`
153+
static mut X: Mutex<Cell<u64>> = Mutex::new(Cell::new(0u64));
154+
static mut Y: AtomicU32 = AtomicU32::new(0);
155+
156+
let x: &'static _ = X; // coerces `&'static mut T` into `&'static T`
157+
158+
register(Interrupt::A, || {
159+
let z: &'static Mutex<Cell<u64>> = x; // `x` is copied here
160+
// ^^^^^^^^^^^^^^^^^^^^^^^^^ OK; this is a `SingleCoreSend` type
161+
162+
// .. do stuff with `z` ..
163+
});
164+
165+
let y: &'static _ = Y;
166+
167+
register(Interrupt::B, || {
168+
let z: &'static AtomicU32 = y; // `y` is copied here
169+
// ^^^^^^^^^^^^^^^^^^ OK; this is a `Send` type
170+
171+
// .. do stuff with `z` ..
172+
});
173+
174+
loop {
175+
// do stuff with `x` and `y`
176+
}
177+
}
178+
```
179+
180+
Here the statically allocated variables, `X` and `Y`, are accessed concurrently
181+
from `main` and the interrupt handlers `A` and `B`. All accesses are performed
182+
by the same core so `SingleCoreSend` is the right bound to use and it's OK to
183+
use the `SingleCoreMutex`.
184+
185+
### Blockers
186+
187+
The `SingleCore*` API depends on the `auto trait` feature, which as of Rust
188+
1.38.0 is still unstable (gated behind the `optin_builtin_traits` feature gate).
189+
Thus the implementation of this API will need to wait until the feature is
190+
stabilized. One could implement this API behind a Cargo feature that requires
191+
nightly Rust but that would hinder its adoption so this proposal advises against
192+
doing that.
193+
194+
## Intermediate step
195+
196+
Because the `SingleCore*` API is, time-wise, far away this document proposes, as
197+
an intermediate step, landing an `unsafe` `RacyMutex` type and deprecating /
198+
removing `Mutex` *today*.
199+
200+
`RacyMutex` will pretty much replicate the API of `Mutex`, including its `Sync`
201+
implementation, but will have an `unsafe` constructor. The safety contract of
202+
the constructor is that the value has to be created in a single-core context;
203+
only within that context the rest of the `RacyMutex` API will be sound.
204+
205+
The main parts of the `RacyMutex` API are shown below:
206+
207+
``` rust
208+
// crate: bare-metal
209+
210+
/// A mutex that's racy in multi-core contexts
211+
pub struct RacyMutex<T> { /* .. */ }
212+
213+
/// IMPORTANT: `RacyMutex` is only `Sync` in a single-core context
214+
unsafe impl<T> Sync for RacyMutex<T> {}
215+
216+
impl<T> RacyMutex<T> {
217+
/// Creates a new `RacyMutex`
218+
///
219+
/// # Safety
220+
///
221+
/// By constructing a `RacyMutex` the caller is asserting that the value
222+
/// will only be used in a single-core context; this is the case, for
223+
/// example, if the constructor is called in a single-core application.
224+
///
225+
/// Using `RacyMutex` in a multi-core context is unsound. For that reason,
226+
/// `RacyMutex` must NEVER be used in a general purpose library; it should
227+
/// only be used in (a) HALs for single-core devices, and (b) applications
228+
/// for single-core devices.
229+
pub const unsafe fn new(data: T) -> Self {
230+
// ..
231+
}
232+
233+
234+
// .. and the rest of the `Mutex` API ..
235+
}
236+
```
237+
238+
### Migration path
239+
240+
Existing applications that use `bare_metal::Mutex`-es in static variables can
241+
easily migrate to `RacyMutex` by wrapping the constructors in an `unsafe` block
242+
and replacing `Mutex` with `RacyMutex`.
243+
244+
``` rust
245+
// this
246+
// static SHALL_WE_DANCE: Mutex<RefCell<Option<Partner>>> = Mutex::new(/* .. */);
247+
248+
// becomes
249+
static SHALL_WE_DANCE: RacyMutex<RefCell<Option<Partner>>> = unsafe {
250+
// NOTE(unsafe): this is a single-core application
251+
RacyMutex::new(/* .. */)
252+
};
253+
254+
#[entry]
255+
fn main() {
256+
// ..
257+
}
258+
259+
#[exception]
260+
fn SysTick() {
261+
interrupt::free(|cs| {
262+
// this stays the same (safe)
263+
let yes = SHALL_WE_DANCE.borrow(cs).borrow_mut().take().unwrap();
264+
// ..
265+
});
266+
}
267+
```
268+
269+
# Unresolved questions
270+
271+
- Bikeshed all names
272+
273+
- Should `RacyMutex` be removed once the `SingleCore*` API has been implemented?
274+
Or should both co-exist side by side?
275+
276+
# Alternatives
277+
278+
- Come up with a interrupt-masking / spinlock hybrid implementation for `Mutex`
279+
that's both single-core and multi-core `Sync`.
280+
281+
- Remove `bare_metal::Mutex` without replacement.

0 commit comments

Comments
 (0)