Address Multi-Core Soundness by abolishing Send and Sync #419


Merged — 7 commits, Feb 18, 2020

Conversation

@jonas-schievink (Contributor) commented Jan 26, 2020

Summary

Change how we model multi-core MCUs in the following ways:

  • Stop using Send and Sync to model anything related to cross-core
    interactions since those traits are unfit for the purpose.
  • Declare that Send is not sufficient to transfer resources between cores,
    since memory addresses can have different meanings on different cores.
  • Mutexes based on critical sections are declared sound as a consequence of
    the above: they can only turn Send data Sync, and neither trait permits
    cross-core interactions anymore.
  • Punt on how to deal with cross-core communication and data sharing for now,
    and leave it to the ecosystem.
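To make the third bullet concrete, here is a minimal sketch of such a critical-section mutex (illustrative names, loosely modeled on the `bare-metal` crate's `Mutex`, not its exact API): it can only turn `Send` data `Sync`, and under this RFC neither trait says anything about other cores anymore.

```rust
use core::cell::{RefCell, RefMut};

// Zero-sized token proving that interrupts are disabled on *this* core.
pub struct CriticalSection(());

impl CriticalSection {
    // Safety: caller must guarantee interrupts are disabled on this core
    // for the lifetime of the token.
    pub unsafe fn new() -> Self {
        CriticalSection(())
    }
}

// A mutex that grants access only inside a critical section.
pub struct Mutex<T> {
    inner: RefCell<T>,
}

impl<T> Mutex<T> {
    pub const fn new(value: T) -> Self {
        Mutex { inner: RefCell::new(value) }
    }

    // Access requires proof of a critical section on the current core.
    pub fn borrow_mut<'cs>(&'cs self, _cs: &'cs CriticalSection) -> RefMut<'cs, T> {
        self.inner.borrow_mut()
    }
}

// This is exactly "turning Send data Sync": sound on a single core,
// because Sync no longer implies cross-core shareability.
unsafe impl<T: Send> Sync for Mutex<T> {}
```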

Rendered

@jonas-schievink jonas-schievink requested review from dylanmckay, jcsoo and a team as code owners January 26, 2020 15:18
@jamesmunns (Member)

I’ll ask one possibly naive question:

Why do we need a separate model to represent multi-core behavior in bare metal programs, when this is not necessary for desktop CPUs with multiple cores? Is this due to the operating system abstracting away the reality of disjoint sets of resources between CPU cores in hosted environments?

For microcontrollers, we’ve taken the metaphor that “threads” can mean:

  • The main thread of execution, entered via main
  • threads of execution originating from Interrupts/Exceptions, including RTFM’s tasks
  • perhaps the operation of software threads, provided by an RTOS like Tock
  • maaaaaybe the operations of DMA transactions, though those behave without code

On desktop PCs, Rust makes no distinction between moving data to threads on the same CPU/core and moving it to threads on another CPU/core.

Is this a model we should be aiming to upstream to Rust proper, perhaps as an uncommon but existent component of core? Or is it only relevant where the context details of actual execution matters, and should live within the embedded domain?

@jonas-schievink (Contributor, Author)

Why do we need a separate model to represent multi-core behavior in bare metal programs, when this is not necessary for desktop CPUs with multiple cores? Is this due to the operating system abstracting away the reality of disjoint sets of resources between CPU cores in hosted environments?

Correct. While APIs exist that let threads avoid sharing resources, and let processes share resources, the "default" is that all threads within a program share all of the program's resources, and this is the only use case Send and Sync are intended to model.

Is this a model we should be aiming to upstream to Rust proper, perhaps as an uncommon but existent component of core? Or is it only relevant where the context details of actual execution matters, and should live within the embedded domain?

For now, I don't think this is of much general interest. The proposed traits are very embedded-specific and I don't see how they can be made more general (nor do I think they should). While there's certainly a use case for making shared memory and IPC in general safe to use, that would likely require different traits specific to the exposed mechanism.
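As a rough illustration of how embedded-specific such traits would be, the proposed markers might be sketched like this (hypothetical: the impls and the `hand_off` function are invented here for illustration, and the traits were later dropped from the RFC):

```rust
// Hypothetical marker traits as proposed in earlier drafts of this RFC.
// Implementing them asserts that a value keeps its meaning when it
// crosses a core boundary.
pub unsafe trait CoreSend {}
pub unsafe trait CoreSync {}

// Plain integers mean the same thing on every core that can see the
// memory holding them, so they could implement both markers.
unsafe impl CoreSend for u32 {}
unsafe impl CoreSync for u32 {}

// A cross-core transport would bound on `CoreSend` instead of `Send`.
// `hand_off` is a stand-in for e.g. a hardware mailbox write/read.
pub fn hand_off<T: CoreSend>(value: T) -> T {
    value
}
```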

@Amanieu commented Jan 26, 2020

I think that this RFC is missing a key section: what types will implement CoreSend/CoreSync, and how do you expect it to be used? Some examples would help here.

@adamgreig (Member)

I like this RFC: it keeps the common case (single core) simple and correct for existing and future code. We could still use Send and Sync with future RTOSes that correctly handle multiple threads on one or more cores. And for today's multi-core programs, treating them as separate processes that need some form of IPC to communicate seems to map well onto hosted Rust semantics.

My understanding is that bare-metal multi-core programs already have distinct entry points for each core, often with distinct binaries for each core too? So treating them as separate processes seems fine. We'll definitely want some sort of IPC for sharing data between those cores, but it doesn't have to be the same primitives we use for single-core synchronisation.

I think this RFC would also work without CoreSync and CoreSend at all; we could have one RFC to say "multi-core bare metal should be considered multiple processes for Send and Sync purposes" which fixes our current soundness issue and sets out a useful single-core foundation, and then a separate, more detailed RFC to introduce CoreSync and CoreSend, perhaps with some actual implementations of multi-core synchronisation.

@jonas-schievink (Contributor, Author)

I think that this RFC is missing a key section: what types will implement CoreSend/CoreSync, and how do you expect it to be used? Some examples would help here.

You're right, I've added some more text expanding on how they're expected to be used.

My understanding is that bare-metal multi-core programs already have distinct entry points for each core, often with distinct binaries for each core too?

Yeah, you can either build a single image (SMP) or distinct images that can opt into sharing certain statics (AMP). The first case is unlikely to be supported though (mostly because it's really difficult to make cortex_m::Peripherals::take() work on all cores). I did add some text explaining my reasoning for why this proposal is at least safe when building an SMP app though.
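A simplified sketch of why a global `Peripherals::take()` is hard to make work on all cores of an SMP image (this is not necessarily the crate's actual implementation; it illustrates the singleton pattern with a bare atomic): the guard is a single static shared by every core, so only the first caller on any core ever succeeds, and the atomic's cross-core behavior may not even be well-defined on such hardware.

```rust
use core::sync::atomic::{AtomicBool, Ordering};

// Singleton handle to the core's peripherals.
pub struct Peripherals {
    _private: (),
}

// One guard for the whole image: in an SMP build this static is shared
// by all cores, so only the first `take()` on *any* core returns Some.
static TAKEN: AtomicBool = AtomicBool::new(false);

impl Peripherals {
    pub fn take() -> Option<Peripherals> {
        if TAKEN.swap(true, Ordering::SeqCst) {
            None // already taken, possibly by another core
        } else {
            Some(Peripherals { _private: () })
        }
    }
}
```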

I think this RFC would also work without CoreSync and CoreSend at all

That's true. I've added that to the alternatives.

Now, because `S` implements `Sync`, `&S` will implement `Send`. This is simply
the language's definition of these traits and nothing we can change. Now a big
problem can arise when transferring `&S` across core boundaries: Cores can have
private memory regions that are not accessible by other cores. If `T: Send` is
@andre-richter (Member) commented Jan 26, 2020

For the purpose of reading this RFC fluently, I would put one of the more popular cases as an example right after this sentence, or in parentheses. I had to stop here and look ahead to see what exactly was meant by this.

[summary]: #summary

Change how we model multi-core MCUs in the following ways:

Member

A notice here that this RFC is about Asymmetric MP and not SMP would be helpful to set the context before reading.

Contributor (Author)

It is about both, since it is independent of the application model. Basically it just spells out a few facts about multi-core MCUs in general.

@andre-richter (Member) commented Jan 26, 2020

Is this a model we should be aiming to upstream to Rust proper, perhaps as an uncommon but existent component of core? Or is it only relevant where the context details of actual execution matters, and should live within the embedded domain?

For now, I don't think this is of much general interest. The proposed traits are very embedded-specific and I don't see how they can be made more general (nor do I think they should). While there's certainly a use case for making shared memory and IPC in general safe to use, that would likely require different traits specific to the exposed mechanism.

While reading this, I had the same thought as @jamesmunns. One of THE key features of compiling a Rust program is that the compiler stops you from sharing or sending data across thread boundaries. Unless you satisfy the compiler wherever it expects to see Send/Sync, you won't get a binary.

The RFC addresses important issues, but right now, compiler enforcement will only kick in where the traits are used as they should be. Users of libraries that implement what the RFC demands will be reminded when they use the API wrongly. But it won't stop them from using Send/Sync incorrectly in other places in the program.

Just a wild thought, but maybe a good companion to this RFC would be some global that "disables" Send/Sync for a specific memory region.

// Private memory regions live here
#![amp_range(0x8000-0x9000)]

No idea if it's possible to have something like that, though.

@jonas-schievink (Contributor, Author)

The RFC addresses important issues, but right now, compiler enforcement will only kick in where the traits are used as they should be. Users of libraries that implement what the RFC demands will be reminded when they use the API wrongly. But it won't stop them from using Send/Sync incorrectly in other places in the program.

I'm not completely sure what you're referring to here. Do you mean someone could write a library that provides an incorrect unsafe abstraction by taking any T instead of T: Send? Or that someone writes an incorrect unsafe impl Send for X {}?

Just a wild thought, but maybe a good companion to this RFC would be some global that "disables" Send/Sync for a specific memory region.

The compiler doesn't know anything about where variables go, only the linker does (and we heavily rely on correctly written linker scripts even today). I'm not sure how this would work since those traits are implemented for types, not type-address-tuples. Send and Sync are just ordinary traits that can either be implemented or not. Do you suggest that the compiler shouldn't treat statics placed in private regions as implementing Sync?

@cr1901 commented Jan 26, 2020

Declare that Send is not sufficient to transfer resources between cores,
since memory addresses can have different meanings on different cores.

Correct. While APIs exist that let threads avoid sharing resources, and let processes share resources, the "default" is that all threads within a program share all of the program's resources, and this is the only use case Send and Sync are intended to model.

Would it be accurate to summarize the above two texts as the following?:

Send and Sync are only appropriate for multiple cores if code outside your application (be it a runtime or carefully implemented startup code/exception) contains a mechanism (such as an MMU or perhaps PIE) to make all globally scoped resources visible to each core.

@andre-richter (Member) commented Jan 26, 2020

The compiler doesn't know anything about where variables go, only the linker does (and we heavily rely on correctly written linker scripts even today). I'm not sure how this would work since those traits are implemented for types, not type-address-tuples. Send and Sync are just ordinary traits that can either be implemented or not. Do you suggest that the compiler shouldn't treat statics placed in private regions as implementing Sync?

Wasn't my most well-thought-out comment, true. Please disregard most of it.

What about the creation of raw pointers? Could the compiler theoretically make use of the address used for casting (e.g. to check whether it will point to a private region)?

@jonas-schievink (Contributor, Author) commented Jan 26, 2020

What about creation of raw pointers. Could the compiler theoretically make use of the address used for casting?

Well it can always insert a runtime check, but so could any hypothetical multi-core MCU library with some sort of SharedRef<'a, T> wrapper that's CoreSend if T: Sync. That's precisely the sort of experimentation I'd like to see in ecosystem crates.

EDIT: Actually this doesn't work if T contains any references. But the compiler can't help there either.
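The SharedRef idea with a runtime check could be sketched roughly as follows (entirely hypothetical: the region bounds, names, and API are invented for illustration; the `T: 'static` bound sidesteps the "T contains references" problem mentioned in the edit, since interior references might point at core-private memory):

```rust
// Hypothetical bounds of a RAM region mapped identically on all cores.
const SHARED_START: usize = 0x2000_0000;
const SHARED_END: usize = 0x2004_0000;

// True if an address falls inside the shared region.
pub fn in_shared_region(addr: usize) -> bool {
    (SHARED_START..SHARED_END).contains(&addr)
}

// A reference proven (at runtime) to point into shared memory; a crate
// could then mark it cross-core-sendable when `T: Sync`.
pub struct SharedRef<'a, T>(&'a T);

impl<'a, T: Sync + 'static> SharedRef<'a, T> {
    // Runtime check: only references into the shared region qualify.
    pub fn new(r: &'a T) -> Option<Self> {
        if in_shared_region(r as *const T as usize) {
            Some(SharedRef(r))
        } else {
            None
        }
    }

    pub fn get(&self) -> &'a T {
        self.0
    }
}
```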

@jonas-schievink (Contributor, Author)

Send and Sync are only appropriate for multiple cores if code outside your application (be it a runtime or carefully implemented startup code/exception) contains a mechanism (such as an MMU or perhaps PIE) to make all globally scoped resources visible to each core.

That sounds about right, yeah. The mechanism can also choose to only make a subset of all global resources available, of course.

@therealprof (Contributor)

My understanding is that bare-metal multi-core programs already have distinct entry points for each core, often with distinct binaries for each core too?

Not necessarily. I've learned from the SiFive folks that their RISC-V cores all start from the same entry point, and the bootstrapping code running on each core is in charge of figuring out which core it is currently running on and acting accordingly, e.g. by disabling itself or by jumping into the core-specific code. I would presume there are quite a few implementations working like that.

@therealprof (Contributor)

@jonas-schievink

Aren't

Each core can have private peripherals that may be Sendable between interrupt handlers, but that do not exist (or, worse, something else exists at their address) on other cores of the system.

and

Something much closer to a multi-threaded program is one that makes use of interrupt handlers: In both cases, all global resources exist precisely once, and may be shared and exchanged between interrupt handlers or threads (provided they implement Send/Sync) with no risk of the resources changing their meaning when sent.

saying the exact opposite?

@jamesmunns (Member)

@therealprof I believe in those quotes the former was describing multi-core, similar to disjoint processes; while the latter was describing a single core with interrupts, similar to disjoint threads.

@jonas-schievink (Contributor, Author)

@therealprof What James said; I've pushed a commit that should clarify that.

Co-Authored-By: Hanno Braun <[email protected]>
@Disasm (Member) commented Feb 11, 2020

I'd like to suggest considering the following cases:

  • Different protection levels (TrustZone in ARM, M/S/U modes in RISC-V).
    These may limit the accessible memory as well as the subset of the
    peripherals. What markers should these "protected" peripherals have in
    this case?
  • Different addressing modes (physical/virtual memory). Obviously, an object
    cannot be Sync between different addressing modes, even on the same core.
  • The same firmware can be loaded into a core-local memory region as well as
    into global memory. What markers should objects constructed by this
    firmware have?

@thejpster (Contributor)

I'd personally rename CoreSend/CoreSync to MultiCoreSend/MultiCoreSync, but otherwise I'd take this over #388.

@almindor (Contributor)

I'd like to suggest considering the following cases:

  • Different protection levels (TrustZone in ARM, M/S/U modes in RISC-V).
    These may limit the accessible memory as well as the subset of the
    peripherals. What markers should these "protected" peripherals have in
    this case?
  • Different addressing modes (physical/virtual memory). Obviously, an object
    cannot be Sync between different addressing modes, even on the same core.
  • The same firmware can be loaded into a core-local memory region as well as
    into global memory. What markers should objects constructed by this
    firmware have?

It seems to me that ultimately Rust should support something like MarkedPin<Marker, T>, which would require translation where possible (e.g. core1-local memory to core2-local memory, if such a case is permitted) and disallow any access otherwise. This needs further research, though; the use cases and combinations are pretty varied.
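A very rough sketch of what such a marker-typed wrapper could look like (hypothetical; nothing like MarkedPin exists today, and real translation would involve re-mapping or copying between hardware regions rather than the plain copy shown here):

```rust
use core::marker::PhantomData;

// Hypothetical region/address-space markers.
pub struct Core0Local;
pub struct Core1Local;

// A value tagged with the memory region it is valid in.
pub struct Marked<M, T> {
    value: T,
    _region: PhantomData<M>,
}

impl<M, T> Marked<M, T> {
    pub fn new(value: T) -> Self {
        Marked { value, _region: PhantomData }
    }

    pub fn get(&self) -> &T {
        &self.value
    }

    // "Translation" to another region; a plain copy stands in for
    // whatever re-mapping the hardware would actually require, and
    // disallowed combinations would simply not provide this method.
    pub fn translate<N>(self) -> Marked<N, T>
    where
        T: Copy,
    {
        Marked { value: self.value, _region: PhantomData }
    }
}
```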

@jonas-schievink (Contributor, Author)

Updated the RFC to remove CoreSend/CoreSync, as per the discussion in the meeting yesterday.

@jamesmunns (Member)

As per our discussion yesterday, this RFC is now entering Final Comment Period, with the disposition to merge. As this issue has been open for more than two weeks already, I would like to merge this on 2020-02-18 (the date of our next meeting) if there are no blocking objections.

@rust-embedded/all, please provide a final review, and approve or reject this RFC.

@jamesmunns (Member)

bors r+

Thanks all!

bors bot added a commit that referenced this pull request Feb 18, 2020
419: Address Multi-Core Soundness by abolishing Send and Sync r=jamesmunns a=jonas-schievink

# Summary

Change how we model multi-core MCUs in the following ways:

* Stop using `Send` and `Sync` to model anything related to cross-core
  interactions since those traits are unfit for the purpose.
* Declare that `Send` is not sufficient to transfer resources between cores,
  since memory addresses can have different meanings on different cores.
* Mutexes based on critical sections become declared sound due to the above:
  They can only turn `Send` data `Sync`, but neither allows cross-core
  interactions anymore.
* Punt on how to deal with cross-core communication and data sharing for now,
  and leave it to the ecosystem.

[Rendered](https://github.com/jonas-schievink/wg/blob/multi-core-soundness/rfcs/0000-multi-core-soundness.md)

 [`bare-metal`]: https://github.com/rust-embedded/bare-metal

Co-authored-by: Jonas Schievink <[email protected]>
@bors (bot) commented Feb 18, 2020

Build succeeded

@bors bors bot merged commit 4940d76 into rust-embedded:master Feb 18, 2020
@jonas-schievink jonas-schievink deleted the multi-core-soundness branch February 18, 2020 19:28