
Commit b39cd86

Merge pull request #2580 from SimonSapin/ptr-meta
RFC: Pointer metadata & VTable
2 parents 513198d + 50b567b commit b39cd86

1 file changed

+397
-0
lines changed

text/2580-ptr-meta.md

@@ -0,0 +1,397 @@
1+
- Feature Name: `ptr-meta`
2+
- Start Date: 2018-10-26
3+
- RFC PR: https://github.com/rust-lang/rfcs/pull/2580
4+
- Rust Issue: https://github.com/rust-lang/rust/issues/81513
5+
6+
# Summary
7+
[summary]: #summary
8+
9+
Add generic APIs that allow manipulating the metadata of fat pointers:
10+
11+
* Naming the metadata’s type (as an associated type)
12+
* Extracting metadata from a pointer
13+
* Reconstructing a pointer from a data pointer and metadata
14+
* Representing vtables, the metadata for trait objects, as a type with some limited API
15+
16+
This RFC does *not* propose a mechanism for defining custom dynamically-sized types,
17+
but tries to stay compatible with future proposals that do.
18+
19+
20+
# Background
21+
[background]: #background
22+
23+
Typical high-level code doesn’t need to worry about fat pointers,
24+
a reference `&Foo` “just works” whether or not `Foo` is a DST.
25+
But unsafe code such as a custom collection library may want to access a fat pointer’s
26+
components separately.
27+
28+
In Rust 1.11 we *removed* a [`std::raw::Repr`] trait and a [`std::raw::Slice`] type
29+
from the standard library.
30+
`Slice` could be `transmute`d to a `&[U]` or `&mut [U]` reference to a slice
31+
as it was guaranteed to have the same memory layout.
32+
This was replaced with more specific and less wildly unsafe
33+
`std::slice::from_raw_parts` and `std::slice::from_raw_parts_mut` functions,
34+
together with `as_ptr` and `len` methods that extract each fat pointer component separately.
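
As a hedged illustration of the stable slice APIs mentioned above, the sketch below splits a slice reference into its data pointer and length and then reassembles it. The helper names `split_slice`, `join_slice`, and `round_trip` are hypothetical and exist only for this example.

```rust
use std::slice;

/// Splits a slice reference into its two fat-pointer components
/// (hypothetical helper, not a standard library function).
fn split_slice<T>(s: &[T]) -> (*const T, usize) {
    (s.as_ptr(), s.len())
}

/// Reassembles a slice from a data pointer and a length
/// (hypothetical helper, not a standard library function).
///
/// # Safety
/// `data` must point to `len` initialized, properly aligned values of `T`
/// that remain valid and unmodified for the lifetime `'a`.
unsafe fn join_slice<'a, T>(data: *const T, len: usize) -> &'a [T] {
    // SAFETY: the caller upholds the requirements documented above.
    unsafe { slice::from_raw_parts(data, len) }
}

/// Round-trips a slice through its separate components.
fn round_trip(values: &[u32]) -> Vec<u32> {
    let (data, len) = split_slice(values);
    // SAFETY: `data` and `len` come from `values`, which is still borrowed.
    let rebuilt = unsafe { join_slice(data, len) };
    rebuilt.to_vec()
}
```

No comparable pair of stable functions exists for trait objects, which is the gap this RFC addresses.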
35+
36+
For trait objects, however, we still have only an unstable `std::raw::TraitObject` type
37+
that can only be used with `transmute`:
38+
39+
```rust
40+
#[repr(C)]
41+
pub struct TraitObject {
42+
pub data: *mut (),
43+
pub vtable: *mut (),
44+
}
45+
```
46+
47+
[`std::raw::Repr`]: https://doc.rust-lang.org/1.10.0/std/raw/trait.Repr.html
48+
[`std::raw::Slice`]: https://doc.rust-lang.org/1.10.0/std/raw/struct.Slice.html
49+
[`std::raw::TraitObject`]: https://doc.rust-lang.org/1.30.0/std/raw/struct.TraitObject.html
50+
51+
52+
# Motivation
53+
[motivation]: #motivation
54+
55+
We now have APIs in Stable Rust to let unsafe code freely and reliably manipulate slices,
56+
accessing the separate components of a fat pointer and then re-assembling them.
57+
However `std::raw::TraitObject` is still unstable,
58+
and it’s probably not the style of API that we’ll want to stabilize
59+
as it encourages dangerous `transmute` calls.
60+
This is a “hole” in available APIs to manipulate existing Rust types.
61+
62+
For example [this library][lib] stores multiple trait objects of varying size
63+
in contiguous memory together with their vtable pointers,
64+
and during iteration recreates fat pointers from separate data and vtable pointers.
65+
66+
The new `Thin` trait alias also allows extending to [extern types] some APIs
67+
that were unnecessarily restricted to `Sized` types
68+
because there was previously no way to express pointer-thinness in generic code.
69+
70+
[lib]: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2015&gist=bbeecccc025f5a7a0ad06086678e13f3
71+
72+
73+
# Guide-level explanation
74+
[guide-level-explanation]: #guide-level-explanation
75+
76+
77+
Let’s build a generic type similar to `Box<dyn Trait>`,
78+
but where the vtable pointer is stored in heap memory next to the value
79+
so that the pointer is thin.
80+
First, let’s get some boilerplate out of the way:
81+
82+
```rust
83+
use std::marker::{PhantomData, Unsize};
84+
use std::ptr::{self, DynMetadata, Pointee};
85+
86+
trait DynTrait<Dyn> = Pointee<Metadata=DynMetadata<Dyn>>;
87+
88+
pub struct ThinBox<Dyn: ?Sized + DynTrait<Dyn>> {
89+
ptr: ptr::NonNull<WithMeta<Dyn, ()>>,
90+
phantom: PhantomData<Dyn>,
91+
}
92+
93+
#[repr(C)]
94+
struct WithMeta<Dyn: ?Sized, T: ?Sized> {
95+
vtable: DynMetadata<Dyn>,
96+
value: T,
97+
}
98+
```
99+
100+
Since [unsized rvalues] are not implemented yet,
101+
our constructor is going to “unsize” from a concrete type that implements our trait.
102+
The `Unsize` bound ensures we can cast from `&S` to a `&Dyn` trait object
103+
and construct the appropriate metadata.
104+
105+
[unsized rvalues]: https://github.com/rust-lang/rust/issues/48055
106+
107+
We let `Box` do the memory layout computation and allocation:
108+
109+
```rust
110+
impl<Dyn: ?Sized + DynTrait<Dyn>> ThinBox<Dyn> {
111+
pub fn new_unsize<S>(value: S) -> Self where S: Unsize<Dyn> {
112+
let vtable = ptr::metadata(&value as &Dyn);
113+
let ptr = ptr::NonNull::from(Box::leak(Box::new(WithMeta { vtable, value }))).cast();
114+
ThinBox { ptr, phantom: PhantomData }
115+
}
116+
}
117+
```
118+
119+
(Another possible constructor is `pub fn new_copy(value: &Dyn) where Dyn: Copy`,
120+
but it would involve slightly more code.)
121+
122+
Accessing the value requires knowing its alignment:
123+
124+
```rust
125+
impl<Dyn: ?Sized + DynTrait<Dyn>> ThinBox<Dyn> {
126+
fn data_ptr(&self) -> *mut () {
127+
unsafe {
128+
let offset = std::mem::size_of::<DynMetadata<Dyn>>();
129+
let value_align = self.ptr.as_ref().vtable.align();
130+
let offset = align_up_to(offset, value_align);
131+
(self.ptr.as_ptr() as *mut u8).add(offset) as *mut ()
132+
}
133+
}
134+
}
135+
136+
/// <https://github.com/rust-lang/rust/blob/1.30.0/src/libcore/alloc.rs#L199-L219>
137+
fn align_up_to(offset: usize, align: usize) -> usize {
138+
offset.wrapping_add(align).wrapping_sub(1) & !align.wrapping_sub(1)
139+
}
140+
141+
// Similarly Deref
142+
impl<Dyn: ?Sized + DynTrait<Dyn>> DerefMut for ThinBox<Dyn> {
143+
fn deref_mut(&mut self) -> &mut Dyn {
144+
unsafe {
145+
&mut *<*mut Dyn>::from_raw_parts(self.data_ptr(), self.ptr.as_ref().vtable)
146+
}
147+
}
148+
}
149+
```
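
To make the offset arithmetic above concrete, here is a small stable sketch of the same rounding helper applied to a few values. It assumes `align` is a power of two, which holds for the alignment of every Rust type; `value_offset` is a hypothetical wrapper added only for illustration.

```rust
/// Rounds `offset` up to the next multiple of `align`.
/// Assumes `align` is a power of two.
fn align_up_to(offset: usize, align: usize) -> usize {
    debug_assert!(align.is_power_of_two());
    offset.wrapping_add(align).wrapping_sub(1) & !align.wrapping_sub(1)
}

/// Byte offset of a value with alignment `value_align` that is stored
/// directly after a header of `header_size` bytes (hypothetical helper).
fn value_offset(header_size: usize, value_align: usize) -> usize {
    align_up_to(header_size, value_align)
}
```

For an 8-byte header, a value with alignment 1 or 8 starts at offset 8, while a value with alignment 16 starts at offset 16. This is why `data_ptr` has to read the alignment from the vtable at run time.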
150+
151+
Finally, in `Drop` we may not be able to take advantage of `Box` again
152+
since the original `Sized` type `S` is not statically known at this point.
153+
154+
```rust
155+
impl<Dyn: ?Sized + DynTrait<Dyn>> Drop for ThinBox<Dyn> {
156+
fn drop(&mut self) {
157+
unsafe {
158+
let layout = /* left as an exercise for the reader */;
159+
ptr::drop_in_place::<Dyn>(&mut **self);
160+
alloc::dealloc(self.ptr.as_ptr().cast(), layout);
161+
}
162+
}
163+
}
164+
```
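
The layout computation is left open above. One possible approach, sketched here on stable Rust, combines the header layout with the value layout through `Layout::extend`. This is an assumption about how the exercise could be solved, not the RFC's prescribed answer: the function name `header_and_value_layout` is hypothetical, and the value's size and alignment are passed as plain integers, whereas in `ThinBox` they would come from the `size` and `align` methods of `DynMetadata`.

```rust
use std::alloc::{Layout, LayoutError};

/// Computes the layout of a `#[repr(C)]` "header followed by value"
/// allocation, together with the byte offset of the value inside it.
/// (Hypothetical helper for illustration.)
fn header_and_value_layout(
    header: Layout,
    value_size: usize,
    value_align: usize,
) -> Result<(Layout, usize), LayoutError> {
    // Fails if `value_align` is not a power of two or the size overflows.
    let value = Layout::from_size_align(value_size, value_align)?;
    // `extend` inserts the padding needed between the header and the value.
    let (combined, value_offset) = header.extend(value)?;
    // `extend` adds no trailing padding; `pad_to_align` rounds the size up
    // to the combined alignment, as a `repr(C)` struct would.
    Ok((combined.pad_to_align(), value_offset))
}
```

The returned offset matches what `align_up_to` computes in `data_ptr`, and the returned layout is the one that must be passed to `dealloc`, since it has to equal the layout `Box` used for the original allocation.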
165+
166+
167+
# Reference-level explanation
168+
[reference-level-explanation]: #reference-level-explanation
169+
170+
The APIs whose full definition is found below
171+
are added to `core::ptr` and re-exported in `std::ptr`:
172+
173+
* A `Pointee` trait,
174+
implemented automatically for all types
175+
(similar to how `Sized` and `Unsize` are implemented automatically).
176+
* A `Thin` [trait alias].
177+
If this RFC is implemented before trait aliases are,
178+
uses of `Thin` should be replaced with its definition.
179+
* A `metadata` free function
180+
* A `DynMetadata` struct
181+
* A `from_raw_parts` constructor for each of `*const T`, `*mut T`, and `NonNull<T>`.
182+
183+
The bounds on the `null()` and `null_mut()` functions in that same module
184+
as well as the `NonNull::dangling` constructor
185+
are changed from (implicit) `T: Sized` to `T: ?Sized + Thin`.
186+
Similarly for the `U` type parameter of the `NonNull::cast` method.
187+
This enables using those functions with [extern types].
188+
189+
The `Pointee` trait is implemented for all types.
190+
This can be relied on in generic code,
191+
even if a type parameter `T` does not have an explicit `T: Pointee` bound.
192+
This is similar to how the `Any` trait can be used without an explicit `T: Any` bound,
193+
only `T: 'static`, because a blanket `impl<T: 'static> Any for T {…}` exists.
194+
(Except that `Pointee` is not restricted to `'static`.)
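
The `Any` analogy can be checked on stable Rust. In the sketch below, the hypothetical function `is_string` only declares `T: 'static`, yet it can use `T` as an `Any` implementor because of the blanket impl.

```rust
use std::any::Any;

/// Reports whether `value` is a `String`.
/// Note that there is no explicit `T: Any` bound: `T: 'static` is enough
/// because of the blanket `impl<T: 'static + ?Sized> Any for T`.
fn is_string<T: 'static>(value: &T) -> bool {
    let erased: &dyn Any = value;
    erased.is::<String>()
}
```

Under this RFC, `Pointee` would be usable in the same implicit way, and without the `'static` restriction.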
195+
196+
For the purpose of pointer casts being allowed by the `as` operator,
197+
a pointer to `T` is considered to be thin if `T: Thin` instead of `T: Sized`.
198+
This similarly includes extern types.
199+
200+
`std::raw::TraitObject` and `std::raw` are deprecated and eventually removed.
201+
202+
[trait alias]: https://github.com/rust-lang/rust/issues/41517
203+
[extern types]: https://github.com/rust-lang/rust/issues/43467
204+
205+
```rust
206+
/// This trait is automatically implemented for every type.
207+
///
208+
/// Raw pointer types and reference types in Rust can be thought of as made of two parts:
209+
/// a data pointer that contains the memory address of the value, and some metadata.
210+
///
211+
/// For statically-sized types (that implement the `Sized` trait)
212+
/// as well as for `extern` types,
213+
/// pointers are said to be “thin”: metadata is zero-sized and its type is `()`.
214+
///
215+
/// Pointers to [dynamically-sized types][dst] are said to be “fat”
216+
/// and have non-zero-sized metadata:
217+
///
218+
/// * For structs whose last field is a DST, metadata is the metadata for the last field
219+
/// * For the `str` type, metadata is the length in bytes as `usize`
220+
/// * For slice types like `[T]`, metadata is the length in items as `usize`
221+
/// * For trait objects like `dyn SomeTrait`, metadata is [`DynMetadata<Self>`][DynMetadata]
222+
/// (e.g. `DynMetadata<dyn SomeTrait>`).
223+
///
224+
/// In the future, the Rust language may gain new kinds of types
225+
/// that have different pointer metadata.
226+
///
227+
/// Pointer metadata can be extracted from a pointer or reference with the [`metadata`] function.
228+
/// The data pointer can be extracted by casting a (fat) pointer
229+
/// to a (thin) pointer to a `Sized` type with the `as` operator,
230+
/// for example `(x: &dyn SomeTrait) as *const dyn SomeTrait as *const ()`
231+
/// or `(x: *const dyn SomeTrait).cast::<()>()`.
232+
///
233+
/// [dst]: https://doc.rust-lang.org/nomicon/exotic-sizes.html#dynamically-sized-types-dsts
234+
#[lang = "pointee"]
235+
pub trait Pointee {
236+
/// The type for metadata in pointers and references to `Self`.
237+
type Metadata: Copy + Send + Sync + Ord + Hash + Unpin;
238+
}
239+
240+
/// Pointers to types implementing this trait alias are “thin”:
241+
///
242+
/// ```rust
243+
/// fn this_never_panics<T: std::ptr::Thin>() {
244+
/// assert_eq!(std::mem::size_of::<&T>(), std::mem::size_of::<usize>())
245+
/// }
246+
/// ```
247+
pub trait Thin = Pointee<Metadata=()>;
248+
249+
/// Extract the metadata component of a pointer.
250+
///
251+
/// Values of type `*mut T`, `&T`, or `&mut T` can be passed directly to this function
252+
/// as they implicitly coerce to `*const T`.
253+
/// For example:
254+
///
255+
/// ```
256+
/// assert_eq!(std::ptr::metadata("foo"), 3_usize);
257+
/// ```
258+
///
259+
/// Note that the data component of a (fat) pointer can be extracted by casting
260+
/// to a (thin) pointer to any `Sized` type:
261+
///
262+
/// ```
263+
/// # trait SomeTrait {}
264+
/// # fn example(something: &dyn SomeTrait) {
265+
/// let object: &dyn SomeTrait = something;
266+
/// let data_ptr = object as *const dyn SomeTrait as *const ();
267+
/// # }
268+
/// ```
269+
pub fn metadata<T: ?Sized>(ptr: *const T) -> <T as Pointee>::Metadata {…}
270+
271+
impl<T: ?Sized> *const T {
272+
pub fn from_raw_parts(data: *const (), meta: <T as Pointee>::Metadata) -> Self {…}
273+
}
274+
275+
impl<T: ?Sized> *mut T {
276+
pub fn from_raw_parts(data: *mut (), meta: <T as Pointee>::Metadata) -> Self {…}
277+
}
278+
279+
impl<T: ?Sized> NonNull<T> {
280+
pub fn from_raw_parts(data: NonNull<()>, meta: <T as Pointee>::Metadata) -> Self {
281+
unsafe {
282+
NonNull::new_unchecked(<*mut _>::from_raw_parts(data.as_ptr(), meta))
283+
}
284+
}
285+
}
286+
287+
/// The metadata for a `DynTrait = dyn SomeTrait` trait object type.
288+
///
289+
/// It is a pointer to a vtable (virtual call table)
290+
/// that represents all the necessary information
291+
/// to manipulate the concrete type stored inside a trait object.
292+
/// The vtable notably contains:
293+
///
294+
/// * type size
295+
/// * type alignment
296+
/// * a pointer to the type’s `drop_in_place` impl (may be a no-op for plain-old-data)
297+
/// * pointers to all the methods for the type’s implementation of the trait
298+
///
299+
/// Note that the first three are special because they’re necessary to allocate, drop,
300+
/// and deallocate any trait object.
301+
///
302+
/// It is possible to name this struct with a type parameter that is not a `dyn` trait object
303+
/// (for example `DynMetadata<u64>`) but not to obtain a meaningful value of that struct.
304+
#[derive(Copy, Clone)]
305+
pub struct DynMetadata<DynTrait: ?Sized> {
306+
// Private fields
307+
vtable_ptr: ptr::NonNull<()>,
308+
phantom: PhantomData<DynTrait>
309+
}
310+
311+
impl<DynTrait: ?Sized> DynMetadata<DynTrait> {
312+
/// Returns the size of the type associated with this vtable.
313+
pub fn size(self) -> usize { ... }
314+
315+
/// Returns the alignment of the type associated with this vtable.
316+
pub fn align(self) -> usize { ... }
317+
318+
/// Returns the size and alignment together as a `Layout`
319+
pub fn layout(self) -> alloc::Layout {
320+
unsafe {
321+
alloc::Layout::from_size_align_unchecked(self.size(), self.align())
322+
}
323+
}
324+
}
325+
```
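
The thin/fat distinction and the cast-based data pointer extraction described in the doc-comments above can already be observed on stable Rust. The sketch below is illustrative only: `SomeTrait`, `data_ptr`, and `pointer_sizes` are hypothetical names, and the two-word size of fat references reflects current compilers rather than a documented guarantee.

```rust
use std::mem::size_of;

trait SomeTrait {}
impl SomeTrait for u32 {}

/// Extracts the data pointer of a trait object by casting away its metadata.
fn data_ptr(object: &dyn SomeTrait) -> *const () {
    object as *const dyn SomeTrait as *const ()
}

/// Sizes in bytes of a thin reference, a slice reference,
/// and a trait object reference.
fn pointer_sizes() -> (usize, usize, usize) {
    (
        size_of::<&u8>(),
        size_of::<&[u8]>(),
        size_of::<&dyn SomeTrait>(),
    )
}
```

The data pointer obtained this way equals the address of the concrete value. What stable Rust lacks is the other half: a way to read the vtable metadata and to rebuild the fat pointer from the two parts.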
326+
327+
328+
# Rationale and alternatives
329+
[rationale-and-alternatives]: #rationale-and-alternatives
330+
331+
The status quo is that code (such as the library linked in [Motivation]) that requires this functionality
332+
needs to transmute to and from `std::raw::TraitObject`
333+
or a copy of it (to be compatible with Stable Rust).
334+
Additionally, in cases where constructing the data pointer
335+
requires knowing the alignment of the concrete type,
336+
a dangling pointer such as `0x8000_0000_usize as *mut ()` needs to be created.
337+
It is not clear whether `std::mem::align_of_val(&*ptr)` with `ptr: *const dyn SomeTrait`
338+
is Undefined Behavior with a dangling data pointer.
339+
340+
A [previous iteration][2579] of this RFC proposed a `DynTrait` trait
341+
that would only be implemented for trait objects like `dyn SomeTrait`.
342+
There would be no `Metadata` associated type, `DynMetadata` was hard-coded in the trait.
343+
In addition to being more general
344+
and (hopefully) more compatible with future custom DST proposals,
345+
this RFC resolves the question of what happens
346+
if trait objects with super-fat pointers with multiple vtable pointers are ever added.
347+
(Answer: they can use a different metadata type,
348+
possibly like `(DynMetadata<dyn Trait>, DynMetadata<dyn OtherTrait>)`.)
349+
350+
[2579]: https://github.com/rust-lang/rfcs/pull/2579
351+
352+
353+
# Prior art
354+
[prior-art]: #prior-art
355+
356+
A previous [Custom Dynamically-Sized Types][cdst] RFC was postponed.
357+
[Internals thread #6663][6663] took the same ideas
358+
and was even more ambitious in being very general.
359+
Except for `DynMetadata`’s methods, this RFC proposes a subset of what that thread did.
360+
361+
[cdst]: https://github.com/rust-lang/rfcs/pull/1524
362+
[6663]: https://internals.rust-lang.org/t/pre-erfc-lets-fix-dsts/6663
363+
364+
365+
# Unresolved questions
366+
[unresolved-questions]: #unresolved-questions
367+
368+
* The name of `Pointee`. [Internals thread #6663][6663] used `Referent`.
369+
370+
* The location of `DynMetadata`. Is another module more appropriate than `std::ptr`?
371+
372+
* Should `DynMetadata` not have a type parameter?
373+
This might reduce monomorphization cost,
374+
but would force that the size, alignment, and destruction pointers
375+
be in the same location (offset) for every vtable.
376+
But keeping them in the same location is probably desirable anyway to keep code size small.
377+
378+
* The name of `Thin`.
379+
This name is short and sweet but `T: Thin` suggests that `T` itself is thin,
380+
rather than pointers and references to `T`.
381+
382+
* The location of `Thin`. Better in `std::marker`?
383+
384+
* Should `Thin` be added as a supertrait of `Sized`?
385+
Or could it ever make sense to have fat pointers to statically-sized types?
386+
387+
* Are there other generic standard library APIs like `ptr::null()`
388+
that have an (implicit) `T: Sized` bound that unnecessarily excludes extern types?
389+
390+
* Should `<*mut _>::from_raw_parts` and friends be `unsafe fn`s?
391+
392+
* API design: free functions vs. methods/constructors on `*mut _` and `*const _`?
393+
394+
* Add `into_raw_parts` that returns `(*const (), T::Metadata)`?
395+
Using the `cast` method to a `Sized` type to extract the address as a thin pointer
396+
is less discoverable.
397+
Possibly *instead* of the `metadata` function?
