| 1 | +- Feature Name: `ptr-meta` |
| 2 | +- Start Date: 2018-10-26 |
| 3 | +- RFC PR: https://github.com/rust-lang/rfcs/pull/2580 |
| 4 | +- Rust Issue: https://github.com/rust-lang/rust/issues/81513 |
| 5 | + |
| 6 | +# Summary |
| 7 | +[summary]: #summary |
| 8 | + |
| 9 | +Add generic APIs that allow manipulating the metadata of fat pointers: |
| 10 | + |
| 11 | +* Naming the metadata’s type (as an associated type) |
| 12 | +* Extracting metadata from a pointer |
| 13 | +* Reconstructing a pointer from a data pointer and metadata |
| 14 | +* Representing vtables, the metadata for trait objects, as a type with some limited API |
| 15 | + |
| 16 | +This RFC does *not* propose a mechanism for defining custom dynamically-sized types, |
| 17 | +but tries to stay compatible with future proposals that do. |
| 18 | + |
| 19 | + |
| 20 | +# Background |
| 21 | +[background]: #background |
| 22 | + |
| 23 | +Typical high-level code doesn’t need to worry about fat pointers: |
| 24 | +a reference `&Foo` “just works” whether or not `Foo` is a DST. |
| 25 | +But unsafe code such as a custom collection library may want to access a fat pointer’s |
| 26 | +components separately. |
| 27 | + |
| 28 | +In Rust 1.11 we *removed* a [`std::raw::Repr`] trait and a [`std::raw::Slice`] type |
| 29 | +from the standard library. |
| 30 | +`Slice` could be `transmute`d to a `&[U]` or `&mut [U]` reference to a slice |
| 31 | +as it was guaranteed to have the same memory layout. |
| 32 | +This was replaced with more specific and less wildly unsafe |
| 33 | +`std::slice::from_raw_parts` and `std::slice::from_raw_parts_mut` functions, |
| 34 | +together with `as_ptr` and `len` methods that extract each fat pointer component separately. |
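
As a minimal sketch of that slice round trip on Stable Rust: the helper names `split_slice`, `join_slice`, and `round_trip` are hypothetical and exist only for illustration; only `as_ptr`, `len`, and `std::slice::from_raw_parts` are standard library APIs.

```rust
use std::slice;

/// Splits a slice reference into its two fat-pointer components:
/// the data pointer and the length (in elements).
fn split_slice<T>(s: &[T]) -> (*const T, usize) {
    (s.as_ptr(), s.len())
}

/// Rebuilds a slice from the components returned by `split_slice`.
///
/// # Safety
/// `data` must point to `len` initialized, properly aligned values of `T`
/// that stay valid and are not mutated for the whole lifetime `'a`.
unsafe fn join_slice<'a, T>(data: *const T, len: usize) -> &'a [T] {
    unsafe { slice::from_raw_parts(data, len) }
}

/// Round-trips a slice through its raw parts.
fn round_trip(values: &[u32]) -> &[u32] {
    let (data, len) = split_slice(values);
    // SAFETY: both components come from `values`,
    // which stays borrowed for the returned lifetime.
    unsafe { join_slice(data, len) }
}
```

Calling `round_trip(&[1, 2, 3])` yields a slice equal to the input. No `transmute` is involved at any point, which is the property this RFC wants to extend to trait objects.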
| 35 | + |
| 36 | +For trait objects, however, we still only have an unstable `std::raw::TraitObject` type |
| 37 | +that can only be used with `transmute`: |
| 38 | + |
| 39 | +```rust |
| 40 | +#[repr(C)] |
| 41 | +pub struct TraitObject { |
| 42 | + pub data: *mut (), |
| 43 | + pub vtable: *mut (), |
| 44 | +} |
| 45 | +``` |
| 46 | + |
| 47 | +[`std::raw::Repr`]: https://doc.rust-lang.org/1.10.0/std/raw/trait.Repr.html |
| 48 | +[`std::raw::Slice`]: https://doc.rust-lang.org/1.10.0/std/raw/struct.Slice.html |
| 49 | +[`std::raw::TraitObject`]: https://doc.rust-lang.org/1.30.0/std/raw/struct.TraitObject.html |
| 50 | + |
| 51 | + |
| 52 | +# Motivation |
| 53 | +[motivation]: #motivation |
| 54 | + |
| 55 | +We now have APIs in Stable Rust to let unsafe code freely and reliably manipulate slices, |
| 56 | +accessing the separate components of a fat pointer and then re-assembling them. |
| 57 | +However, `std::raw::TraitObject` is still unstable, |
| 58 | +and it’s probably not the style of API that we’ll want to stabilize |
| 59 | +as it encourages dangerous `transmute` calls. |
| 60 | +This is a “hole” in available APIs to manipulate existing Rust types. |
| 61 | + |
| 62 | +For example [this library][lib] stores multiple trait objects of varying size |
| 63 | +in contiguous memory together with their vtable pointers, |
| 64 | +and during iteration recreates fat pointers from separate data and vtable pointers. |
| 65 | + |
| 66 | +The new `Thin` trait alias also allows expanding to [extern types] some APIs |
| 67 | +that were unnecessarily restricted to `Sized` types |
| 68 | +because there was previously no way to express pointer-thinness in generic code. |
| 69 | + |
| 70 | +[lib]: https://play.rust-lang.org/?version=nightly&mode=debug&edition=2015&gist=bbeecccc025f5a7a0ad06086678e13f3 |
| 71 | + |
| 72 | + |
| 73 | +# Guide-level explanation |
| 74 | +[guide-level-explanation]: #guide-level-explanation |
| 75 | + |
| 76 | + |
| 77 | +Let’s build a generic type similar to `Box<dyn Trait>`, |
| 78 | +but where the vtable pointer is stored in heap memory next to the value |
| 79 | +so that the pointer is thin. |
| 80 | +First, let’s get some boilerplate out of the way: |
| 81 | + |
| 82 | +```rust |
| 83 | +use std::marker::{PhantomData, Unsize}; |
| 84 | +use std::ptr::{self, DynMetadata, NonNull, Pointee}; |
| 85 | + |
| 86 | +trait DynTrait<Dyn: ?Sized> = Pointee<Metadata=DynMetadata<Dyn>>; |
| 87 | + |
| 88 | +pub struct ThinBox<Dyn: ?Sized + DynTrait<Dyn>> { |
| 89 | + ptr: ptr::NonNull<WithMeta<Dyn, ()>>, |
| 90 | + phantom: PhantomData<Dyn>, |
| 91 | +} |
| 92 | + |
| 93 | +#[repr(C)] |
| 94 | +struct WithMeta<Dyn: ?Sized, T: ?Sized> { |
| 95 | + vtable: DynMetadata<Dyn>, |
| 96 | + value: T, |
| 97 | +} |
| 98 | +``` |
| 99 | + |
| 100 | +Since [unsized rvalues] are not implemented yet, |
| 101 | +our constructor is going to “unsize” from a concrete type that implements our trait. |
| 102 | +The `Unsize` bound ensures we can cast from `&S` to a `&Dyn` trait object |
| 103 | +and construct the appropriate metadata. |
| 104 | + |
| 105 | +[unsized rvalues]: https://github.com/rust-lang/rust/issues/48055 |
| 106 | + |
| 107 | +We let `Box` do the memory layout computation and allocation: |
| 108 | + |
| 109 | +```rust |
| 110 | +impl<Dyn: ?Sized + DynTrait<Dyn>> ThinBox<Dyn> { |
| 111 | + pub fn new_unsize<S>(value: S) -> Self where S: Unsize<Dyn> { |
| 112 | + let vtable = ptr::metadata(&value as &Dyn); |
| 113 | + let ptr = ptr::NonNull::from(Box::leak(Box::new(WithMeta { vtable, value }))).cast(); |
| 114 | + ThinBox { ptr, phantom: PhantomData } |
| 115 | + } |
| 116 | +} |
| 117 | +``` |
| 118 | + |
| 119 | +(Another possible constructor is `pub fn new_copy(value: &Dyn) where Dyn: Copy`, |
| 120 | +but it would involve slightly more code.) |
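
The allocation pattern used by the constructor (box the value, leak it, keep a type-erased `NonNull`) can be tried on Stable Rust in isolation. The sketch below is only an illustration under simplifying assumptions: `leak_erased` and `reclaim` are hypothetical helpers, and unlike `ThinBox` the caller here still knows the concrete type `T` when reclaiming.

```rust
use std::ptr::NonNull;

/// Moves `value` to the heap and returns a type-erased, non-null thin pointer.
/// `Box` computes the layout and performs the allocation.
fn leak_erased<T>(value: T) -> NonNull<()> {
    NonNull::from(Box::leak(Box::new(value))).cast()
}

/// Takes back ownership of a value previously leaked by `leak_erased::<T>`.
///
/// # Safety
/// `ptr` must come from `leak_erased::<T>` with the same `T`,
/// and must not be used again after this call.
unsafe fn reclaim<T>(ptr: NonNull<()>) -> T {
    unsafe { *Box::from_raw(ptr.cast::<T>().as_ptr()) }
}
```

In `ThinBox` the concrete type is erased by the time the value is dropped, so `Box::from_raw` is not available there and the layout has to be recovered from the stored vtable instead.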
| 121 | + |
| 122 | +Accessing the value requires knowing its alignment: |
| 123 | + |
| 124 | +```rust |
| 125 | +impl<Dyn: ?Sized + DynTrait<Dyn>> ThinBox<Dyn> { |
| 126 | + fn data_ptr(&self) -> *mut () { |
| 127 | + unsafe { |
| 128 | + let offset = std::mem::size_of::<DynMetadata<Dyn>>(); |
| 129 | + let value_align = self.ptr.as_ref().vtable.align(); |
| 130 | + let offset = align_up_to(offset, value_align); |
| 131 | + (self.ptr.as_ptr() as *mut u8).add(offset) as *mut () |
| 132 | + } |
| 133 | + } |
| 134 | +} |
| 135 | + |
| 136 | +/// <https://github.com/rust-lang/rust/blob/1.30.0/src/libcore/alloc.rs#L199-L219> |
| 137 | +fn align_up_to(offset: usize, align: usize) -> usize { |
| 138 | + offset.wrapping_add(align).wrapping_sub(1) & !align.wrapping_sub(1) |
| 139 | +} |
| 140 | + |
| 141 | +// Similarly Deref |
| 142 | +impl<Dyn: ?Sized + DynTrait<Dyn>> DerefMut for ThinBox<Dyn> { |
| 143 | + fn deref_mut(&mut self) -> &mut Dyn { |
| 144 | + unsafe { |
| 145 | + &mut *<*mut Dyn>::from_raw_parts(self.data_ptr(), self.ptr.as_ref().vtable) |
| 146 | + } |
| 147 | + } |
| 148 | +} |
| 149 | +``` |
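
To make the offset computation in `data_ptr` concrete, here is a small Stable Rust sketch. `align_up_to` is copied from the snippet above; the `Pair` struct and the `actual_offset` helper are hypothetical and only serve to compare the formula against what the compiler does for a `#[repr(C)]` struct.

```rust
/// Rounds `offset` up to the next multiple of `align`, which must be a power of two.
fn align_up_to(offset: usize, align: usize) -> usize {
    offset.wrapping_add(align).wrapping_sub(1) & !align.wrapping_sub(1)
}

/// Same shape as `WithMeta`: a header followed by a value, laid out in declaration order.
#[repr(C)]
struct Pair<H, V> {
    header: H,
    value: V,
}

/// Measures the byte offset of `value` inside `pair` by subtracting addresses.
fn actual_offset<H, V>(pair: &Pair<H, V>) -> usize {
    (&pair.value as *const V as usize) - (pair as *const Pair<H, V> as usize)
}
```

For instance, `align_up_to(9, 4)` is `12` and `align_up_to(8, 8)` stays `8`. For a `Pair<u8, u64>`, the measured offset of `value` equals `align_up_to(size_of::<u8>(), align_of::<u64>())`, which is the same computation `data_ptr` performs with the alignment read from the vtable.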
| 150 | + |
| 151 | +Finally, in `Drop` we may not be able to take advantage of `Box` again |
| 152 | +since the original `Sized` type `S` is not statically known at this point. |
| 153 | + |
| 154 | +```rust |
| 155 | +impl<Dyn: ?Sized + DynTrait<Dyn>> Drop for ThinBox<Dyn> { |
| 156 | + fn drop(&mut self) { |
| 157 | + unsafe { |
| 158 | + let layout = /* left as an exercise for the reader */; |
| 159 | + ptr::drop_in_place::<Dyn>(&mut **self); |
| 160 | + alloc::dealloc(self.ptr.as_ptr().cast(), layout); |
| 161 | + } |
| 162 | + } |
| 163 | +} |
| 164 | +``` |
| 165 | + |
| 166 | + |
| 167 | +# Reference-level explanation |
| 168 | +[reference-level-explanation]: #reference-level-explanation |
| 169 | + |
| 170 | +The APIs whose full definition is found below |
| 171 | +are added to `core::ptr` and re-exported in `std::ptr`: |
| 172 | + |
| 173 | +* A `Pointee` trait, |
| 174 | + implemented automatically for all types |
| 175 | + (similar to how `Sized` and `Unsize` are implemented automatically). |
| 176 | +* A `Thin` [trait alias]. |
| 177 | + If this RFC is implemented before trait aliases are, |
| 178 | + uses of `Thin` should be replaced with its definition. |
| 179 | +* A `metadata` free function |
| 180 | +* A `DynMetadata` struct |
| 181 | +* A `from_raw_parts` constructor for each of `*const T`, `*mut T`, and `NonNull<T>`. |
| 182 | + |
| 183 | +The bounds on the `null()` and `null_mut()` functions in that same module |
| 184 | +as well as the `NonNull::dangling` constructor |
| 185 | +are changed from (implicit) `T: Sized` to `T: ?Sized + Thin`. |
| 186 | +Similarly for the `U` type parameter of the `NonNull::cast` method. |
| 187 | +This enables using those functions with [extern types]. |
| 188 | + |
| 189 | +The `Pointee` trait is implemented for all types. |
| 190 | +This can be relied on in generic code, |
| 191 | +even if a type parameter `T` does not have an explicit `T: Pointee` bound. |
| 192 | +This is similar to how the `Any` trait can be used without an explicit `T: Any` bound, |
| 193 | +only `T: 'static`, because a blanket `impl<T: 'static> Any for T {…}` exists. |
| 194 | +(Except that `Pointee` is not restricted to `'static`.) |
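
The `Any` analogy can be checked on Stable Rust today. In this sketch the function `is_string` is a hypothetical example; note that its only bound is `T: 'static`.

```rust
use std::any::Any;

/// The only bound is `T: 'static`; there is no explicit `T: Any`.
/// The blanket impl of `Any` is what makes the coercion to `&dyn Any` valid.
fn is_string<T: 'static>(value: &T) -> bool {
    let any: &dyn Any = value;
    any.is::<String>()
}
```

Generic code would be able to name `<T as Pointee>::Metadata` in the same way, without a `T: Pointee` bound and without the `'static` restriction.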
| 195 | + |
| 196 | +For the purpose of pointer casts being allowed by the `as` operator, |
| 197 | +a pointer to `T` is considered to be thin if `T: Thin` instead of `T: Sized`. |
| 198 | +This similarly includes extern types. |
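
For reference, casts from a fat pointer to a thin pointer already work on Stable Rust when the target type is `Sized`. The sketch below uses hypothetical helper names and only shows the existing behavior that this RFC generalizes to `T: Thin`.

```rust
use std::fmt::Debug;
use std::mem::size_of;

/// Returns the data address of a trait object
/// by casting the fat pointer to a thin one; the vtable pointer is discarded.
fn data_address(object: &dyn Debug) -> *const () {
    object as *const dyn Debug as *const ()
}

/// Same idea for a slice: the length is discarded and the address is kept.
fn slice_address(slice: &[u16]) -> *const () {
    slice as *const [u16] as *const ()
}

/// Sizes of a thin reference and of a slice reference, in bytes.
fn pointer_sizes() -> (usize, usize) {
    (size_of::<&u32>(), size_of::<&[u16]>())
}
```

On current compilers the thin reference is one `usize` wide and the slice reference is two, the second word being the length metadata.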
| 199 | + |
| 200 | +`std::raw::TraitObject` and `std::raw` are deprecated and eventually removed. |
| 201 | + |
| 202 | +[trait alias]: https://github.com/rust-lang/rust/issues/41517 |
| 203 | +[extern types]: https://github.com/rust-lang/rust/issues/43467 |
| 204 | + |
| 205 | +```rust |
| 206 | +/// This trait is automatically implemented for every type. |
| 207 | +/// |
| 208 | +/// Raw pointer types and reference types in Rust can be thought of as made of two parts: |
| 209 | +/// a data pointer that contains the memory address of the value, and some metadata. |
| 210 | +/// |
| 211 | +/// For statically-sized types (that implement the `Sized` trait) |
| 212 | +/// as well as for `extern` types, |
| 213 | +/// pointers are said to be “thin”: metadata is zero-sized and its type is `()`. |
| 214 | +/// |
| 215 | +/// Pointers to [dynamically-sized types][dst] are said to be “fat” |
| 216 | +/// and have non-zero-sized metadata: |
| 217 | +/// |
| 218 | +/// * For structs whose last field is a DST, metadata is the metadata for the last field |
| 219 | +/// * For the `str` type, metadata is the length in bytes as `usize` |
| 220 | +/// * For slice types like `[T]`, metadata is the length in items as `usize` |
| 221 | +/// * For trait objects like `dyn SomeTrait`, metadata is [`DynMetadata<Self>`][DynMetadata] |
| 222 | +/// (e.g. `DynMetadata<dyn SomeTrait>`). |
| 223 | +/// |
| 224 | +/// In the future, the Rust language may gain new kinds of types |
| 225 | +/// that have different pointer metadata. |
| 226 | +/// |
| 227 | +/// Pointer metadata can be extracted from a pointer or reference with the [`metadata`] function. |
| 228 | +/// The data pointer can be extracted by casting a (fat) pointer |
| 229 | +/// to a (thin) pointer to a `Sized` type with the `as` operator, |
| 230 | +/// for example `(x: &dyn SomeTrait) as *const dyn SomeTrait as *const ()` |
| 231 | +/// or `(x: *const dyn SomeTrait).cast::<()>()`. |
| 232 | +/// |
| 233 | +/// [dst]: https://doc.rust-lang.org/nomicon/exotic-sizes.html#dynamically-sized-types-dsts |
| 234 | +#[lang = "pointee"] |
| 235 | +pub trait Pointee { |
| 236 | + /// The type for metadata in pointers and references to `Self`. |
| 237 | + type Metadata: Copy + Send + Sync + Ord + Hash + Unpin; |
| 238 | +} |
| 239 | + |
| 240 | +/// Pointers to types implementing this trait alias are “thin”: |
| 241 | +/// |
| 242 | +/// ```rust |
| 243 | +/// fn this_never_panics<T: std::ptr::Thin>() { |
| 244 | +/// assert_eq!(std::mem::size_of::<&T>(), std::mem::size_of::<usize>()) |
| 245 | +/// } |
| 246 | +/// ``` |
| 247 | +pub trait Thin = Pointee<Metadata=()>; |
| 248 | + |
| 249 | +/// Extract the metadata component of a pointer. |
| 250 | +/// |
| 251 | +/// Values of type `*mut T`, `&T`, or `&mut T` can be passed directly to this function |
| 252 | +/// as they implicitly coerce to `*const T`. |
| 253 | +/// For example: |
| 254 | +/// |
| 255 | +/// ``` |
| 256 | +/// assert_eq!(std::ptr::metadata("foo"), 3_usize); |
| 257 | +/// ``` |
| 258 | +/// |
| 259 | +/// Note that the data component of a (fat) pointer can be extracted by casting |
| 260 | +/// to a (thin) pointer to any `Sized` type: |
| 261 | +/// |
| 262 | +/// ``` |
| 263 | +/// # trait SomeTrait {} |
| 264 | +/// # fn example(something: &dyn SomeTrait) { |
| 265 | +/// let object: &dyn SomeTrait = something; |
| 266 | +/// let data_ptr = object as *const dyn SomeTrait as *const (); |
| 267 | +/// # } |
| 268 | +/// ``` |
| 269 | +pub fn metadata<T: ?Sized>(ptr: *const T) -> <T as Pointee>::Metadata {…} |
| 270 | + |
| 271 | +impl<T: ?Sized> *const T { |
| 272 | + pub fn from_raw_parts(data: *const (), meta: <T as Pointee>::Metadata) -> Self {…} |
| 273 | +} |
| 274 | + |
| 275 | +impl<T: ?Sized> *mut T { |
| 276 | + pub fn from_raw_parts(data: *mut (), meta: <T as Pointee>::Metadata) -> Self {…} |
| 277 | +} |
| 278 | + |
| 279 | +impl<T: ?Sized> NonNull<T> { |
| 280 | + pub fn from_raw_parts(data: NonNull<()>, meta: <T as Pointee>::Metadata) -> Self { |
| 281 | + unsafe { |
| 282 | + NonNull::new_unchecked(<*mut _>::from_raw_parts(data.as_ptr(), meta)) |
| 283 | + } |
| 284 | + } |
| 285 | +} |
| 286 | + |
| 287 | +/// The metadata for a `DynTrait = dyn SomeTrait` trait object type. |
| 288 | +/// |
| 289 | +/// It is a pointer to a vtable (virtual call table) |
| 290 | +/// that represents all the necessary information |
| 291 | +/// to manipulate the concrete type stored inside a trait object. |
| 292 | +/// The vtable notably contains: |
| 293 | +/// |
| 294 | +/// * type size |
| 295 | +/// * type alignment |
| 296 | +/// * a pointer to the type’s `drop_in_place` impl (may be a no-op for plain-old-data) |
| 297 | +/// * pointers to all the methods for the type’s implementation of the trait |
| 298 | +/// |
| 299 | +/// Note that the first three are special because they’re necessary to allocate, drop, |
| 300 | +/// and deallocate any trait object. |
| 301 | +/// |
| 302 | +/// It is possible to name this struct with a type parameter that is not a `dyn` trait object |
| 303 | +/// (for example `DynMetadata<u64>`) but not to obtain a meaningful value of that struct. |
| 304 | +#[derive(Copy, Clone)] |
| 305 | +pub struct DynMetadata<DynTrait: ?Sized> { |
| 306 | + // Private fields |
| 307 | + vtable_ptr: ptr::NonNull<()>, |
| 308 | + phantom: PhantomData<DynTrait> |
| 309 | +} |
| 310 | + |
| 311 | +impl<DynTrait: ?Sized> DynMetadata<DynTrait> { |
| 312 | + /// Returns the size of the type associated with this vtable. |
| 313 | + pub fn size(self) -> usize { ... } |
| 314 | + |
| 315 | + /// Returns the alignment of the type associated with this vtable. |
| 316 | + pub fn align(self) -> usize { ... } |
| 317 | + |
| 318 | + /// Returns the size and alignment together as a `Layout` |
| 319 | + pub fn layout(self) -> alloc::Layout { |
| 320 | + unsafe { |
| 321 | + alloc::Layout::from_size_align_unchecked(self.size(), self.align()) |
| 322 | + } |
| 323 | + } |
| 324 | +} |
| 325 | +``` |
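
The proposed APIs cannot be used on Stable Rust, but their shape for thin pointers, slices, and `str` can be approximated there. In the sketch below, `PointeeSketch`, `metadata_of`, and `from_parts` are hypothetical stand-ins for `Pointee`, `ptr::metadata`, and `<*const T>::from_raw_parts`. They are not the proposed definitions, and trait objects are left out because their vtable pointer has no stable equivalent.

```rust
use std::ptr;

/// Hypothetical stand-in for the proposed `Pointee` trait (not the real API).
trait PointeeSketch {
    type Metadata: Copy;
    /// Stand-in for `ptr::metadata`.
    fn metadata_of(&self) -> Self::Metadata;
    /// Stand-in for `<*const Self>::from_raw_parts`.
    fn from_parts(data: *const (), meta: Self::Metadata) -> *const Self;
}

/// Statically-sized type: the pointer is thin and the metadata is `()`.
impl PointeeSketch for u64 {
    type Metadata = ();
    fn metadata_of(&self) -> Self::Metadata {}
    fn from_parts(data: *const (), _meta: ()) -> *const u64 {
        data as *const u64
    }
}

/// Slices: the metadata is the length in items.
impl<T> PointeeSketch for [T] {
    type Metadata = usize;
    fn metadata_of(&self) -> usize {
        self.len()
    }
    fn from_parts(data: *const (), len: usize) -> *const [T] {
        ptr::slice_from_raw_parts(data as *const T, len)
    }
}

/// `str`: the metadata is the length in bytes.
impl PointeeSketch for str {
    type Metadata = usize;
    fn metadata_of(&self) -> usize {
        self.len()
    }
    fn from_parts(data: *const (), len: usize) -> *const str {
        ptr::slice_from_raw_parts(data as *const u8, len) as *const str
    }
}
```

With these impls, `"foo".metadata_of()` is `3`, matching the `std::ptr::metadata("foo")` example in the documentation above, and `<[u8]>::from_parts(data, len)` rebuilds a raw slice pointer from its two components.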
| 326 | + |
| 327 | + |
| 328 | +# Rationale and alternatives |
| 329 | +[rationale-and-alternatives]: #rationale-and-alternatives |
| 330 | + |
| 331 | +The status quo is that code (such as the library linked in [Motivation]) that requires this functionality |
| 332 | +needs to transmute to and from `std::raw::TraitObject` |
| 333 | +or a copy of it (to be compatible with Stable Rust). |
| 334 | +Additionally, in cases where constructing the data pointer |
| 335 | +requires knowing the alignment of the concrete type, |
| 336 | +a dangling pointer such as `0x8000_0000_usize as *mut ()` needs to be created. |
| 337 | +It is not clear whether `std::mem::align_of_val(&*ptr)` with `ptr: *const dyn SomeTrait` |
| 338 | +is Undefined Behavior with a dangling data pointer. |
| 339 | + |
| 340 | +A [previous iteration][2579] of this RFC proposed a `DynTrait` trait |
| 341 | +that would only be implemented for trait objects like `dyn SomeTrait`. |
| 342 | +There would be no `Metadata` associated type; `DynMetadata` was hard-coded in the trait. |
| 343 | +In addition to being more general |
| 344 | +and (hopefully) more compatible with future custom DSTs proposals, |
| 345 | +this RFC resolves the question of what happens |
| 346 | +if trait objects with super-fat pointers with multiple vtable pointers are ever added. |
| 347 | +(Answer: they can use a different metadata type, |
| 348 | +possibly like `(DynMetadata<dyn Trait>, DynMetadata<dyn OtherTrait>)`.) |
| 349 | + |
| 350 | +[2579]: https://github.com/rust-lang/rfcs/pull/2579 |
| 351 | + |
| 352 | + |
| 353 | +# Prior art |
| 354 | +[prior-art]: #prior-art |
| 355 | + |
| 356 | +A previous [Custom Dynamically-Sized Types][cdst] RFC was postponed. |
| 357 | +[Internals thread #6663][6663] took the same ideas |
| 358 | +and was even more ambitious in being very general. |
| 359 | +Except for `DynMetadata`’s methods, this RFC proposes a subset of what that thread did. |
| 360 | + |
| 361 | +[cdst]: https://github.com/rust-lang/rfcs/pull/1524 |
| 362 | +[6663]: https://internals.rust-lang.org/t/pre-erfc-lets-fix-dsts/6663 |
| 363 | + |
| 364 | + |
| 365 | +# Unresolved questions |
| 366 | +[unresolved-questions]: #unresolved-questions |
| 367 | + |
| 368 | +* The name of `Pointee`. [Internals thread #6663][6663] used `Referent`. |
| 369 | + |
| 370 | +* The location of `DynMetadata`. Is another module more appropriate than `std::ptr`? |
| 371 | + |
| 372 | +* Should `DynMetadata` not have a type parameter? |
| 373 | + This might reduce monomorphization cost, |
| 374 | + but would require the size, alignment, and destructor pointers |
| 375 | + to be in the same location (offset) for every vtable. |
| 376 | + But keeping them in the same location is probably desirable anyway to keep code size small. |
| 377 | + |
| 378 | +* The name of `Thin`. |
| 379 | + This name is short and sweet but `T: Thin` suggests that `T` itself is thin, |
| 380 | + rather than pointers and references to `T`. |
| 381 | + |
| 382 | +* The location of `Thin`. Better in `std::marker`? |
| 383 | + |
| 384 | +* Should `Thin` be added as a supertrait of `Sized`? |
| 385 | + Or could it ever make sense to have fat pointers to statically-sized types? |
| 386 | + |
| 387 | +* Are there other generic standard library APIs like `ptr::null()` |
| 388 | + that have an (implicit) `T: Sized` bound that unnecessarily excludes extern types? |
| 389 | + |
| 390 | +* Should `<*mut _>::from_raw_parts` and friends be `unsafe fn`s? |
| 391 | + |
| 392 | +* API design: free functions vs. methods/constructors on `*mut _` and `*const _`? |
| 393 | + |
| 394 | +* Add `into_raw_parts` that returns `(*const (), T::Metadata)`? |
| 395 | + Using the `cast` method to a `Sized` type to extract the address as a thin pointer |
| 396 | + is less discoverable. |
| 397 | + Possibly *instead* of the `metadata` function? |