|
| 1 | +- Start Date: 2015-01-06 |
| 2 | +- RFC PR: |
| 3 | +- Rust Issue: |
| 4 | + |
| 5 | +# Summary |
| 6 | + |
| 7 | +Establish a convention throughout the core libraries for unsafe functions |
| 8 | +constructing references out of raw pointers. The goal is to improve usability |
| 9 | +while promoting awareness of possible pitfalls with inferred lifetimes. |
| 10 | + |
| 11 | +# Motivation |
| 12 | + |
| 13 | +The current library convention on functions constructing borrowed |
| 14 | +values from raw pointers has the pointer passed by reference, which |
| 15 | +reference's lifetime is carried over to the return value. |
| 16 | +Unfortunately, the lifetime of a raw pointer is often not indicative |
| 17 | +of the lifetime of the pointed-to data. So the status quo eschews the |
| 18 | +flexibility of inferring the lifetime from the usage, while falling short |
| 19 | +of providing useful safety semantics in exchange. |
| 20 | + |
| 21 | +A typical case where the lifetime needs to be adjusted is in bindings |
| 22 | +to a foregn library, when returning a reference to an object's |
| 23 | +inner value (we know from the library's API contract that |
| 24 | +the inner data's lifetime is bound to the containing object): |
| 25 | +```rust |
| 26 | +impl Outer { |
| 27 | + fn inner_str(&self) -> &[u8] { |
| 28 | + unsafe { |
| 29 | + let p = ffi::outer_get_inner_str(&self.raw); |
| 30 | + let s = std::slice::from_raw_buf(&p, libc::strlen(p)); |
| 31 | + std::mem::copy_lifetime(self, s) |
| 32 | + } |
| 33 | + } |
| 34 | +} |
| 35 | +``` |
| 36 | +Raw pointer casts also discard the lifetime of the original pointed-to value. |
| 37 | + |
| 38 | +# Detailed design |
| 39 | + |
| 40 | +The signature of `from_raw*` constructors will be changed back to what it |
| 41 | +once was, passing a pointer by value: |
| 42 | +```rust |
| 43 | +unsafe fn from_raw_buf<'a, T>(ptr: *const T, len: uint) -> &'a [T] |
| 44 | +``` |
| 45 | +The lifetime on the return value is inferred from the call context. |
| 46 | + |
| 47 | +The current usage can be mechanically changed, while keeping an eye on |
| 48 | +possible lifetime leaks which need to be worked around by e.g. providing |
| 49 | +safe helper functions establishing lifetime guarantees, as described below. |
| 50 | + |
| 51 | +## Document the unsafety |
| 52 | + |
| 53 | +In many cases, the lifetime parameter will come annotated or elided from the |
| 54 | +call context. The example above, adapted to the new convention, is safe |
| 55 | +despite lack of any explicit annotation: |
| 56 | +```rust |
| 57 | +impl Outer { |
| 58 | + fn inner_str(&self) -> &[u8] { |
| 59 | + unsafe { |
| 60 | + let p = ffi::outer_get_inner_str(&self.raw); |
| 61 | + std::slice::from_raw_buf(p, libc::strlen(p)) |
| 62 | + } |
| 63 | + } |
| 64 | +} |
| 65 | +``` |
| 66 | + |
| 67 | +In other cases, the inferred lifetime will not be correct: |
| 68 | +```rust |
| 69 | + let foo = unsafe { ffi::new_foo() }; |
| 70 | + let s = unsafe { std::slice::from_raw_buf(foo.data, foo.len) }; |
| 71 | + |
| 72 | + // Some lines later |
| 73 | + unsafe { ffi::free_foo(foo) }; |
| 74 | + |
| 75 | + // More lines later |
| 76 | + let guess_what = s[0]; |
| 77 | + // The lifetime of s is inferred to extend to the line above. |
| 78 | + // That code told you it's unsafe, didn't it? |
| 79 | +``` |
| 80 | + |
| 81 | +Given that the function is unsafe, the code author should exercise due care |
| 82 | +when using it. However, the pitfall here is not readily apparent at the |
| 83 | +place where the invalid usage happens, so it can be easily committed by an |
| 84 | +inexperienced user, or inadvertently slipped in with a later edit. |
| 85 | + |
| 86 | +To mitigate this, the documentation on the reference-from-raw functions |
| 87 | +should include caveats warning about possible misuse and suggesting ways to |
| 88 | +avoid it. When an 'anchor' object providing the lifetime is available, the |
| 89 | +best practice is to create a safe helper function or method, taking a |
| 90 | +reference to the anchor object as input for the lifetime parameter, like in |
| 91 | +the example above. The lifetime can also be explicitly assigned with |
| 92 | +`std::mem::copy_lifetime` or `std::mem::copy_lifetime_mut`, or annotated when |
| 93 | +possible. |
| 94 | + |
| 95 | +## Fix copy_mut_lifetime |
| 96 | + |
| 97 | +To improve composability in cases when the lifetime does need to be assigned |
| 98 | +explicitly, the first parameter of `std::mem::copy_mut_lifetime` |
| 99 | +should be made an immutable reference. There is no reason for the lifetime |
| 100 | +anchor to be mutable: the pointer's mutability is usually the relevant |
| 101 | +question, and it's an unsafe function to begin with. This wart may |
| 102 | +breed tedious, mut-happy, or transmute-happy code, when e.g. a container |
| 103 | +providing the lifetime for a mutable view into its contents is not itself |
| 104 | +necessarily mutable. |
| 105 | + |
| 106 | +# Drawbacks |
| 107 | + |
| 108 | +The implicitly inferred lifetimes are unsafe in sneaky ways, so care is |
| 109 | +required when using the borrowed values. |
| 110 | + |
| 111 | +Changing the existing functions is an API break. |
| 112 | + |
| 113 | +# Alternatives |
| 114 | + |
| 115 | +An earlier revision of this RFC proposed adding a generic input parameter to |
| 116 | +determine the lifetime of the returned reference: |
| 117 | +```rust |
| 118 | +unsafe fn from_raw_buf<'a, T, U: Sized?>(ptr: *const T, len: uint, |
| 119 | + life_anchor: &'a U) |
| 120 | + -> &'a [T] |
| 121 | +``` |
| 122 | +However, an object with a suitable lifetime is not always available |
| 123 | +in the context of the call. In line with the general trend in Rust libraries |
| 124 | +to favor composability, `std::mem::copy_lifetime` and |
| 125 | +`std::mem::copy_lifetime_mut` should be the principal methods to explicitly |
| 126 | +adjust a lifetime. |
| 127 | + |
| 128 | +# Unresolved questions |
| 129 | + |
| 130 | +Should the change in function parameter signatures be done before 1.0? |
0 commit comments