Lazy or eager ptr/int casts? #786
casting references to raw pointers and then to integers is also a safe operation. Why is the argument different for ptr-to-int? Is it because int-to-ptr is fallible in contrast to ptr-to-int?
That sentence is a bit hard to parse. Are you saying that LLVM misoptimizes pointer values in variables with int type, and thus we should keep enforcing that invariant in Miri, making the ptr-to-int transmute UB?
Exactly.
I am saying that there is no known formal model that justifies what LLVM is doing without making ptr-to-int transmutes UB. Constructing miscompilations is possible in theory; making LLVM actually apply the right optimizations in the right order for this may or may not be possible. But this is an academic problem in every sense of the word, and I think ruling out ptr-to-int transmutes in the surface language currently is not a good idea, as much as I'd like to do it.
So... is there any reason we couldn't hide making-the-transmutes-not-UB behind a miri command line flag and defaulting to eager conversion?
We could. I am just afraid of an exploding test matrix.
well, compiletest supports running tests in multiple passes with differing flags and error markings. Unless you mean the runtime of the tests
I think for now that's not worth it -- the extra value of this stricter mode does not justify the effort. What would be nice is to have a mode that can still detect alignment failures -- some weaker form of intptrcast that lets us remove most of the hacks we carry, but ignores "accidental" alignment when checking memory accesses. I think we might even be implementing this accidentally currently, that's basically this FIXME. ;)
For detecting alignment failures, after consulting a few people, the plan for now is to allow code by default to "guess" the right alignment (this reduces false positives), but to emit a warning and offer an option to turn the warning into an error (or silence it).
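One reading of "guessing" the right alignment is code that inspects the integer address at runtime and only performs the wider access when the address happens to be suitably aligned. The sketch below illustrates that pattern; the helper `read_u32_if_aligned` and the wrapper type `Aligned` are hypothetical names made up for this example, not anything from Miri.

```rust
use std::mem::align_of;

// Hypothetical wrapper that forces 4-byte alignment so the demo is deterministic.
#[repr(align(4))]
struct Aligned([u8; 8]);

// Reads a u32 from the start of `bytes`, but only if the runtime address
// happens to be aligned for u32. The slice type itself only promises
// alignment 1, so a passing check is "accidental" from the type's view.
fn read_u32_if_aligned(bytes: &[u8]) -> Option<u32> {
    if bytes.len() < 4 {
        return None;
    }
    let addr = bytes.as_ptr() as usize;
    if addr % align_of::<u32>() != 0 {
        return None;
    }
    // The address was checked above and at least 4 bytes are in bounds.
    Some(unsafe { *(bytes.as_ptr() as *const u32) })
}

fn main() {
    let buf = Aligned([1, 0, 0, 0, 2, 0, 0, 0]);
    // The start of the buffer is 4-aligned, so the read goes through.
    assert_eq!(
        read_u32_if_aligned(&buf.0),
        Some(u32::from_ne_bytes([1, 0, 0, 0]))
    );
    // One byte further in, the address is not a multiple of 4.
    assert_eq!(read_u32_if_aligned(&buf.0[1..]), None);
}
```

With a plain `[u8; N]` buffer instead of `Aligned`, whether the check passes depends on where the allocation happens to land, which is the situation the warning is meant to flag.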
Looks like we came to an agreement on how to progress; the implementation is already tracked at #224.
With intptrcast coming up, we have to make a decision whether we want the ptr/int casts to happen "lazily" or "eagerly". Ultimately this will be a question for UCG/lang-team to decide as part of the Rust/MIR semantics, and there are some hard open questions here, some of which are discussed in this paper. But for now we have to pick something.
ptr-to-int
"Eager" ptr-to-int cast means that we cast from ptr to int when executing the cast MIR statement (corresponding to the `as` or a coercion in the surface language).
"Lazy" ptr-to-int cast is basically what we do now, where the cast statement does nothing and we only perform the actual conversion when the int is subject to an operation where we need the raw bits.
To a first approximation, with eager casts we have an invariant that a variable of integer type carries an integer value; with lazy casts we don't.
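The difference can be sketched with a toy interpreter value. Everything here is an assumption made for illustration: the `Scalar` enum, `base_addr`, `cast_eager` and `cast_lazy` are hypothetical, and only the name `force_bits` is borrowed from this thread. It is not Miri's actual implementation.

```rust
// Toy model only: hypothetical types, not Miri's real data structures.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Scalar {
    // A plain integer.
    Int(u64),
    // An abstract pointer: which allocation, and the offset into it.
    Ptr { alloc_id: u32, offset: u64 },
}

// Hypothetical base address assigned to each allocation.
fn base_addr(alloc_id: u32) -> u64 {
    0x1000 * (alloc_id as u64 + 1)
}

// Produce the raw bits regardless of how the value is represented.
fn force_bits(s: Scalar) -> u64 {
    match s {
        Scalar::Int(i) => i,
        Scalar::Ptr { alloc_id, offset } => base_addr(alloc_id) + offset,
    }
}

// Eager: the cast statement itself converts to an integer value.
fn cast_eager(s: Scalar) -> Scalar {
    Scalar::Int(force_bits(s))
}

// Lazy: the cast statement does nothing; conversion is deferred
// until some operation actually needs the bits.
fn cast_lazy(s: Scalar) -> Scalar {
    s
}

fn main() {
    let p = Scalar::Ptr { alloc_id: 2, offset: 8 };
    let eager = cast_eager(p);
    let lazy = cast_lazy(p);
    // Eager upholds "integer-typed variables hold integers"; lazy does not.
    assert!(matches!(eager, Scalar::Int(_)));
    assert!(matches!(lazy, Scalar::Ptr { .. }));
    // Both agree once the bits are actually demanded.
    assert_eq!(force_bits(eager), force_bits(lazy));
    println!("eager = {:?}, lazy = {:?}", eager, lazy);
}
```

In this model the lazy variant needs a `force_bits` call at every operation that consumes an integer, whereas the eager variant pays that cost once at the cast.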
Eager casts are somewhat easier to understand, less confusing. Extra invariants are nice. However, that leaves open the question about code that "circumvents" the cast, such as transmuting a pointer to an integer:
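A minimal sketch of such circumvention, assuming ordinary native execution: each helper below yields an integer from a pointer, but only the first goes through a cast statement. The helper names and the `PtrOrInt` union are hypothetical and chosen for this example.

```rust
use std::mem;

// Goes through an explicit cast; this is where an eager conversion would run.
fn via_cast(p: *const u32) -> usize {
    p as usize
}

// Bypasses the cast: the pointer's bytes are reinterpreted at integer type.
fn via_transmute(p: *const u32) -> usize {
    unsafe { mem::transmute::<*const u32, usize>(p) }
}

// Hypothetical union used for type punning, another way around the cast.
union PtrOrInt {
    ptr: *const u32,
    int: usize,
}

fn via_union(p: *const u32) -> usize {
    let u = PtrOrInt { ptr: p };
    unsafe { u.int }
}

fn main() {
    let x = 42u32;
    let p: *const u32 = &x;
    // On a native target all three produce the same address.
    assert_eq!(via_cast(p), via_transmute(p));
    assert_eq!(via_cast(p), via_union(p));
    println!("address = {:#x}", via_cast(p));
}
```

Whether the last two are allowed, and what value an interpreter should see at the integer type afterwards, is exactly the open question here.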
If we want to allow all of these operations, the aforementioned invariant does not actually hold, and we still have to remember to `force_bits` everywhere. Not allowing even the transmute would basically mean enforcing the aforementioned invariant: when validating integers, we don't allow pointers.
I am torn between allowing as much code as possible that people reasonably expect to work, simplifying the code by minimizing the number of places where we `force_bits`/`force_ptr`, and knowing that the only answer that is actually formally good enough to justify LLVM's optimizations (excluding all pointer values at integer types) is likely going to upset people. ^^
The only thing I am fairly sure I want is that a ptr-to-int cast in the surface language actually does `ptr_to_int` in Miri. I know I argued against that in the past, but came to realize it is confusing, and also doing this cast helps a lot with testing.
int-to-ptr
For the other direction, we cannot eagerly do int-to-ptr conversion when an integer gets turned into a raw pointer as that is a safe operation. And similarly the user can transmute integers to pointers, so even if we cast eagerly when a reference gets created, we still have to handle integer values in the memory access operations.
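To make the point concrete, here is a small sketch (helper names are hypothetical): creating a raw pointer from an integer needs no `unsafe`, a transmute skips the cast statement altogether, and only the eventual dereference is an unsafe operation.

```rust
// Safe: turning an integer into a raw pointer needs no `unsafe` block.
fn int_to_raw(addr: usize) -> *const i32 {
    addr as *const i32
}

// Transmuting an integer to a pointer bypasses the cast statement entirely.
fn transmute_to_raw(addr: usize) -> *const i32 {
    unsafe { std::mem::transmute::<usize, *const i32>(addr) }
}

// Round trip through an integer; only the final dereference is unsafe.
fn round_trip(x: &i32) -> i32 {
    let addr = x as *const i32 as usize;
    let p = int_to_raw(addr);
    // Under native execution, `p` has the address of `*x`,
    // which is live and aligned for the duration of this call.
    unsafe { *p }
}

fn main() {
    let v = 7;
    assert_eq!(round_trip(&v), 7);
    let addr = &v as *const i32 as usize;
    // Both routes produce a pointer with the same address.
    assert_eq!(int_to_raw(addr), transmute_to_raw(addr));
    println!("round trip ok");
}
```

Since `int_to_raw` can be called with any integer at all, a conversion that could fail cannot sit at the cast; the memory access is the earliest point where it can be checked.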
So, we face a similar situation as in the ptr-to-int case: if we allow maximal amounts of code, we have to handle integer values everywhere, and we cannot have any meaningful extra invariant. And still we should probably make sure that when a reference gets created, it gets turned into a pointer value. Or maybe retagging can just take care of that.
@oli-obk (and anyone else reading) any opinions?