LLVM nocapture attribute is used incorrectly #137668

Open
tmiasko opened this issue Feb 26, 2025 · 10 comments
Labels
A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-opsem Relevant to the opsem team

Comments

@tmiasko
Contributor

tmiasko commented Feb 26, 2025

The implementation of indirect_pass_mode contains the following code:

        // For non-immediate arguments the callee gets its own copy of
        // the value on the stack, so there are no aliases. It's also
        // program-invisible so can't possibly capture
        attrs
            .set(ArgAttribute::NoAlias)
            .set(ArgAttribute::NoCapture)
            .set(ArgAttribute::NonNull)
            .set(ArgAttribute::NoUndef);

The claim that a callee can't possibly capture the argument is incorrect. The callee can obtain the address of the parameter and use it in an arbitrary way. Consider the following example. In the call to f, b is the address of the static S, whose memory should be disjoint from that of the parameter a. Consequently, f should return false:

static S: [u32; 64]  = [0; 64];

#[inline(never)]
pub fn f(a: [u32; 64], b: usize) -> bool {
    &a as *const _ as usize == b
}

fn main() {
    assert!(!f(S, &S as *const _ as usize));
}

$ rustc a.rs -Copt-level=0 && ./a
$ rustc a.rs -Copt-level=1 && ./a

thread 'main' panicked at a.rs:9:5:
assertion failed: !f(S, &S as *const _ as usize)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
@tmiasko tmiasko added A-codegen Area: Code generation A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-opsem Relevant to the opsem team labels Feb 26, 2025
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Feb 26, 2025
@hanna-kruppe
Contributor

This example depends on the pointer address. But LLVM has recently started separating “address capture” from “provenance capture”. I think it’s correct to say the latter isn’t captured here, meaning that any pointer to the parameter a constructed in f should be considered dangling (by Rust semantics) after the corresponding call to f returns. cc @nikic

@nikic
Contributor

nikic commented Feb 26, 2025

Yes, I believe making these captures(address) would be legal. (This is LLVM 21 only though.)

@jieyouxu jieyouxu added I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Feb 26, 2025
@rustbot rustbot added the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Feb 26, 2025
@jieyouxu
Member

(Tagging this as unsound but feel free to readjust/reprioritize)

@apiraino
Contributor

Is this something that can be taken care of when upgrading to LLVM 21, or is it better to discuss it earlier?

I've tried digging a bit but I'm unsure about the details: is indirect_pass_mode nightly-only? A git blame seems to indicate that this code was last touched years ago (so I think it's OK not to consider this a regression).

thanks

@nikic
Contributor

nikic commented Feb 27, 2025

This issue isn't nightly only. In fact, it has been known for a long time (I'm pretty sure we already have an issue for it somewhere, but I couldn't find it).

As far as I know, the only thing this affects is the ability to observe whether an intermediate copy has been optimized away. It doesn't allow you to do anything beyond that (like modifying the memory of the original, as that would be UB due to reference semantics).
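
A minimal sketch of that distinction, assuming the same by-value array setup as the original report. The helper names param_addr and bump_first are hypothetical and chosen only for illustration. The address comparison may come out either way depending on whether the copy is elided, while a write through the parameter is never visible in the original:

```rust
// Sketch only: `param_addr` and `bump_first` are hypothetical helpers.
static S: [u32; 64] = [0; 64];

/// Returns the address at which the callee sees its by-value parameter.
#[inline(never)]
pub fn param_addr(a: [u32; 64]) -> usize {
    &a as *const [u32; 64] as usize
}

/// Mutates its own copy of the array and returns the new first element.
#[inline(never)]
pub fn bump_first(mut a: [u32; 64]) -> u32 {
    a[0] += 1;
    a[0]
}

fn main() {
    let static_addr = &S as *const [u32; 64] as usize;
    // Whether these two addresses match depends on whether the copy was
    // elided, so no particular answer is asserted here.
    println!("copy elided: {}", param_addr(S) == static_addr);

    // Writes through the parameter never reach the original.
    assert_eq!(bump_first(S), 1);
    assert_eq!(S[0], 0);
}
```

Only the printed line can differ between optimization levels; the two assertions hold regardless.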

@tmiasko tmiasko self-assigned this Feb 27, 2025
@tmiasko
Contributor Author

tmiasko commented Feb 27, 2025

I don't recall a GitHub issue about this problem, but we did discuss it on Zulip before. The hypothetical demonstration given there is also miscompiled now.

I opened #137726 to evaluate the perf impact of removing the attribute. An alternative would be to infer it - analogous to readonly.

@RalfJung
Member

RalfJung commented Feb 28, 2025

Cc @rust-lang/opsem

FWIW the intent I have with the operational semantics is that the assertion failure is a legal outcome here -- see the discussion in rust-lang/unsafe-code-guidelines#416 and #71117. However, it is so far unclear how that could be achieved.

The unsoundness is the nocapture, but even if we remove nocapture the assertion will still fail.

@RalfJung
Member

> Yes, I believe making these captures(address) would be legal. (This is LLVM 21 only though.)

Note that the function can capture the provenance of this pointer, too. We are trying to make it so that that provenance ceases to be valid when the function returns, but that turns out to be in conflict with copy propagation (rust-lang/unsafe-code-guidelines#556).

@tmiasko
Contributor Author

tmiasko commented Mar 3, 2025

> Note that the function can capture the provenance of this pointer, too.

Can you give an example where the captured pointer is used after the function returns? I thought it would be impossible to capture provenance through that argument (so that it remains valid to access) because the memory is either temporary or protected.

> We are trying to make it so that that provenance ceases to be valid when the function returns, but that turns out to be in conflict with copy propagation (rust-lang/unsafe-code-guidelines#556).

Do we have a Rust issue for the miscompilation in the first example?

@RalfJung
Member

RalfJung commented Mar 3, 2025

> Can you give an example where the captured pointer is used after the function returns? I thought it would be impossible to capture provenance through that argument (so that it remains valid to access) because the memory is either temporary or protected.

You just &raw mut the argument and store the result somewhere. In the generated LLVM IR, that will end up just storing the ptr-typed argument.
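
A minimal sketch of that pattern, with assumptions labeled: the static STASH and the function capture are hypothetical names, and std::ptr::addr_of_mut! is used as the macro form of &raw mut so the snippet also builds on older toolchains. It only shows that the pointer value escapes the call; it never dereferences the stashed pointer:

```rust
use std::ptr;
use std::sync::atomic::{AtomicPtr, Ordering};

// Hypothetical global used only to let the pointer outlive the call.
static STASH: AtomicPtr<[u32; 64]> = AtomicPtr::new(ptr::null_mut());

/// Takes the array by value and stores a raw pointer to the parameter.
#[inline(never)]
pub fn capture(mut a: [u32; 64]) {
    // Macro form of `&raw mut a`: a raw pointer to the callee's parameter.
    STASH.store(ptr::addr_of_mut!(a), Ordering::Relaxed);
}

fn main() {
    capture([0; 64]);
    let p = STASH.load(Ordering::Relaxed);
    // The pointer value itself survives the call...
    assert!(!p.is_null());
    println!("stashed pointer: {:p}", p);
    // ...but the parameter is gone, so reading or writing through `p`
    // at this point would be undefined behavior. This sketch never does.
}
```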

As I mentioned, the provenance you get that way is invalid once the function returns, though the system we use to ensure that is causing problems (rust-lang/unsafe-code-guidelines#556).

> Do we have a rust issue for miscompilation in the first example?

Is it a miscompilation? This seems like an optimization we want to allow, so it's more of a spec bug. Now, whether that spec bug can be solved without other compromises is unclear...

@apiraino apiraino removed the I-prioritize Issue: Indicates that prioritization has been requested for this issue. label Mar 6, 2025
@tmiasko tmiasko removed their assignment Mar 8, 2025