Skip to content

[WIP] Optionally model the LLVM assume statement in Boogie #478

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Aug 9, 2019

Conversation

keram88
Copy link
Contributor

@keram88 keram88 commented Aug 1, 2019

Currently we don't do anything with LLVM's assume statement. This is because we
need to be sure that the assumption provided to the function is correct. At best
it might help with verification, but it could also lead SMACK to report erroneous
bugs (or no bugs!). For now, this removes this intrinsic, preventing a warning about an unsound
call to this function.

I could add the closure in the stmtMap, but I think it's valuable to have the rationale somewhere in this function.

@keram88
Copy link
Contributor Author

keram88 commented Aug 1, 2019

My primary impetus to add this is that Rust generates these in just about every program.

@keram88 keram88 requested review from zvonimir and shaobo-he August 1, 2019 05:17
Copy link
Contributor

@shaobo-he shaobo-he left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@keram88
Copy link
Contributor Author

keram88 commented Aug 1, 2019

Travis is failing this because of clang-format failing something in lib/smack/Regions.cpp. As far as I can tell the failing code is removed in LLVM 7.1. Regardless, this PR is unrelated to this file, and all regressions pass.

Perhaps we should merge this after LLVM 7.1 lands.

@michael-emmi
Copy link
Contributor

I’m curious about knowing more; is there a reference on this?

Overall, I wonder whether it might instead be a good idea to take advantage of these assumptions by modeling them as boogie assumes. Depending on why they’re added. For instance, if they correspond to things the rust type checker has already proven, then why not use them?

@keram88
Copy link
Contributor Author

keram88 commented Aug 1, 2019

@michael-emmi That's a good point, I should have discussed how Rust uses this function.

Rust doesn't appear to do anything interesting with llvm.assume. In all the Rust benchmarks, the uses of llvm.assume follow this pattern:

i1 x = (some boolean expression);
llvm.assume(x <= 1);

Basically, Rust is telling LLVM its booleans are 0 or 1, but LLVM should know this since their type is i1.

The strongest thing I think Rust has ever done with respect to its type-system is add the noalias attribute to function arguments. This issue goes into more detail about some of the problems with this: rust-lang/rust#53105.

Another approach is to make generating Boogie assumptions opt-in or opt-out through a command line argument. Something like --use-llvm-assume or --no-llvm-assume. @shaobo-he @zvonimir What do you think about this?

We should update #53 regardless.

@shaobo-he
Copy link
Contributor

@michael-emmi That's a good point, I should have discussed how Rust uses this function.

Rust doesn't appear to do anything interesting with llvm.assume. In all the Rust benchmarks, the uses of llvm.assume follow this pattern:

i1 x = (some boolean expression);
llvm.assume(x <= 1);

Basically, Rust is telling LLVM its booleans are 0 or 1, but LLVM should know this since their type is i1.

The strongest thing I think Rust has ever done with respect to its type-system is add the noalias attribute to function arguments. This issue goes into more detail about some of the problems with this: rust-lang/rust#53105.

Another approach is to make generating Boogie assumptions opt-in or opt-out through a command line argument. Something like --use-llvm-assume or --no-llvm-assume. @shaobo-he @zvonimir What do you think about this?

We should update #53 regardless.

I think without evident benefits of leveraging the LLVM assume intrinsic, the current PR is the better option.

@zvonimir
Copy link
Member

zvonimir commented Aug 2, 2019

Please fix formatting so that CI is passing. It is an easy fix. Also, please address this compilation warning if possible:

../lib/smack/SmackInstGenerator.cpp:896:31: warning: lambda capture 'this' is not used [-Wunused-lambda-capture]
1984  static const auto assume = [this](CallInst *ci) { return; };

@zvonimir
Copy link
Member

zvonimir commented Aug 2, 2019

I see. Formatting got broken elsewhere. I'll fix it. You still have to take care of the warning though.

@michael-emmi
Copy link
Contributor

Could we at least understand whether the assumes here represent facts that have been “proven” by previous analysis, or whether e.g., they represent speculative optimization hints that may be invalidated and require unoptimized fallback?

@zvonimir
Copy link
Member

zvonimir commented Aug 2, 2019

From the LLVM language spec (https://llvm.org/docs/LangRef.html#llvm-assume-intrinsic):

The llvm.assume allows the optimizer to assume that the provided condition is true. This information can then be used in simplifying other parts of the code.

and also

If the condition is violated during execution, the behavior is undefined.

So to me it seems that the purpose of assume intrinsics is to provide "proven" facts.

I think we should ultimately have a command line switch --leverage-llvm-assumes that turns on using them for verification, and we drop them otherwise. Then we can experiment with this more.

@keram88
Copy link
Contributor Author

keram88 commented Aug 2, 2019

From https://llvm.org/docs/LangRef.html#llvm-assume-intrinsic:

The intrinsic allows the optimizer to assume that the provided condition is
always true whenever the control flow reaches the intrinsic call. No code is
generated for this intrinsic, and instructions that contribute only to the
provided condition are not used for code generation. If the condition is
violated during execution, the behavior is undefined.

Note that the optimizer might limit the transformations performed on values used
by the llvm.assume intrinsic in order to preserve the instructions only used to
form the intrinsic’s input argument. This might prove undesirable if the extra
information provided by the llvm.assume intrinsic does not cause sufficient
overall improvement in code quality. For this reason, llvm.assume should not be
used to document basic mathematical invariants that the optimizer can otherwise
deduce or facts that are of little use to the optimizer.

Could we at least understand whether the assumes here represent facts that have been “proven” by previous analysis, or whether e.g., they represent speculative optimization hints that may be invalidated and require unoptimized fallback?

LLVM appears to treat these as axioms, so it won't generate any code to handle an invalid assumption.

I think the uses of llvm.assume by Rust I gave earlier are of the kind that LLVM says not to generate, but this doesn't affect the correctness of SMACK. One of the most powerful uses of the assume statement is documentation of loop invariants from the compiler. LLVM probably can't discover these facts due to the loss of semantic information. This information is likely very helpful for SMACK.

I'm starting to fall on the side of generating boogie assumes for these; they can't hurt verification time and might help. My biggest concern is that the compiler will generate an assume which is false. However, we already assume that the compiler reasonably transforms the source. If erroneous assumes were being routinely generated, then incorrect optimizations in LLVM would occur and compiler bugs would be reported. My other consideration is that the compiler provides a correct assumption, but an undefined behavior invalidates it. I'm not too worried about this because the program is already ill-formed, and SMACK has an expanding set of checks for undefined behaviors to help diagnose this.

@keram88
Copy link
Contributor Author

keram88 commented Aug 2, 2019

So to me it seems that the purpose of assume intrinsics is to provide "proven" facts.

I think we should ultimately have a command line switch --leverage-llvm-assumes that turns on using them for verification, and we drop them otherwise. Then we can experiment with this more.

This seems reasonable. We should try generating assumes to see how they behave before enabling or disabling them generally. I want to update this PR with your proposal.

@michael-emmi
Copy link
Contributor

Yea, I agree that experimenting with whether these generated llvm.assumes provide useful information to the verifier is the right thing to do. My hunch would also be that they could help significantly, e.g., if they carried higher-level program invariants from, e.g., rustc to LLVM.

One small note: it’s not obvious to me that additional assumes “can’t hurt verification time”, but perhaps that is another conversation altogether :-)

@keram88
Copy link
Contributor Author

keram88 commented Aug 2, 2019

One small note: it’s not obvious to me that additional assumes “can’t hurt verification time”, but perhaps that is another conversation altogether :-)

This is worth investigating. I know some of our floating-point axioms slowed down verification. Solvers are complex beasts and it's hard to know what hurts or helps.

@keram88 keram88 changed the title Model the llvm.assume intrinsic as a no-op [WIP] Model the llvm.assume intrinsic as a no-op Aug 2, 2019
@keram88 keram88 changed the title [WIP] Model the llvm.assume intrinsic as a no-op [WIP] Optionally model the LLVM assume statement as a Boogie assume statement Aug 2, 2019
@keram88 keram88 changed the title [WIP] Optionally model the LLVM assume statement as a Boogie assume statement [WIP] Optionally model the LLVM assume statement in Boogie Aug 2, 2019
@keram88
Copy link
Contributor Author

keram88 commented Aug 3, 2019

I implemented Zvonimir's suggestion. I witnessed one interesting case along the lines of this (translated from Rust):

int x = __VERIFIER_nondet_int();
int y = 2*x;
...
int z = y&1;
...
// @llvm.assume(z == 0);
__VERIFIER_assert(z == 0);

SMACK is able to verify this with --leverage-llvm-assumes despite --bit-precise being off; without --bit-precice, z would be nondet and the assertion would fail.

Rust also generates some useful assume statements, namely it states that its pointers are not null.

@keram88
Copy link
Contributor Author

keram88 commented Aug 3, 2019

What do you think of also adding a "strict" assume mode? Clang has __builtin_assume:
https://clang.llvm.org/docs/LanguageExtensions.html#builtin-assume
This seems to always emit an @llvm.assume statement. We could emit an assert to check that the assumption always holds.

@zvonimir
Copy link
Member

zvonimir commented Aug 3, 2019

I see. So there should be a command line argument with 3 options: ignore assumes, assume assumes, and assert assumes. I like that. Please add the assert assumes option as well then. Good suggestion.

@zvonimir
Copy link
Member

zvonimir commented Aug 3, 2019

Also, could you create a regression that checks for this?

@zvonimir
Copy link
Member

zvonimir commented Aug 7, 2019

@keram88 : What's the status of this pull request? Any chance you could finish it this week?

@keram88
Copy link
Contributor Author

keram88 commented Aug 9, 2019

I have implemented the following:

  • --llvm-assumes=none: This ignores LLVM assume statements as SMACK currently does, and is the default behavior.
  • --llvm-assumes=use: This uses LLVM assumes as boogie assumes.
  • --llvm-assumes=check: This uses LLVM assumes and additionally checks their validity.

I added the regressions here: https://github.com/smackers/smack/tree/no-warn-assume/test/special.
I wasn't sure where to put them since they are kind of weird tests. I think it covers the range of odd behaviors that could be verified using __builtin_assume, such as confirming that 1=0.

But as I found in Rust, using the assumes generated by the compiler could help us avoid running bit-precise.

@keram88
Copy link
Contributor Author

keram88 commented Aug 9, 2019

I addressed @zvonimir's comments. I don't like force pushing, so if this looks good, I can squash the commits.

Copy link
Member

@zvonimir zvonimir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Please squash and clean up your commits and then I'll merge it.

@keram88
Copy link
Contributor Author

keram88 commented Aug 9, 2019

Done. Thanks, @michael-emmi for driving this toward what I think is a much better solution.

@keram88 keram88 force-pushed the no-warn-assume branch 2 times, most recently from d55cf00 to be1a5d1 Compare August 9, 2019 17:58
LLVM has the llvm.assume intrinsic which is intended to allow a
compiler to communicate facts about a program to LLVM's
optimizers. These facts may be useful for verification, so this allows
them to be used. Additionally, checking of assumptions is possible in
order to diagnose faulty assumptions arising from source code or
compiler bugs.

Specifically, a new flag "--llvm-assumes" is added with three modes:
- "none": This ignores the llvm.assume intrinsic and is the default
  mode.
- "use": This enables generation of Boogie assumes from llvm.assume
  instructions.
- "check": This enables assertion checking of the validity of assumed
  statements.
@zvonimir zvonimir merged commit 000c3cb into develop Aug 9, 2019
@zvonimir zvonimir deleted the no-warn-assume branch August 9, 2019 20:43
shaobo-he added a commit that referenced this pull request Feb 17, 2020
This flag is no longer needed because of PR #478

Fixes #498
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants