Skip to content

Refactor Dataflow Analysis of Casts #883

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Apr 21, 2023
Merged

Conversation

aneksteind
Copy link
Contributor

@aneksteind aneksteind commented Apr 7, 2023

Fixes #878.

Lays the groundwork to check whether or not certain types of ptr-to-ptr casts are supported.

Implements the reorganization specified in #878, and lays the groundwork for #839 as outlined in #839 (comment)

Copy link
Contributor

@kkysen kkysen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit hard to tell what changed vs. just moved in this PR. For f2b89e3, there are 2 moves, but also a few changes. Could those be split out?

Copy link
Contributor

@kkysen kkysen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused by the change in 57ff503 alone, as that code seems to be moved (not sure if changed at all) in 47a053f. Could those commits be combined so it's clear that it's a move? Otherwise, I'm not sure how 57ff503 does what the commit message says.

@aneksteind
Copy link
Contributor Author

aneksteind commented Apr 7, 2023

@kkysen 47a053f wasn't meant to signify a movement. It's just some additional checks that were made elsewhere previously. Yes, checks that were in one place are now present in another, but semantically those two commits represent 1. cutting out unnecessary checks given what's now returned (the labeled casted-to type) and 2. adding some cast-specific type checks to type_check.rs.

As far as 57ff503 itself is concerned, before we were taking portions of the casted-from type (T) and the casted-to type (U) and combining them into another type (X). This instead just labels U and returns LTy(U).

@spernsteiner
Copy link
Collaborator

Any idea why this changes the behavior on the as_ptr.rs test case, which doesn't contain any casts?

Otherwise all tests are producing identical output, with the exception of insertion_sort_driver.rs, which I suspect is another instance of whatever change we're seeing in as_ptr.rs.

@aneksteind
Copy link
Contributor Author

aneksteind commented Apr 10, 2023

@spernsteiner here is the MIR. I think the difference is in how we label the _3 = move _4 as &[i32] (Pointer(Unsize)); cast to return labeled U instead of a labeled X: 57ff503

// WARNING: This output format is intended for human consumers only
// and is subject to change without notice. Knock yourself out.
fn array_as_ptr_load(_1: &[i32; 10]) -> i32 {
    debug x => _1;                       // in scope 0 at [src/lib.rs:5:33: 5:34](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)
    let mut _0: i32;                     // return place in scope 0 at [src/lib.rs:5:51: 5:54](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)
    let _2: *const i32;                  // in scope 0 at [src/lib.rs:7:9: 7:10](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)
    let mut _3: &[i32];                  // in scope 0 at [src/lib.rs:7:13: 7:23](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)
    let mut _4: &[i32; 10];              // in scope 0 at [src/lib.rs:7:13: 7:23](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)
    scope 1 {
        debug p => _2;                   // in scope 1 at [src/lib.rs:7:9: 7:10](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)
    }

    bb0: {
        _4 = _1;                         // scope 0 at [src/lib.rs:7:13: 7:23](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)
        _3 = move _4 as &[i32] (Pointer(Unsize)); // scope 0 at [src/lib.rs:7:13: 7:23](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)
        _2 = core::slice::<impl [i32]>::as_ptr(move _3) -> bb1; // scope 0 at [src/lib.rs:7:13: 7:23](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)
                                         // mir::Constant
                                         // + span: [src/lib.rs:7:15: 7:21](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)
                                         // + literal: Const { ty: for<'a> fn(&'a [i32]) -> *const i32 {core::slice::<impl [i32]>::as_ptr}, val: Value(<ZST>) }
    }

    bb1: {
        _0 = (*_2);                      // scope 1 at [src/lib.rs:8:5: 8:7](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)
        return;                          // scope 0 at [src/lib.rs:9:2: 9:2](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)
    }
}

array_as_ptr_load::{constant#0}: usize = {
    let mut _0: usize;                   // return place in scope 0 at [src/lib.rs:5:43: 5:45](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)

    bb0: {
        _0 = const 10_usize;             // scope 0 at [src/lib.rs:5:43: 5:45](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)
        return;                          // scope 0 at [src/lib.rs:5:43: 5:45](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021#)
    }
}

@aneksteind aneksteind force-pushed the refactor.analysis.cast branch 2 times, most recently from fdef6d0 to c400932 Compare April 10, 2023 19:48
@spernsteiner
Copy link
Collaborator

spernsteiner commented Apr 11, 2023

I think there is actually a bug in this as_ptr.rs example. Specifically, it seems like we're missing a dataflow edge somewhere between _4 and _3.

Both this branch and master give _3 offset permissions:

_3 (37: x.as_ptr()): addr_of = UNIQUE, type = READ | UNIQUE | OFFSET_ADD | OFFSET_SUB#&[i32][(empty)#[i32][(empty)#i32[]]]

However, master gives the same permissions to _4, while this branch gives no permissions:

-_4 (37: x.as_ptr()): addr_of = UNIQUE, type = READ | UNIQUE | OFFSET_ADD | OFFSET_SUB#&[i32; 10][(empty)#[i32; 10][(empty)#i32[]]]
+_4 (36: x.as_ptr()): addr_of = UNIQUE, type = UNIQUE#&[i32; 10][(empty)#[i32; 10][(empty)#i32[]]]

This branch similarly gives no permissions to _1, but since _1 only flows into _4, the lack of permissions on _4 seems like the real problem.

@aneksteind
Copy link
Contributor Author

Ah yes, I see that the way X was being constructed before included using the LTy of op and its args to construct LTy of X. So now that connection's been severed. I'll look for a way to maintain those permissions, but I think we risk running into the same parity-checking in terms of structure shape as before.

@aneksteind
Copy link
Contributor Author

aneksteind commented Apr 11, 2023

@spernsteiner I put the old functionality back for now. I'm thinking about how to avoid combining T and U into X but am unsure how to do that in general when the types are not of the same structure. Seems like we have to do it by hand on a case-by-case basis except where the pointee types are the same. Also, I think the rewriting bug I thought this original solution had fixed (but didn't) is because of the propagation of the offset permission not taking into account that the original type is already an array type that can be offset (so it rewrites it with an extra []). Maybe the propagation upward should only go as far as the as_ptr call. WDYT?

@spernsteiner
Copy link
Collaborator

I think it would be cleaner and more in line with the purpose of this branch to revert that last commit (so the rvalue_tys entry for a cast is always just the labeled target type) and instead implement the equivalent of the old rules by setting up appropriate dataflow edges inside dataflow::type_check. For the Unsize in this example, with an operand type of & /*p0*/ [T0; n] and an rvalue type of & /*p1*/ [T1], it should record an edge from p0 to p1 (that is, make p1's permissions a subset of p0's) and assert that T0 is equivalent to T1. This is similar to what we normally do for assignments (like an assignment from & /*p2*/ T2 to & /*p3*/ T3), but here one side's pointee type is wrapped in TyKind::Array while the other's uses TyKind::Slice (and also we're doing this "assignment" between the operand type and the expected/rvalue_tys type of a single rvalue, rather than between the LHS and RHS of a StatementKind::Assign).

@spernsteiner
Copy link
Collaborator

Forgot to mention - since this branch is meant to be purely a refactor, it's probably better to leave fixes like the &[[..]] thing to a separate branch.

@aneksteind
Copy link
Contributor Author

@spernsteiner reverted the last commit and setup dataflow rules in 2807326. I'm fine saving fixes for another PR but am still curious if you agree whether or not I have the right idea regarding the cause.

@aneksteind
Copy link
Contributor Author

@spernsteiner would you mind taking another look at this?

@spernsteiner
Copy link
Collaborator

I think the rewriting bug I thought this original solution had fixed (but didn't) is because of the propagation of the offset permission not taking into account that the original type is already an array type that can be offset (so it rewrites it with an extra []). Maybe the propagation upward should only go as far as the as_ptr call. WDYT?

Sounds plausible to me. And I think your proposed fix sounds good - the *mut T output of as_ptr might be OFFSET even when the *mut [T; N] input is not. So as_ptr should not propagate OFFSET from its output to its input.

@aneksteind
Copy link
Contributor Author

Sounds plausible to me. And I think your proposed fix sounds good - the *mut T output of as_ptr might be OFFSET even when the *mut [T; N] input is not. So as_ptr should not propagate OFFSET from its output to its input.

Thanks @spernsteiner, I will look into this and add the fix in another PR

As per the discussion [here](#839 (comment)),
type_of_rvalue was combining portions of types T and U and returning labeled type X. This
forced some similarity checking in the types that ideally should be placed in a
yet-to-exist is-castable-to check. This commit labels the type U and makes that the LabeledTy
for the RValue of each cast.
…guments from T and types of U"

This reverts commit 2bbf77e.
@aneksteind aneksteind force-pushed the refactor.analysis.cast branch from 8473be0 to 8560dda Compare April 18, 2023 16:25
Copy link
Contributor

@kkysen kkysen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now. As I commented earlier, there are some parts where I'm not sure it'll be right once #839 and #841 land, but I'll just fix them in those PRs.

Also, could you just update the PR title to reflect the description? Thanks!

@aneksteind aneksteind changed the title Refactor.analysis.cast Refactor Dataflow Analysis of Casts Apr 21, 2023
@aneksteind aneksteind merged commit 73d9b48 into master Apr 21, 2023
@aneksteind aneksteind deleted the refactor.analysis.cast branch April 21, 2023 13:56
kkysen added a commit that referenced this pull request Apr 26, 2023
#900)

This cleans up and simplifies some of the code from #883 in preparation
for the string cast PRs, #739 and #741.

It extracts things into a separate `visit_cast` method, simplifies a few
things, and clarifies exactly which `Ty`s and `LTy`s are from and to in
a cast. This makes it easier to understand and read, and makes it
simpler to rebase #739 onto it.
kkysen added a commit that referenced this pull request May 1, 2023
…cker::do_unify` (back to strict equality) as #883 resolved this (#839 (comment)).
kkysen added a commit that referenced this pull request May 1, 2023
…cker::do_unify` (back to strict equality) as #883 resolved this (#839 (comment)).
kkysen added a commit that referenced this pull request May 1, 2023
…#883.  It's not complete, but fixes the crash in `as_ptr.rs`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

c2rust-analyze: use rvalue_tys for all casts
3 participants