Skip to content

Refactor to allow partial views #245

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 19 commits into from
Nov 23, 2020
Merged

Refactor to allow partial views #245

merged 19 commits into from
Nov 23, 2020

Conversation

vbarrielle
Copy link
Collaborator

This is a refactoring that should fix #244. This is very much WIP, but the basic test suite is passing. @mulimoen I'd be very interested in your feedback.

This introduces lots of breaking changes, for instance lots of iterators are now using an impl Iterator pattern instead of an explicit type.

There are still a few rough edges, which may not be addressed in this PR but should be addressed before 0.10 is released:

  • middle_outer_views is cumbersome to use, it probably should take a start and end parameter (or a range) to be more convenient and less error prone.
  • the IndPtr type is not used to build owned indptr types, but it could be interesting to use it to enforce good patterns. As it would not be practical to have runtime checks, this would be a crate-only API
  • smmp was quite hard to port, mostly because it does not take a CsMatViewI in its API. I probably should change that before 0.10.

As suggested by @mulimoen
(#244 (comment)), it's
doable to have a concrete type abstracting the classic indptr case with
the first element at 0, and the sliced matrix case, by simply performing
the subtraction everywhere.
This API will be useful to interface with C APIs, where we need to pass
a proper indptr.
This way iterating over indices and data should become ergonomic (though
we'll have to clone the range).
@mulimoen
Copy link
Collaborator

mulimoen commented Nov 14, 2020

@vbarrielle I really like this addition! The methods for iterating are really quite nice, maybe we should implement IntoIterator for IndPtrBase which forwards to iter_outer_sz?

The only problem is of course the breaking change introduced. I'll go through some of the crates using sprs and see what we can do to minimize the impact to other crates. One cheap change is

diff --git a/src/sparse/indptr.rs b/src/sparse/indptr.rs
index 3b4b0f7..7320a30 100644
--- a/src/sparse/indptr.rs
+++ b/src/sparse/indptr.rs
@@ -358,6 +358,18 @@ impl<'a, Iptr: SpIndex> IndPtrView<'a, Iptr> {
     }
 }
 
+/// Allows comparison to vectors and slices
+impl<Iptr: SpIndex, IptrStorage, IptrStorage2> std::cmp::PartialEq<IptrStorage2>
+    for IndPtrBase<Iptr, IptrStorage>
+where
+    IptrStorage: Deref<Target = [Iptr]>,
+    IptrStorage2: Deref<Target = [Iptr]>,
+{
+    fn eq(&self, other: &IptrStorage2) -> bool {
+        self.raw_storage() == other.deref()
+    }
+}
+
 #[cfg(test)]
 mod tests {
     use super::{IndPtr, IndPtrView};
@@ -476,4 +488,13 @@ mod tests {
         assert_eq!(iter.next().unwrap(), 2);
         assert!(iter.next().is_none());
     }
+
+    #[test]
+    fn compare_with_slices() {
+        let iptr = IndPtrView::new(&[0, 1, 3, 8]).unwrap();
+        assert!(iptr == &[0, 1, 3, 8][..]);
+        assert!(iptr == vec![0, 1, 3, 8]);
+        let iptr = IndPtrView::new(&[1, 1, 3, 8]).unwrap();
+        assert!(iptr == &[1, 1, 3, 8][..]);
+    }
 }

which makes the comparisons in triplet.rs simpler (although they require a slice coercion).

I'll fixup the serde failures shortly to see what clippy wants
Edit: See #246 for a PR against this branch

@vbarrielle
Copy link
Collaborator Author

@mulimoen thanks for the serde fixes. I've fixed the other crates in the workspace as well. Your PartialEq implementation looks nice, I'm on mobile now so I cannot see easily the impact on triplet.rs or dependent crates though. I'll look into it tomorrow (except if you'd rather submit another pr against this branch of course).

I do hope there's not too much breakage in dependent crates, I think it affects most code that wanted to deal with the inner workings of the sparse structure, which should not be that common. We'll see.

@mulimoen
Copy link
Collaborator

I had a look at which crates would be affected by these changes. The dependent crates listed at crates.io does not use the indptr call. A search through github revealed two crates which might experience difficulties:

add a PartialEq for indptr and slices
@vbarrielle
Copy link
Collaborator Author

vbarrielle commented Nov 18, 2020

Thanks @mulimoen for your impact investigations. Looks like vtext and mkl-sys will have to use as_slice, as it seems hard to use a partial view through ffi. Speaking of which, I think the ffi changes I've made in sprs-benches are probably wrong if a partial view is passed...

Edit: I'm a bit sleepy right now, but I think to_proper works actually...

Apparently formatting has changed a tiny bit...
@vbarrielle
Copy link
Collaborator Author

I've added 51606cf which makes it easier to pass a sparse matrix to C interfaces. I think vtext and mkl-sys will be able to use the new to_proper_ffi to upgrade.

I think this PR is mostly done, what do you think @mulimoen ? Do you see something missing? If not, do your prefer #237 to be merged before this one?

@mulimoen
Copy link
Collaborator

I believe 51606cf is a bit unfortunate, any changes made to the returned Cow will possibly invalidate the pointer. I think we can add a warning on to_proper warning about a potential use-after-free, with a reference to CString::as_ptr() and how this must be handled. I've opened an issue for clippy to catch such mistakes (rust-lang/rust-clippy#6349).

I would prefer to fixup #237 after this, I'm missing an IndPtrShadow struct for correctness for deserialization of an IndPtr.

@vbarrielle
Copy link
Collaborator Author

Why not put the warning on to_proper_ffi? I found the correct usage of to_proper for ffi to be quite cumbersome, while to_proper_ffi meant having only one line. You're right though that it would be nice to have a way to prevent dropping the Cow. I'm just not sure to_proper_ffi is more prone to safety issues.

@vbarrielle
Copy link
Collaborator Author

Ok I've thought about it again, and you're right: there's nothing gained by using to_proper_ffi. It's indeed better to have a warning in the documentation of to_proper, and an example showing the correct usage.

Now I'm wondering if there's a way to prevent this kind of issue, eg using Pin, but I fear it's not possible.

@mulimoen
Copy link
Collaborator

You would end up with the same problem if you used to_ptr_ffi().1. I don't think you can work around this, as we have CString with the same problem (although this has a lint)

@vbarrielle
Copy link
Collaborator Author

I removed IndPtrBase::to_proper_ffi and introduced CsMat::proper_indptr, which simply forwards to IndPtrBase::to_proper but avoids the inconvenient pattern I had before where one had to keep around let indptr = mat.indptr() and let proper = indptr.to_proper() to get correct lifetimes. Changing that also showed that I was in fact already misusing to_proper_ffi, as I was not keeping the mat.indptr() around I had an invalid pointer. So this further shows to_proper_ffi was more dangerous. Thanks @mulimoen for pointing it out!

This API is safer than the previously introduced
IndPtrBase::to_proper_ffi, which could very easily lead to use after
free in unsafe code, as pointed out by @mulimoen. This new API still can
be misused, but that's a common pitfall for methods returning Cow. The
documentation thus warns against this pitfall, and illustrates the
proper usage.
@vbarrielle vbarrielle changed the title WIP: Refactor to allow partial views Refactor to allow partial views Nov 22, 2020
@vbarrielle vbarrielle merged commit 7297305 into master Nov 23, 2020
@vbarrielle vbarrielle added this to the 0.10 milestone Dec 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate possible issues with partial views
2 participants