-
Notifications
You must be signed in to change notification settings - Fork 11
Add GetBlobRaw #83
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add GetBlobRaw #83
Conversation
This is used by https://github.com/cgwalters/cstor-dist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some initial comments.
I have a bit of a hard time understanding everything that's going on here. It might benefit from some splitting up with descriptive commit messages for each step along the way...
9ebf000
to
a118c05
Compare
5e2deb5
to
89695c5
Compare
I've gone and made a mess out of the commit ordering now, of course. We can sort that out tomorrow.... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pushed another commit on top of this that I think is closer to the best-of-both of what we were each trying to do. But I don't feel super strongly about it again, so feel free to push whatever else, or if you really want revert the top commit.
It was an interesting learning exercise at least.
Ya! This was a fun one! Your most recent commit resembles my first attempt. The reason I walked away from it was just a sort of "wow, this is a lot of code..." sort of feeling. My approach is definitely getting close to the "magic" line, and maybe stepping over it a bit. Yours is very explicit, but you also pay a price for that in terms of the "lots of code" aspect:
I also think that we're not really losing any actual safety with my approach: the methods calling the
and that knowledge is encoded in the methods themselves. If we were to add a new sort of method in the future that wanted some other configuration of fds and pipe ids, we'd be able to do that without adding a new struct variant for that. Your approach encodes the existing common patterns into structs that the methods can use. One of the strongest arguments for your approach, I think is that there are indeed some commonalities between APIs (the most established one being "read from this fd and then call FinishPipe") which can then be handled in a nicer way. In fact, it's even tempting to make this pattern a method on your FinishPipe struct, except that it also needs access to the socket to make the call... I'm also pretty neutral on this. I prefer my approach (maybe with a few tweaks), but your most recent push is already worlds better than the first version of this patch and has an advantage or two over my approach (cleaner handling of the common "read+finish" case, less magic). I'd also be happy to go with it. Maybe we should let @jeckersb be the tie-breaker? |
fwiw, I'd make at least this change to my code: @@ -347,9 +347,7 @@ fn fixed_from_iterable<T, const N: usize>(
iterable: impl IntoIterator<IntoIter: FusedIterator, Item = T>,
) -> Result<[T; N]> {
let mut iter = iterable.into_iter();
- // We make use of the fact that [_; N].map() returns [_; N]. That makes this a bit more
- // awkward than it would otherwise be, but it's not too bad.
- let collected = [(); N].map(|_| iter.next());
+ let collected = std::array::from_fn(|_| iter.next());
// Count the Some() in `collected` plus leftovers in the iter.
let actual = collected.iter().flatten().count() + iter.count();
if actual == N { or maybe even rewrite it like this (at the cost of an allocation): // Consumes an iterable and tries to convert it to a fixed-size array. Returns Ok([T; N]) if the
// number of items in the iterable was correct, else an error describing the mismatch.
fn fixed_from_iterable<T, const N: usize>(
name: &str,
iterable: impl IntoIterator<IntoIter: FusedIterator, Item = T>,
) -> Result<[T; N]> {
let vec = iterable.into_iter().collect::<Vec<T>>();
vec.try_into().map_err(|vec: Vec<T>| {
Error::Other(format!("Expected {N} {name} but got {}", vec.len()).into())
})
} in nightly we could do: let mut iter = iterable.into_iter();
std::array::try_from_fn(|_| iter.next()).ok_or(Error::Other(
format!("Expected {N} {name} but got fewer").into(),
)) with the loss of knowledge of the exact number of items that were present in the failing case (only important for the error message) and it would also require an explicit check on the iter being empty at the end... really, we just need |
I asked Gemini and it pointed me in the direction of https://docs.rs/itertools/latest/itertools/trait.Itertools.html#method.collect_array which I think we should probably use instead of rolling our own (though I used What do you think about that latest commit? |
e5333c2
to
33c7dc1
Compare
Wait no it's not, for some reason I forgot it's as easy to destructure an array as a tuple. Back to using an array. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm loving this iteration. We're approaching something really nice, in my opinion. In particular, I think the array destructuring approach makes your code a lot nicer.
I don't like taking depends, and itertools isn't in our transitive dependencies yet. We could probably just directly implement the one thing we need, but this thing has like 6000+ rdeps so it's probably okay...
The _nofds variant is nice as well, although I'd maybe flip the naming on its head: request_impl -> request_with_fds(), and the other one just .request()?
For clarity (since I'm probably gonna disappear for the weekend now): I'm happy with the direction here and I'm happy to land whatever variant you guys manage to cook up. :) |
This is needed for my "registry frontend for containers-storage" project, but should also be useful in general as we've discussed moving checksum verification into callers. Then there's no need for a driver etc. Assisted-by: Gemini Code Assist (for unit tests) Signed-off-by: Colin Walters <[email protected]> Signed-off-by: Allison Karlitskaya <[email protected]> Signed-off-by: Colin Walters <[email protected]>
These can be expressed with `impl Trait` argument syntax, making things a bit easier to read. Signed-off-by: Allison Karlitskaya <[email protected]> Signed-off-by: Colin Walters <[email protected]>
It's always nicer to directly hint the return type or let it be inferred by its surroundings. There's only one case where we need to be a bit more explicit about it, and even this explicitness is nicer. Signed-off-by: Allison Karlitskaya <[email protected]> Signed-off-by: Colin Walters <[email protected]>
...and the presence of a pipe ID. Create a new internal PipeId structure to keep track of pipe ids on the server. This is a very small wrapper around NonZeroU32. Have .impl_request() return a Result<> of a 3-tuple: - a deserialized result, as before - a fixed-sized array of fds ([OwnedFd; N]) - a fixed-sized array of PipeIds ([PipeId; M]) Introduce a bit of generics magic to make sure that we got the expected number of fds and pipeids from the other side, or an error. Then you can make calls like: - `let ((), [], []) = self.impl_request("CloseImage", [img.0]).await?;` - `let (digest, [datafd], [pipeid]) = self.impl_request("GetManifest", [img.0]).await?;` - `let (bloblen, [datafd, errfd], []) = self.impl_request("GetRawBlob", args).await?;` and you'll get an Ok() with the statically-correct number of fds and pipeids, or you'll get an Err() with a message why, like: Expected 1 OwnedFd but got 0 Of course, we can only ever have 0 or 1 PipeIds, but the fixed-sized array syntax has rather nice ergonomics here and its lets us share logic with the fds. Signed-off-by: Allison Karlitskaya <[email protected]> Signed-off-by: Colin Walters <[email protected]>
Instead of a generic array, use a trait on different structs. This ensures even more type safety for callers. Signed-off-by: Colin Walters <[email protected]>
Signed-off-by: Colin Walters <[email protected]>
Most requests don't have fds, so let's optimize it a bit and avoid the callers needing to have `()` spelled out. Signed-off-by: Colin Walters <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Adds a new get_raw_blob
API to fetch blob data and its error stream without checksum verification, refactors request handling to generically support file-descriptor replies, and updates the example client to exercise the raw‐blob path.
- Introduce
FromReplyFds
trait and two implementations (FinishPipe
,DualFds
) to parse returned FDs - Refactor
impl_request*
methods to return typed FDs and pipe IDs - Add
ImageProxy::get_raw_blob
and corresponding error type + tests - Update example CLI to support
GetBlobRaw
and the--raw-blobs
flag - Bump
itertools
dependency
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
File | Description |
---|---|
src/imageproxy.rs | Core FD‐aware request refactor, get_raw_blob API |
examples/client.rs | New GetBlobRaw command, raw-blob demo implementation |
Cargo.toml | Added itertools for array utilities |
Comments suppressed due to low confidence (4)
src/imageproxy.rs:77
- [nitpick] The
Other
variant’s display string is just"error"
, which is too generic—consider including the inner message with#[error("other: {0}")]
or similar to surface the cause.
#[error("error")]
src/imageproxy.rs:688
- [nitpick] There aren’t any unit or integration tests covering the successful end-to-end path of
get_raw_blob
; consider adding a mock-based test to verify you get back the correct length, data FD, and error behavior.
pub async fn get_raw_blob(
src/imageproxy.rs:364
- The syntax
impl IntoIterator<IntoIter: FusedIterator, Item = OwnedFd>
is invalid; it should beimpl IntoIterator<Item = OwnedFd, IntoIter: FusedIterator>
to constrain the associated iterator type correctly.
fn from_reply(
iterable: impl IntoIterator<IntoIter: FusedIterator, Item = OwnedFd>,
examples/client.rs:109
tokio::fs::File
does not haveread
in scope by default; adduse tokio::io::AsyncReadExt;
at the top soread(&mut buffer)
is available.
let n = datafd.read(&mut buffer).await?;
This is needed for my "registry frontend for containers-storage" project, but should also be useful in general as we've discussed moving checksum verification into callers. Then there's no need for a driver etc.
Depends: containers/skopeo#2601