added predictors #86
Conversation
Force-pushed from 19f36f8 to e0a80af
src/predictor.rs (Outdated)
```rust
    predictor_info: &PredictorInfo,
    tile_x: u32,
    tile_y: u32,
) -> AsyncTiffResult<Bytes>; // having this Bytes will give alignment issues later on
```
Why? `Bytes` is pretty much just `Arc<Vec<u8>>`.
So if:
- the tiff is something with alignment > 1, e.g. `f32`
- the global allocator gives out a misaligned `Vec` (which it doesn't often do)

then the user has two options:
- copy the data over into a `Vec` using `f32::from_ne_bytes()`
- use bytemuck and hope for the best

I looked into this quite deeply, and afaik most "standard" allocators allocate with alignment 8; it would just save a copy in my mind?
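The two options could be sketched like this (hypothetical std-only helpers; `bytes_to_f32_view` stands in for the bytemuck route, and only succeeds when the allocation happens to be aligned):

```rust
/// Option 1, always safe: copy each 4-byte group out with f32::from_ne_bytes.
fn bytes_to_f32_copy(bytes: &[u8]) -> Vec<f32> {
    bytes
        .chunks_exact(4)
        .map(|c| f32::from_ne_bytes(c.try_into().unwrap()))
        .collect()
}

/// Option 2, zero-copy, but it only works when the buffer happens to be
/// 4-byte aligned: align_to reports a non-empty head/tail otherwise.
fn bytes_to_f32_view(bytes: &[u8]) -> Option<&[f32]> {
    // SAFETY: every initialized bit pattern is a valid f32, so
    // reinterpreting aligned initialized bytes is sound.
    let (head, body, tail) = unsafe { bytes.align_to::<f32>() };
    (head.is_empty() && tail.is_empty()).then_some(body)
}
```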
I think instead we should discuss: at what points we should be storing `Vec<u8>` and at what points structured array types.
I think to add to my previous comment: the question is where we convert out of `Bytes`. I think the core networking trait (`AsyncFileReader`) should remain as it is, where `get_bytes` returns `bytes::Bytes`. Most networking code, at least `reqwest` and `object_store`, returns buffers as `Bytes`.

There's no way to convert from a `Bytes` to a `Vec<u8>` zero-copy. (You can sometimes convert from a `Bytes` to a `BytesMut` zero-copy.) So that implies at some point we make a data copy from a `Bytes` into a `Vec<T>` in order to be safe. Or we could use similar code as the Arrow project and build typed interfaces on top of an `Arc<Bytes>`, such as their `Buffer` type.
> I think to add to my previous comment: the question is where we convert out of `Bytes`. I think the core networking trait (`AsyncFileReader`) should remain as it is, where `get_bytes` returns `bytes::Bytes`. Most networking code, at least `reqwest` and `object_store`, returns buffers as `Bytes`.
I'm all up for not changing `AsyncFileReader`. I would say after the decompression step is where we can have typed arrays, since decompression itself is not zero-copy (except for no compression). In contrast to `reqwest` and `object_store`, we actually know the underlying datatype.

Then for the endianness fixing and the horizontal predictor, it is also nice to have exclusive access to the underlying buffer (`&mut [u8]`), since that is zero-copy (not the float predictor). Passing typed arrays over to the predictor doesn't make much sense there imho either, since the operations don't differentiate between e.g. `f64` and `u64` and it's already quite complex as-is.
so I thought something like:

```rust
// pseudocode
impl Tile {
    fn decode(&self, decoder: &Decoder) -> AsyncTiffResult<DecodingResult> {
        // tiff2's DecodingResult
        let mut res: DecodingResult = todo!(); // smart buffer sizing
        decoder.decode(&self.compressed_bytes, res.buf_mut()); // bytemuck casting
        match self.predictor_info.predictor {
            // mutates in-place
            Predictor::Horizontal => unpredict_horizontal(res.buf_mut(), &self.predictor_info, self.x),
            // also mutates in-place, but uses a copy
            Predictor::Float => unpredict_float(res.buf_mut(), &self.predictor_info, self.x, self.y),
        }
        Ok(res)
    }
}
```
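For concreteness, the in-place horizontal pass could look something like this (a minimal std-only sketch, assuming 8-bit samples; `unpredict_hdiff_u8` is a hypothetical helper, not the crate's API). Each stored sample is a delta from the sample `samples_per_pixel` positions earlier in the row:

```rust
/// In-place horizontal un-prediction for 8-bit samples: accumulate the
/// per-channel deltas with wrapping addition, one row at a time.
fn unpredict_hdiff_u8(row: &mut [u8], samples_per_pixel: usize) {
    for i in samples_per_pixel..row.len() {
        row[i] = row[i].wrapping_add(row[i - samples_per_pixel]);
    }
}
```

Note that this is exactly the kind of operation that wants `&mut [u8]` rather than a typed array: it only cares about the byte width of a sample, not its type.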
But maybe put this in a separate issue/PR? I didn't put it in here, because this PR was already medium-large and "my ideal situation^" would change some unrelated things.
> There's no way to convert from a `Bytes` to a `Vec<u8>` zero-copy. (You can sometimes convert from a `Bytes` to a `BytesMut` zero-copy.) So that implies at some point we make a data copy from a `Bytes` into a `Vec<T>` in order to be safe. Or we could use similar code as the Arrow project and build typed interfaces on top of an `Arc<Bytes>`, such as their `Buffer` type.
In the current (this PR) implementation, I already did a `BytesMut::from()` quite a few times, even though we were still in the same decoding step.
Sidenote: there is also some discussion going on over at image-tiff, where they want to just output a `Bytes` or `&[u8]`, but there the alignment issue was also raised and somewhat ignored.
but... separate issue/PR? I think this discussion is more about optimization/API than about predictors? (even though they overlap quite a bit, it could also be here?) #87
src/tile.rs (Outdated)

```rust
/// The number of chunks in the horizontal (x) direction
pub fn chunks_across(&self) -> u32 {
```
To add to my previous comment on removing the `PredictorInfo` struct: both `chunks_across` and `chunks_down` are useful outside of handling predictors. It would make sense to include these on the `IFD` struct (ex. `IFD.chunks_across`).
Even if we keep the `PredictorInfo`, as I think we probably should, we can put this functionality in a shared `TileMath` trait which both the `IFD` and `PredictorInfo` implement.
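The shared chunk math in question is mostly ceiling division; a minimal sketch (hypothetical free functions, not the crate's API, assuming the image dimensions and chunk dimensions in pixels):

```rust
/// Number of chunks in the horizontal (x) direction: ceiling division of
/// image width by chunk width, so a partial trailing chunk still counts.
fn chunks_across(image_width: u32, chunk_width: u32) -> u32 {
    image_width.div_ceil(chunk_width)
}

/// Same for the vertical (y) direction.
fn chunks_down(image_height: u32, chunk_height: u32) -> u32 {
    image_height.div_ceil(chunk_height)
}
```

A shared trait would let both `IFD` and `PredictorInfo` expose these without duplicating the arithmetic.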
Also, having `PredictorInfo` separate made tests easier than having to build a full IFD with partially unused required data.

One other option still is to add all `PredictorInfo` attributes to `Tile` and then pass a `&Tile` into the `unpredict_...` functions for a somewhat flatter struct layout. Then implement the `TileMath` trait for `Tile` and `IFD`, which makes a lot of sense API-wise.
I made a PR with suggested changes into this branch. See feefladder#1

temporary sadness: "Ah noo... I just finished incorporating feedback :'("
src/predictor.rs (Outdated)

```rust
) -> AsyncTiffResult<Bytes> {
    let output_row_stride = predictor_info.output_row_stride(tile_x)?;
    let mut res: BytesMut =
        BytesMut::zeroed(output_row_stride * predictor_info.output_rows(tile_y)?);
```
Here I now did an `output_rows`, which is larger if we have the `Planar` planar config. I thought then at least we can give the user the data and they can split it themselves?
So now the `Planar` output will be:

```
[
    Red...,
    Green.,
    Blue..,
]
```

where each band is `chunk_width_pixels() * chunk_height_pixels()`
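Indexing into that band-separated layout can be sketched as (hypothetical helper; `width`/`height` are the chunk dimensions in pixels):

```rust
/// Flat index of sample (band, row, col) in a planar chunk: bands are
/// concatenated, each occupying width * height contiguous samples.
fn planar_index(band: usize, row: usize, col: usize, width: usize, height: usize) -> usize {
    band * width * height + row * width + col
}
```

So for a 4x2 RGB chunk, the green band starts at offset 8, which is what lets the user split the buffer into bands themselves.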
I thought this made sense, because the decoder (decompression) step decodes the entire data, which is structured like this:
```rust
/// The number of rows the output has, taking padding and PlanarConfiguration into account.
fn output_rows(&self, y: u32) -> AsyncTiffResult<usize> {
    match self.planar_configuration {
        PlanarConfiguration::Chunky => Ok(self.chunk_height_pixels(y)? as usize),
        PlanarConfiguration::Planar => {
            Ok((self.chunk_height_pixels(y)? as usize)
                .saturating_mul(self.samples_per_pixel as _))
        }
    }
}

fn bits_per_pixel(&self) -> usize {
    match self.planar_configuration {
        PlanarConfiguration::Chunky => {
            self.bits_per_sample as usize * self.samples_per_pixel as usize
        }
        PlanarConfiguration::Planar => self.bits_per_sample as usize,
    }
}
```
These depend on `planar_config`.
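A std-only standalone model of the two methods, to make the `Chunky` vs `Planar` difference concrete (assumed field names mirror `PredictorInfo`; illustrative only, with the error handling and per-chunk padding dropped):

```rust
enum PlanarConfiguration {
    Chunky,
    Planar,
}

struct Info {
    planar_configuration: PlanarConfiguration,
    samples_per_pixel: u16,
    bits_per_sample: u16,
    chunk_height: u32,
}

impl Info {
    /// Planar output stacks one band per chunk-height, so it has
    /// samples_per_pixel times as many rows.
    fn output_rows(&self) -> usize {
        match self.planar_configuration {
            PlanarConfiguration::Chunky => self.chunk_height as usize,
            PlanarConfiguration::Planar => {
                (self.chunk_height as usize) * self.samples_per_pixel as usize
            }
        }
    }

    /// Conversely, a planar row only holds one sample per pixel.
    fn bits_per_pixel(&self) -> usize {
        match self.planar_configuration {
            PlanarConfiguration::Chunky => {
                self.bits_per_sample as usize * self.samples_per_pixel as usize
            }
            PlanarConfiguration::Planar => self.bits_per_sample as usize,
        }
    }
}
```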
Thanks @kylebarron for the changes!
```rust
#[cfg(target_endian = "little")]
if let Endianness::LittleEndian = byte_order {
    return buffer;
}
#[cfg(target_endian = "big")]
if let Endianness::BigEndian = byte_order {
    return buffer;
}
```
I thought splitting the cfg no-op out up here was a bit clearer; before, the cfgs were mixed inside the match statement.
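When neither early-return fires (file and host byte order differ), the remaining work is a per-sample byte swap. A minimal sketch of that fallback (`swap_bytes_in_place` is a hypothetical stand-in for the body of `fix_endianness`, with the sample width in bytes rather than bits):

```rust
/// Reverse the bytes of every sample in place; chunks_exact_mut skips any
/// trailing bytes that don't form a whole sample.
fn swap_bytes_in_place(buf: &mut [u8], bytes_per_sample: usize) {
    for sample in buf.chunks_exact_mut(bytes_per_sample) {
        sample.reverse();
    }
}
```

This is another operation that works on `&mut [u8]` without caring whether the samples are `u16`, `f32`, etc.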
```rust
Predictor::None => Ok(fix_endianness(
    decoded_tile,
    self.predictor_info.endianness(),
    self.predictor_info.bits_per_sample(),
)),
Predictor::Horizontal => {
    unpredict_hdiff(decoded_tile, &self.predictor_info, self.x as _)
}
Predictor::FloatingPoint => {
    unpredict_float(decoded_tile, &self.predictor_info, self.x as _, self.y as _)
}
```
I removed the trait and only have crate-public functions now
This should also help with fewer copies, as mocked up in #87, since the float predictor doesn't do in-place modification and a shared trait doesn't allow that differentiation.
resolved:
Added predictors and tests.

Since floating point predictors shuffle horizontal padding into the output, quite a lot more information is needed, so I made a public `PredictorInfo` struct with public methods that give tile/chunk info.

Summary:
- `unpredict_float/hdiff` functions: first do the horizontal differencing (on bytes) and then fix endianness together with the shuffling; always order bytes BE, like the spec pdf
- `Predictor::None` -> use `fix_endianness`
- `Predictor::Horizontal` -> horizontal prediction and endianness, inside `unpredict_hdiff`
- `Predictor::Float` -> floating point prediction and endianness, based on this comment
- `from_tags` function
- `PredictorInfo` struct, inspired by tiff2
- `PlanarConfiguration`, even though no decoding actually supports `Planar`, only `Chunky`
- `SampleFormat`, even though we don't test for them in predictors
- `Planar`, except if `bits_per_sample` is non-homogeneous
- `chunk_width=image_width`
- in `image-tiff` and `tiff2`, strips and tiles are kept separate, where the end result is that the same calculations are done through different functions with different implementations
- tiff2, but realized it too late
- `PredictorInfo` on non-tiled tiff

Some notes:
- `&mut [u8]` function input: the `&mut [u8]` would be preferred by me, because then it can be directly read into a user-provided buffer that has also ensured the alignment of the buffer (e.g. initializing a `Vec<f32>` buffer and then bytemucking to `&mut [u8]`)
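The user-provided aligned buffer idea from that last note can be sketched std-only (`as_byte_slice_mut` is a hypothetical stand-in for `bytemuck::cast_slice_mut`): allocate a `Vec<f32>`, which guarantees 4-byte alignment, and hand its storage to the decoder as `&mut [u8]`.

```rust
/// View a f32 slice as mutable bytes so a decoder can fill it directly.
fn as_byte_slice_mut(samples: &mut [f32]) -> &mut [u8] {
    // SAFETY: a f32-aligned pointer is always u8-aligned, the lengths
    // match (4 bytes per f32), and any byte pattern is a valid f32.
    unsafe {
        std::slice::from_raw_parts_mut(samples.as_mut_ptr() as *mut u8, samples.len() * 4)
    }
}
```

After the decode, the caller just keeps using the `Vec<f32>`, with no copy and no alignment gamble.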