added predictors #86

Open · wants to merge 20 commits into main from the predictors branch

Conversation

@feefladder (Contributor) commented Apr 2, 2025

Added predictors and tests.

Since the floating-point predictor shuffles horizontal padding into the output, quite a lot more information is needed, so I made a public PredictorInfo struct with public methods that give tile/chunk info.

Summary:

  • Added crate-public unpredict_float/unpredict_hdiff functions
    • Endianness fixing lives together with the predictors: the horizontal predictor fixes endianness first and then undoes the differencing, whereas the floating-point predictor first undoes the horizontal differencing (on bytes) and then fixes endianness together with the un-shuffling, because the shuffled bytes are always ordered big-endian as in the spec PDF. (A rough sketch of the float case follows after this summary.)
    • Predictor::None -> just use fix_endianness
    • Predictor::Horizontal -> horizontal un-prediction plus endianness fixing, inside unpredict_hdiff
    • Predictor::Float -> floating-point un-prediction plus endianness fixing, based on this comment
  • Because the endianness is needed information, it is included in the IFD and its from_tags function.
  • Added a PredictorInfo struct, inspired by tiff2.
    • the provided functions are expected to be used by users in their own implementations:
      • included PlanarConfiguration, even though no decoding path actually supports Planar yet, only Chunky
      • included SampleFormat, even though we don't check it in the predictors
      • made the functions work for Planar, except when bits_per_sample is non-homogeneous
    • has all needed info to determine:
      • byte width of a chunk row
      • number of samples and their size(s)
  • Made the abstraction that a stripped tiff = a tiled tiff with chunk_width = image_width
    • In image-tiff and tiff2, strips and tiles are kept separate, and the end result is that the same calculations are done through different functions with different implementations.
    • I wanted to make this abstraction earlier in tiff2, but realized it too late.
    • It can easily be undone by erroring when trying to create a PredictorInfo for a non-tiled tiff.
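
For reference, a rough sketch of what the floating-point un-prediction does for a single row (illustrative only, not the PR's actual code; it assumes 4-byte f32 samples and glosses over the horizontal-padding handling that PredictorInfo provides):

/// Rough sketch of the floating-point un-predictor for one row. Illustrative only.
fn unpredict_float_row_sketch(row: &mut [u8], samples_per_row: usize) {
    assert_eq!(row.len(), samples_per_row * 4);
    // 1. undo the byte-wise horizontal differencing (cumulative sum over the row)
    for i in 1..row.len() {
        row[i] = row[i].wrapping_add(row[i - 1]);
    }
    // 2. un-shuffle: byte plane k holds the k-th (most significant first) byte
    //    of every sample in the row
    let planes = row.to_vec();
    for (i, sample) in row.chunks_exact_mut(4).enumerate() {
        for k in 0..4 {
            sample[k] = planes[k * samples_per_row + i];
        }
    }
    // 3. `row` now holds big-endian f32s; a real implementation would convert
    //    them to native endianness here (e.g. via f32::from_be_bytes).
}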

Some notes:

  • I did put in the possibility of multiple sample formats, which ties our lifetime to the tile... I think it would also be perfectly fine to accept only a single value, but then I think it would also make sense to reflect that in the IFD?
  • image-tiff does not support the horizontal predictor for floating-point rasters; GDAL does. Since the two layers are decoupled here, I didn't add any checks and just give the user bogus output if they give bogus input.
  • The output is Bytes, rather than the functions taking a &mut [u8] input. I would prefer &mut [u8], because then the data can be read directly into a user-provided buffer whose alignment is already ensured (e.g. by initializing a Vec<f32> buffer and then bytemucking it to &mut [u8]); see the sketch just below.
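
For illustration, the user-side pattern that a &mut [u8] input would enable looks roughly like this (a sketch only; it assumes the bytemuck crate, and decode_into is a hypothetical method, not part of this PR):

// Sketch: allocate an aligned, typed buffer up front and hand its byte view to
// the decode step, so no copy out of a Bytes is needed afterwards.
fn aligned_target(n_pixels: usize) -> Vec<f32> {
    let mut typed = vec![0.0f32; n_pixels]; // Vec<f32> guarantees 4-byte alignment
    let byte_view: &mut [u8] = bytemuck::cast_slice_mut(&mut typed);
    // hypothetical call, not part of this PR: tile.decode_into(&decoder, byte_view)?;
    let _ = byte_view;
    typed
}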

@feefladder force-pushed the predictors branch 2 times, most recently from 19f36f8 to e0a80af (April 2, 2025, 14:17)
src/predictor.rs Outdated
    predictor_info: &PredictorInfo,
    tile_x: u32,
    tile_y: u32,
) -> AsyncTiffResult<Bytes>; // having this Bytes will give alignment issues later on
Member

Why? Bytes is pretty much just Arc<Vec<u8>>

Contributor Author

So if:

  • the tiff is something with alignment > 1, e.g. f32
  • the global allocator gives out a misaligned Vec (which it doesn't often do)

then the user has two options:

  1. copy the data over into a Vec using f32::from_ne_bytes()
  2. use bytemuck and hope for the best

I looked into this quite deeply, and afaik most "standard" allocators allocate with alignment 8; it would just save a copy, in my mind. (A rough sketch of both options follows below.)
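
A hedged sketch of those two options (not code from this PR; it assumes the bytemuck and bytes crates):

use std::borrow::Cow;
use bytes::Bytes;

// Sketch: zero-copy f32 view when the Bytes happens to be 4-byte aligned,
// explicit copy via f32::from_ne_bytes otherwise.
fn as_f32s(buf: &Bytes) -> Cow<'_, [f32]> {
    // option 2: bytemuck, "hope for the best" — only succeeds if aligned
    if let Ok(view) = bytemuck::try_cast_slice::<u8, f32>(buf) {
        return Cow::Borrowed(view);
    }
    // option 1: copy into a new Vec, always works
    Cow::Owned(
        buf.chunks_exact(4)
            .map(|c| f32::from_ne_bytes(c.try_into().unwrap()))
            .collect(),
    )
}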

Member

I think instead we should discuss: at what points should we be storing Vec<u8>, and at what points should we be storing structured array types?

@kylebarron (Member) commented Apr 2, 2025

I think to add to my previous comment: the question is where we convert out of Bytes. I think the core networking trait (AsyncFileReader) should remain as it is, where get_bytes returns bytes::Bytes. Most networking code, at least reqwest and object_store return buffers as Bytes.

There's no way to convert from a Bytes to a Vec<u8> zero-copy. (You can sometimes convert from a Bytes to BytesMut zero-copy). So that implies at some point we make a data copy from a Bytes into a Vec<T> in order to be safe. Or we could use similar code as from the Arrow project and build typed interfaces on top of an Arc<Bytes>, such as their Buffer type.
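
For reference, a rough illustration of the "sometimes zero-copy" path (this assumes a recent bytes release that provides Bytes::try_into_mut; treat the exact API as an assumption):

use bytes::{Bytes, BytesMut};

// Rough illustration only: reclaim the buffer mutably when we hold the only
// reference to it, otherwise fall back to copying.
fn into_mutable(buf: Bytes) -> BytesMut {
    match buf.try_into_mut() {
        Ok(unique) => unique,                       // zero-copy: sole owner
        Err(shared) => BytesMut::from(&shared[..]), // shared elsewhere: copy
    }
}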

@feefladder (Contributor Author) commented Apr 2, 2025

> I think to add to my previous comment: the question is where we convert out of Bytes. I think the core networking trait (AsyncFileReader) should remain as it is, where get_bytes returns bytes::Bytes. Most networking code, at least reqwest and object_store return buffers as Bytes.

I'm all for not changing AsyncFileReader. I would say the point after the decompression step is where we can have typed arrays, since decompression itself is not zero-copy (except for no compression). In contrast to reqwest and object_store, we actually know the underlying datatype.

Then for the endianness fixing and the horizontal predictor, it is also nice to have exclusive access to the underlying buffer (&mut [u8]), since those are zero-copy (the float predictor is not). Passing typed arrays over to the predictor doesn't make much sense there imho either, since the operations don't differentiate between e.g. f64 and u64, and it's already quite complex as-is.

so I thought something like:

// pseudocode
impl Tile {
    fn decode(&self, decoder: &dyn Decoder) -> AsyncTiffResult<DecodingResult> {
        // tiff2's DecodingResult; smart buffer sizing
        let mut res: DecodingResult = todo!();
        // decompress straight into the typed buffer (bytemuck casting)
        decoder.decode(&self.compressed_bytes, res.buf_mut())?;
        match self.predictor_info.predictor {
            // mutates in place
            Predictor::Horizontal => unpredict_horizontal(res.buf_mut(), &self.predictor_info, self.x)?,
            // also ends up in place, but uses an internal copy
            Predictor::Float => unpredict_float(res.buf_mut(), &self.predictor_info, self.x, self.y)?,
            Predictor::None => {}
        }
        Ok(res)
    }
}

But maybe this should go in a separate issue/PR? I didn't put it in here, because this PR was already medium-large and "my ideal situation^" would change some unrelated things.

> There's no way to convert from a Bytes to a Vec<u8> zero-copy. (You can sometimes convert from a Bytes to BytesMut zero-copy). So that implies at some point we make a data copy from a Bytes into a Vec<T> in order to be safe. Or we could use similar code as from the Arrow project and build typed interfaces on top of an Arc<Bytes>, such as their Buffer type.

In the current implementation (this PR), I already call BytesMut::from() quite a few times, even though we're still in the same decoding step.

Sidenote: there is also some discussion going on over at image-tiff, where they want to just output a Bytes or &[u8]; there, too, the alignment issue was raised and somewhat ignored.

But... separate issue/PR? I think this discussion is more about optimization/API than about predictors (even though they overlap quite a bit, so it could also stay here): #87

src/tile.rs Outdated
}

/// The number of chunks in the horizontal (x) direction
pub fn chunks_across(&self) -> u32 {


To add to my previous comment on removing the PredictorInfo struct: both chunks_across and chunks_down are useful outside of handling predictors. It would make sense to include these on the IFD struct (e.g. IFD.chunks_across).

Member

Even if we keep the PredictorInfo, as I think we probably should, we can put this functionality in a shared TileMath trait which both the IFD and PredictorInfo implement

@feefladder (Contributor Author) commented Apr 3, 2025

Also, having PredictorInfo separate made tests easier than having to build a full IFD with partially unused required data.

One other option would be to add all PredictorInfo attributes to Tile and then pass a &Tile into the unpredict_... functions, for a somewhat flatter struct layout.

Then we'd implement the TileMath trait for Tile and IFD, which makes a lot of sense API-wise. (A rough sketch of such a trait follows below.)
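
A rough sketch of what such a shared trait could look like (method names are assumed for illustration, not taken from this PR):

// Sketch only: shared chunk math that IFD, Tile and PredictorInfo could all expose,
// with strips treated as tiles whose chunk width equals the image width.
trait TileMath {
    fn image_width(&self) -> u32;
    fn image_height(&self) -> u32;
    fn chunk_width(&self) -> u32;  // tile width, or image_width for a stripped tiff
    fn chunk_height(&self) -> u32; // tile height, or rows_per_strip for a stripped tiff

    /// The number of chunks in the horizontal (x) direction
    fn chunks_across(&self) -> u32 {
        self.image_width().div_ceil(self.chunk_width())
    }

    /// The number of chunks in the vertical (y) direction
    fn chunks_down(&self) -> u32 {
        self.image_height().div_ceil(self.chunk_height())
    }
}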

@kylebarron (Member)

I made a PR with suggested changes into this branch. See feefladder#1

@feefladder (Contributor Author) commented Apr 2, 2025

temporary sadness: "Ah noo... I just finished incorporating feedback :'("

src/predictor.rs Outdated
) -> AsyncTiffResult<Bytes> {
    let output_row_stride = predictor_info.output_row_stride(tile_x)?;
    let mut res: BytesMut =
        BytesMut::zeroed(output_row_stride * predictor_info.output_rows(tile_y)?);
Contributor Author

Here I now use output_rows, which is larger when the planar configuration is Planar. I thought that way we can at least give the user the data and they can split it themselves?

Contributor Author

So now the Planar output will be:

[
  Red...,
  Green.,
  Blue..,
]

where each band is chunk_width_pixels() * chunk_height_pixels() (see the splitting sketch below)
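
A hedged sketch of how a user might split that planar output back into per-band slices (parameter names are illustrative, not this PR's API):

/// Illustrative only: split a decoded planar chunk into per-band byte slices.
/// band_len is in bytes: pixels per band times the sample size in bytes.
fn split_bands(
    decoded: &[u8],
    chunk_width_pixels: usize,
    chunk_height_pixels: usize,
    bytes_per_sample: usize,
) -> Vec<&[u8]> {
    let band_len = chunk_width_pixels * chunk_height_pixels * bytes_per_sample;
    decoded.chunks_exact(band_len).collect()
}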

Contributor Author

I thought this made sense, because the decoder (decompression) step decodes the entire chunk, and the data is structured like this.

Comment on lines +107 to +125
/// the number of rows the output has, taking padding and PlanarConfiguration into account.
fn output_rows(&self, y: u32) -> AsyncTiffResult<usize> {
    match self.planar_configuration {
        PlanarConfiguration::Chunky => Ok(self.chunk_height_pixels(y)? as usize),
        PlanarConfiguration::Planar => {
            Ok((self.chunk_height_pixels(y)? as usize)
                .saturating_mul(self.samples_per_pixel as _))
        }
    }
}

fn bits_per_pixel(&self) -> usize {
    match self.planar_configuration {
        PlanarConfiguration::Chunky => {
            self.bits_per_sample as usize * self.samples_per_pixel as usize
        }
        PlanarConfiguration::Planar => self.bits_per_sample as usize,
    }
}
Contributor Author

These depend on the planar configuration.

@feefladder (Contributor Author)

Thanks @kylebarron for the changes!

Comment on lines +209 to +216
#[cfg(target_endian = "little")]
if let Endianness::LittleEndian = byte_order {
    return buffer;
}
#[cfg(target_endian = "big")]
if let Endianness::BigEndian = byte_order {
    return buffer;
}
@feefladder (Contributor Author) commented Apr 3, 2025

I thought splitting out the cfg → no-op case up here was a bit clearer; before, the cfgs were mixed inside the match statement.

Comment on lines +81 to +91
Predictor::None => Ok(fix_endianness(
    decoded_tile,
    self.predictor_info.endianness(),
    self.predictor_info.bits_per_sample(),
)),
Predictor::Horizontal => {
    unpredict_hdiff(decoded_tile, &self.predictor_info, self.x as _)
}
Predictor::FloatingPoint => {
    unpredict_float(decoded_tile, &self.predictor_info, self.x as _, self.y as _)
}
Contributor Author

I removed the trait and only have crate-public functions now

@feefladder (Contributor Author) commented Apr 3, 2025

This should also help with reducing copies, as mocked up in #87, since the float predictor doesn't do in-place modification, and having a shared trait wouldn't allow differentiating between the two.

@feefladder (Contributor Author) commented Apr 3, 2025

Resolved: on another note, since there is now this abstraction of "chunks" covering both tiles and strips, the name Tile doesn't make much sense anymore for stripped tiffs, but I think that's fine, and it's merely a documentation issue.
