Fix predicates not matching the Arrow type of columns read from parquet files #1308

phillipleblanc · 2025-05-09T08:41:27Z

Which issue does this PR close?

Closes #1307 (Iceberg row filter predicates error when they don't match the parquet type)

What changes are included in this PR?

I check the type of each literal scalar in the predicate against the Arrow data type of the column we read from the Parquet file and, if they differ, convert the literal to match the column's Arrow data type.
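To illustrate the idea, here is a minimal, self-contained sketch of coercing a predicate literal to a column's type before comparing. It is not the code from this PR: `ColumnType`, `Literal`, and `coerce_literal` are hypothetical stand-ins for the Arrow data type and scalar types, and the set of conversions shown is an assumption chosen for illustration.

```rust
use std::convert::TryFrom;

/// Hypothetical stand-in for the Arrow data type of a column read from Parquet.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ColumnType {
    Int32,
    Int64,
    Date32,
}

/// Hypothetical stand-in for a literal scalar that appears in a row filter predicate.
#[derive(Debug, Clone, PartialEq)]
enum Literal {
    Int32(i32),
    Int64(i64),
    Date32(i32),
}

/// Convert `lit` so that its type matches `column`.
/// Returns `None` when this sketch has no lossless conversion for the pair.
fn coerce_literal(lit: &Literal, column: ColumnType) -> Option<Literal> {
    match (lit, column) {
        // Types already agree: keep the literal unchanged.
        (Literal::Int32(_), ColumnType::Int32)
        | (Literal::Int64(_), ColumnType::Int64)
        | (Literal::Date32(_), ColumnType::Date32) => Some(lit.clone()),
        // Widening is always lossless.
        (Literal::Int32(v), ColumnType::Int64) => Some(Literal::Int64(i64::from(*v))),
        // Narrowing succeeds only if the value fits in 32 bits.
        (Literal::Int64(v), ColumnType::Int32) => i32::try_from(*v).ok().map(Literal::Int32),
        // Date32 is stored as a 32-bit day count, so it shares a representation with Int32.
        (Literal::Int32(v), ColumnType::Date32) => Some(Literal::Date32(*v)),
        (Literal::Date32(v), ColumnType::Int32) => Some(Literal::Int32(*v)),
        // Anything else is left unconverted in this sketch.
        _ => None,
    }
}
```

For example, `coerce_literal(&Literal::Int32(7), ColumnType::Int64)` yields `Some(Literal::Int64(7))`, while an `Int64` literal that does not fit in an `i32` yields `None` for an `Int32` column, leaving the caller to decide how to handle the mismatch.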

Are these changes tested?

Tested with a new unit test to cover the different cases.

jonathanc-n

Lgtm!

sdd

Looks good to me - in fact I think I've also encountered this issue before with data written by pyiceberg, so I'm glad to see it fixed!

Approved, with a couple of minor comments to address

crates/iceberg/src/arrow/reader.rs

sdd · 2025-05-12T19:43:50Z

Approved and merged - thanks for your contribution @phillipleblanc 👍🏼

phillipleblanc added 3 commits May 9, 2025 17:34

Fix predicates not matching the Arrow type of columns read from parquet

c283ecb

Fix lint

3d1d044

fmt

877ead3

phillipleblanc mentioned this pull request May 9, 2025

Fix Iceberg predicates not matching the Arrow type of columns read from parquet files spiceai/spiceai#5761

Merged

jonathanc-n approved these changes May 9, 2025

sdd previously approved these changes May 12, 2025

Two review threads on crates/iceberg/src/arrow/reader.rs (outdated, resolved)

phillipleblanc added 2 commits May 12, 2025 15:59

review feedback

c7d9b01

Merge branch 'main' into phillip/250509-predicate-fix

d4c0488

phillipleblanc dismissed sdd’s stale review via d4c0488 May 12, 2025 06:59

sdd approved these changes May 12, 2025

sdd merged commit b81e824 into apache:main May 12, 2025
17 checks passed
