Description
Is your feature request related to a problem or challenge? Please describe what you are trying to do.
This ticket records the symptoms reported by @mbutrovich in (discord), where they see inconsistent performance. The root cause appears to be allocations made while computing the RowSelection used to evaluate multiple predicates:
In our case it's currently RowSelection::and_then, so I'm trying to make sense of that function and see if there's a more efficient way to go about it other than the iter().cloned() over both inputs, mutating those, and building the output one element at a time
i was wondering about the better representation of Vec<RowSelector>
I'm coming at Rust from C and C++, and a struct with a uint64 and a bool stuck on the end is just gonna end up aligned to 64 bits with a bunch of padding on the end between each one. Is Rust going to do something similar?
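On the layout question, a quick check with `std::mem::size_of` suggests the answer is yes. The sketch below uses a simplified stand-in for `RowSelector` with the same two fields (an assumption for illustration, not the crate's definition), and the helper names `selector_size` and `selector_padding` are hypothetical. Rust's default representation may reorder fields, but the size is still rounded up to the alignment of `usize`, so on a typical 64-bit target each element takes 16 bytes, 7 of which are padding.

```rust
use std::mem::size_of;

// Simplified stand-in for the parquet crate's `RowSelector`
// (assumed here to hold a run length and a skip flag).
#[allow(dead_code)]
struct RowSelector {
    row_count: usize,
    skip: bool,
}

/// Bytes occupied by each element of a `Vec<RowSelector>`.
fn selector_size() -> usize {
    size_of::<RowSelector>()
}

/// Bytes per element that carry no information (padding).
fn selector_padding() -> usize {
    size_of::<RowSelector>() - size_of::<usize>() - size_of::<bool>()
}
```

So a `Vec<RowSelector>` spends roughly twice the memory that the run lengths alone would need, which is part of why a denser representation is worth considering.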
Background:
RowSelection::and_then is used to combine the results of multiple ArrowPredicates in a RowFilter -- see source:
Here is the code for RowSelection::and_then.
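To make the semantics concrete, here is a minimal sketch of what combining two selections means: the second selection is expressed relative to the rows the first one selected, and the result is expressed relative to the original rows. This is a simplified illustration on a stand-in `RowSelector` type, not the crate's implementation. The free function `and_then`, the helper `push_run`, and the choice to treat rows past the end of the second selection as skipped are all assumptions made for the example.

```rust
// Simplified stand-in for the parquet crate's `RowSelector`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct RowSelector {
    row_count: usize,
    skip: bool,
}

// Hypothetical helper: append a run, merging it into the previous run when
// both are of the same kind, and dropping empty runs.
fn push_run(out: &mut Vec<RowSelector>, row_count: usize, skip: bool) {
    if row_count == 0 {
        return;
    }
    match out.last_mut() {
        Some(last) if last.skip == skip => last.row_count += row_count,
        _ => out.push(RowSelector { row_count, skip }),
    }
}

// `second` applies only to the rows that `first` selects.
fn and_then(first: &[RowSelector], second: &[RowSelector]) -> Vec<RowSelector> {
    let mut out = Vec::new();
    let mut second_iter = second.iter().copied();
    let mut current = second_iter.next();

    for run in first {
        if run.skip {
            // Rows skipped by `first` stay skipped.
            push_run(&mut out, run.row_count, true);
            continue;
        }
        // Rows selected by `first` are split up according to `second`.
        let mut remaining = run.row_count;
        while remaining > 0 {
            let Some(cur) = current.as_mut() else {
                // Assumption: rows not covered by `second` are skipped.
                push_run(&mut out, remaining, true);
                break;
            };
            let take = remaining.min(cur.row_count);
            push_run(&mut out, take, cur.skip);
            remaining -= take;
            cur.row_count -= take;
            if cur.row_count == 0 {
                current = second_iter.next();
            }
        }
    }
    out
}

// Hypothetical constructors to keep the usage readable.
fn select(row_count: usize) -> RowSelector {
    RowSelector { row_count, skip: false }
}

fn skip(row_count: usize) -> RowSelector {
    RowSelector { row_count, skip: true }
}
```

For example, with `first = [skip(2), select(5), skip(1), select(2)]` and `second = [select(2), skip(3), select(2)]`, the combined selection is `[skip(2), select(2), skip(4), select(2)]`. Note that the output in this sketch starts as an empty `Vec` and grows one element at a time, which is the allocation pattern described above.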
Describe the solution you'd like
I would like the combination of multiple RowSelections to go faster
Describe alternatives you've considered
Some suggestions from @Dandandan in discord:
selectors can reduce allocations from log(N) to 1 allocation using Vec::with_capacity(len_left + len_right)
Alternatively: the self.selectors allocation probably could be reused for the new one
Any better way to represent Vec<RowSelector>?
Here is one idea for better representing RowSelection instead of Vec<RowSelector>
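The effect of the `Vec::with_capacity` suggestion can be illustrated with a small sketch. The rationale, as I understand it, is that every run in the combined output ends where a run of one of the two inputs ends, so `len_left + len_right` is an upper bound on the output length. The stand-in `RowSelector` type and the helper `count_reallocations` below are hypothetical and exist only to count how often a vector has to grow while it is filled one element at a time.

```rust
// Simplified stand-in for the parquet crate's `RowSelector`.
#[derive(Debug, Clone, Copy)]
struct RowSelector {
    row_count: usize,
    skip: bool,
}

/// Pushes `n` alternating select/skip runs into `out` and returns how many
/// times the vector's capacity changed, i.e. how often it had to (re)allocate.
fn count_reallocations(mut out: Vec<RowSelector>, n: usize) -> usize {
    let mut changes = 0;
    let mut capacity = out.capacity();
    for i in 0..n {
        out.push(RowSelector {
            row_count: i + 1,
            skip: i % 2 == 1,
        });
        if out.capacity() != capacity {
            changes += 1;
            capacity = out.capacity();
        }
    }
    changes
}

/// Output buffer sized up front from the two input lengths.
fn preallocated(len_left: usize, len_right: usize) -> Vec<RowSelector> {
    Vec::with_capacity(len_left + len_right)
}
```

Calling `count_reallocations(preallocated(600, 400), 1000)` reports no capacity changes after the initial allocation, while `count_reallocations(Vec::new(), 1000)` reports several as the vector grows geometrically. The exact number in the second case depends on the standard library's growth policy.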
Additional context