refactor: Reduce how much code is instantiated for comparisons #2365
…igned numbers These are just bit-equality so we can cut out a lot of generated code by re-using these
…(-2.5%) The main issue is still that the comparison operators force this to instantiate too much (587). But it helps a bit.
@@ -201,6 +213,45 @@ impl<'a, T: ArrowPrimitiveType> ArrayAccessor for &'a PrimitiveArray<T> {
    }
}

pub(crate) struct NativeArrayView<'a, T> {
Can we perhaps find some way to eliminate this part? In a trade-off between more unsafe and getting LLVM to do more work, I would always take the latter...
Do you perhaps have some compile time metrics for this change?
Before
After
}

// Trait to help reduce the number of function instantiations by merging types which compare the
// same (u8/i8 are compared the same way)
I'm not sure this is correct, as `u8` uses an unsigned comparison whereas `i8` uses a signed one? You can apply this style of optimization for things like `TimestampNanosecondArray`, i.e. logical types, but I'm not sure you can discard the signedness of the type
True, I was too hasty and only thought about `==`/`!=`. It needs to stay for ordered comparisons.
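To make the point of this exchange concrete, here is a minimal sketch (not code from the PR; all function names are hypothetical) showing why equality can be shared between `i8` and `u8` while ordering cannot. It assumes the values are simply reinterpreted bit-for-bit with an `as` cast.

```rust
/// Reinterpret the bits of an `i8` as a `u8`. A same-width `as` cast keeps
/// the byte unchanged, so this is a view change, not a value conversion.
/// (Hypothetical helper for illustration only.)
pub fn as_unsigned_bits(x: i8) -> u8 {
    x as u8
}

/// Equality is a pure bit comparison, so comparing the unsigned views gives
/// the same answer as comparing the signed values directly.
pub fn eq_via_unsigned(a: i8, b: i8) -> bool {
    as_unsigned_bits(a) == as_unsigned_bits(b)
}

/// Ordering is NOT preserved: negative values map to 128..=255 when viewed
/// as unsigned, so this can disagree with `a < b` on `i8`.
pub fn lt_via_unsigned(a: i8, b: i8) -> bool {
    as_unsigned_bits(a) < as_unsigned_bits(b)
}
```

For example, `-1i8 < 1i8` is true, but `lt_via_unsigned(-1, 1)` compares `255 < 1` and returns false, which is why the merged instantiation is only valid for `==` and `!=`.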
I'm closing this PR as it has bit-rotted substantially, and there has been significant effort to address this under #2594. Please feel free to reopen if/when updated
Rationale for this change
The `arrow` crate generates a lot of code due to the dynamic comparison operators, which takes a long time to compile and potentially bloats the final binary. By extracting code from generic functions and otherwise reducing how much code gets instantiated, we get a faster and slimmer compilation.

Doesn't really fix #1858, as the amount of code is still huge, but it does still help.
(Output of `cargo llvm-lines -p arrow`)
Before
After
What changes are included in this PR?
Refactorings to reduce how much code is instantiated. The PR is left in draft as some of the changes to compare native types instead of "arrow types" are a bit hacky.
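As a rough illustration of the general technique (not the PR's actual code; the function names below are hypothetical), one common way to reduce instantiations is to keep the generic entry point thin and move the real work into a non-generic function that is compiled only once:

```rust
/// Generic entry point: monomorphized once per caller type `T`, but it only
/// converts its arguments and forwards them, so each copy stays tiny.
/// (Hypothetical name, for illustration only.)
pub fn count_equal<T: AsRef<[u8]>>(left: T, right: T) -> usize {
    count_equal_bytes(left.as_ref(), right.as_ref())
}

/// Non-generic worker: compiled exactly once no matter how many different
/// `T` the wrapper above is instantiated with.
fn count_equal_bytes(left: &[u8], right: &[u8]) -> usize {
    left.iter()
        .zip(right)
        // Count the positions at which both slices hold the same byte.
        .filter(|(a, b)| a == b)
        .count()
}
```

Calling `count_equal` with `Vec<u8>`, `&[u8]`, and `&str` arguments produces three small wrappers but a single copy of the loop, which is the kind of saving a tool like `cargo llvm-lines` would report.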