Set HashJoin seed #15783
Co-authored-by: Alex Huang <[email protected]>
@@ -86,6 +86,10 @@ use datafusion_physical_expr_common::physical_expr::fmt_sql;
use futures::{ready, Stream, StreamExt, TryStreamExt};
use parking_lot::Mutex;

/// Hard-coded seed to ensure hash values from the hash join differ from `RepartitionExec`, avoiding collisions.
const HASH_JOIN_SEED: RandomState =
    RandomState::with_seeds('J' as u64, 'O' as u64, 'I' as u64, 'N' as u64);
Love it
Thanks @ctsk -- this makes sense to me. I am running a few benchmarks to make sure this doesn't cause any regressions, but overall it looks good to me
🤖: Benchmark completed
Given these queries don't have joins, I am not sure that is reproducible 😬
Thanks again @ctsk -- sorry for the delay in review / merge
* Set HashJoin seed
* fmt
* whitespace grr
* Document hash seed

Co-authored-by: Alex Huang <[email protected]>

---------

Co-authored-by: Alex Huang <[email protected]>
Which issue does this PR close?
What changes are included in this PR?
The hash join seed is hard-coded to a different value than the `RepartitionExec` seed.
Are these changes tested?
Covered by existing tests.
Are there any user-facing changes?
No.
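
To illustrate why a separate seed can matter, here is a minimal sketch, not DataFusion code. It assumes rows are first routed to partitions by `hash % PARTITIONS` and then placed into a join hash table by `hash % BUCKETS`. The `seeded_hash` mixer, the `buckets_used` helper, the partition and bucket counts, and the seed values are all hypothetical stand-ins chosen for the example.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a seeded hash function (a splitmix64-style mixer).
/// This is not the hasher DataFusion uses.
fn seeded_hash(key: u64, seed: u64) -> u64 {
    let mut z = key.wrapping_add(seed).wrapping_add(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Route keys to partitions with `partition_seed`, then count how many join
/// table buckets the keys of partition 0 occupy when hashed with `join_seed`.
fn buckets_used(partition_seed: u64, join_seed: u64) -> usize {
    const PARTITIONS: u64 = 8; // assumed partition count
    const BUCKETS: u64 = 64; // assumed hash table size
    (0..10_000u64)
        // Keep only the keys that the repartition step sends to partition 0.
        .filter(|&key| seeded_hash(key, partition_seed) % PARTITIONS == 0)
        // Compute the bucket each of those keys would occupy in the join table.
        .map(|key| seeded_hash(key, join_seed) % BUCKETS)
        .collect::<HashSet<u64>>()
        .len()
}
```

Under these assumptions, `buckets_used(0, 0)` can never exceed 8 of the 64 buckets, because every key in partition 0 has a hash that is a multiple of 8. With a different join seed, for example `buckets_used(0, 'J' as u64)`, the two hashes are no longer correlated and the keys are expected to spread across far more buckets.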