-
Notifications
You must be signed in to change notification settings - Fork 28.6k
Revert "[SPARK-47895][SQL] group by alias should be idempotent" #50567
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Revert "[SPARK-47895][SQL] group by alias should be idempotent" #50567
Conversation
This reverts commit 70e0e20.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but we still need to fix the problem of reanalysis for aliases
cc @cloud-fan |
.getOrElse((u, -1)) | ||
|
||
trimAliases(result) match { | ||
// HACK ALERT: If the expanded grouping expression is an integer literal, don't use it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hack only works if the resolved integer literal is the root expression, which may not be true here.
Good catch!
thanks, merging to master/4.0! |
### What changes were proposed in this pull request? This PR reverts #50461 because it introduces a correctness issue by replacing an `Alias` with incorrect literal. A followup will be made with a different way to fix this issue. ### Why are the changes needed? In the below example, alias `abc` is replaced with `2` resulting in 2 rows instead of correct 3. Before erronous PR:  After erronous PR:  ### Does this PR introduce _any_ user-facing change? User now sees an error message instead of an incorrect result. ### How was this patch tested? Added a test case to check for this behavior in the future ### Was this patch authored or co-authored using generative AI tooling? No Closes #50567 from mihailotim-db/mihailotim-db/revert_group_by. Authored-by: Mihailo Timotic <[email protected]> Signed-off-by: Wenchen Fan <[email protected]> (cherry picked from commit 8718eba) Signed-off-by: Wenchen Fan <[email protected]>
For the record, the JIRA ID seems to be reused mistakenly in the original PR and this revert unfortunately. SPARK-47895 was
Could you confirm that, @mihailotim-db ? |
Correct, but the root of the issue is the same so I guess that is why they reused the same ticket |
Thank you for the comment. It seems that there is misunderstanding here if it's done intentionally. To @mihailotim-db and @mihailom-db . I just want to give you the context in order to prevent future accident like the following.
ASF JIRA IDs have Since this reverting PR itself has no problem, let me post the same comment on the original PR, too. |
@dongjoon-hyun Good to know, thanks! |
What changes were proposed in this pull request?
This PR reverts #50461 because it introduces a correctness issue by replacing an
Alias
with incorrect literal. A followup will be made with a different way to fix this issue.Why are the changes needed?
In the below example, alias

abc
is replaced with2
resulting in 2 rows instead of correct 3.Before erronous PR:
After erronous PR:

Does this PR introduce any user-facing change?
User now sees an error message instead of an incorrect result.
How was this patch tested?
Added a test case to check for this behavior in the future
Was this patch authored or co-authored using generative AI tooling?
No