
Minor: remove duplicated select #11424


Closed
wants to merge 1 commit into from

Conversation

jayzhan211
Contributor

Which issue does this PR close?

Closes #.

Rationale for this change

First, I think this is a duplicated select expression, since .sql("select count(*) from t1") already does the job.
Second, I think we should consider this invalid; shouldn't the alternative expression be .select(vec![col("count(wildcard())")])?

#11229 (comment)

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Signed-off-by: jayzhan211 <[email protected]>
@github-actions github-actions bot added the core Core DataFusion crate label Jul 12, 2024
@@ -212,7 +212,6 @@ async fn test_count_wildcard_on_aggregate() -> Result<()> {
     let sql_results = ctx
         .sql("select count(*) from t1")
         .await?
-        .select(vec![col("count(*)")])?
Member


There seems to be a reason for doing it; I once wanted to remove it as well 😂.
See #10459 (comment)

Contributor Author

@jayzhan211 jayzhan211 Jul 12, 2024


Cool, but I still don't get it.

Perhaps it meant to show how to select the count(*) again -- if so maybe we could update the tests to do something like this (rather than removing it entirely):

If it is just to show that selecting twice still gives the correct result, could we use count(wildcard()) instead of count(*)?

count(*) goes directly to PlanBuilder and misses the chance for an ExprPlanner rewrite. IMO we should use count(wildcard()) instead and consider select(vec![col("count(*)")]) invalid 🤔

Contributor


I think the idea was to show that it is possible to select a column named "count(*)" after computing the actual count(*) value in a previous plan.

Projection(expr=[col("count(*)")])
  Aggregate(agg=[count(*)])
    Scan(..)

If we can remove the code and the test still passes, though, then clearly something is not working as intended.

Maybe it should instead be col("count(*)") + lit(1) so the expression doesn't get removed by the optimizer 🤔
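
To illustrate the concern, here is a rough sketch using a toy plan model. Everything in it (the `Expr` and `Plan` enums, `schema`, `project`, `count_plan`, `remove_trivial_projections`) is hypothetical and is not DataFusion's API; the `col` and `lit` helpers only mimic the names used in the test. It assumes the optimizer drops a projection that merely re-selects its input columns, which would explain why removing the `.select(...)` call does not change the test result, while `col("count(*)") + lit(1)` would survive.

```rust
// Toy model only: these types are hypothetical and are NOT DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Add(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { columns: Vec<String> },
    Aggregate { output: Vec<String>, input: Box<Plan> },
    Projection { exprs: Vec<Expr>, input: Box<Plan> },
}

fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

fn lit(value: i64) -> Expr {
    Expr::Literal(value)
}

// Lets us write `col("count(*)") + lit(1)` as in the review comment.
impl std::ops::Add for Expr {
    type Output = Expr;
    fn add(self, rhs: Expr) -> Expr {
        Expr::Add(Box::new(self), Box::new(rhs))
    }
}

/// Display name of an expression, used as its output column name.
fn name(expr: &Expr) -> String {
    match expr {
        Expr::Column(n) => n.clone(),
        Expr::Literal(v) => v.to_string(),
        Expr::Add(l, r) => format!("{} + {}", name(l), name(r)),
    }
}

/// Output column names of a plan node.
fn schema(plan: &Plan) -> Vec<String> {
    match plan {
        Plan::Scan { columns } => columns.clone(),
        Plan::Aggregate { output, .. } => output.clone(),
        Plan::Projection { exprs, .. } => exprs.iter().map(name).collect(),
    }
}

/// Assumed optimizer rule: drop a projection that only re-selects
/// every input column, in order, without computing anything.
fn remove_trivial_projections(plan: Plan) -> Plan {
    match plan {
        Plan::Projection { exprs, input } => {
            let input = remove_trivial_projections(*input);
            let input_cols = schema(&input);
            let is_identity = exprs.len() == input_cols.len()
                && exprs
                    .iter()
                    .zip(input_cols.iter())
                    .all(|(e, c)| matches!(e, Expr::Column(n) if n == c));
            if is_identity {
                input
            } else {
                Plan::Projection { exprs, input: Box::new(input) }
            }
        }
        Plan::Aggregate { output, input } => Plan::Aggregate {
            output,
            input: Box::new(remove_trivial_projections(*input)),
        },
        scan @ Plan::Scan { .. } => scan,
    }
}

/// Aggregate(count(*)) over a scan of a single-column table.
fn count_plan() -> Plan {
    Plan::Aggregate {
        output: vec!["count(*)".to_string()],
        input: Box::new(Plan::Scan { columns: vec!["a".to_string()] }),
    }
}

fn project(exprs: Vec<Expr>, input: Plan) -> Plan {
    Plan::Projection { exprs, input: Box::new(input) }
}

fn main() {
    // Re-selecting col("count(*)") is an identity projection and disappears.
    let identity = project(vec![col("count(*)")], count_plan());
    assert_eq!(remove_trivial_projections(identity), count_plan());

    // col("count(*)") + lit(1) computes something new, so it is kept.
    let plus_one = project(vec![col("count(*)") + lit(1)], count_plan());
    assert_eq!(remove_trivial_projections(plus_one.clone()), plus_one);
}
```

In this toy model the second projection still has to resolve a column named "count(*)" against the aggregate's output, which is the behavior the test was apparently meant to exercise.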

Contributor Author


😕

@jayzhan211 jayzhan211 marked this pull request as ready for review July 12, 2024 11:41
@jayzhan211 jayzhan211 marked this pull request as draft July 17, 2024 05:12

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale PR has not had any activity for some time label Sep 16, 2024
@github-actions github-actions bot closed this Sep 24, 2024
Labels
core Core DataFusion crate Stale PR has not had any activity for some time
3 participants