Skip to content

feat: ORDER BY ALL #15772

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
May 6, 2025
Merged

feat: ORDER BY ALL #15772

merged 16 commits into from
May 6, 2025

Conversation

PokIsemaine
Copy link
Contributor

Which issue does this PR close?

  • None

Rationale for this change

What changes are included in this PR?

Added support for ORDER BY ALL syntax

Are these changes tested?

order.slt
test case from https://duckdb.org/docs/stable/sql/query_syntax/orderby.html#order-by-all-examples

Are there any user-facing changes?

@github-actions github-actions bot added sql SQL Planner logical-expr Logical plan and expressions sqllogictest SQL Logic Tests (.slt) labels Apr 19, 2025
Comment on lines 704 to 711
/// OrderBy Expressions
pub enum OrderByExprs {
OrderByExprVec(Vec<OrderByExpr>),
All {
exprs: Vec<Expr>,
options: OrderByOptions,
},
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we need the Enum.

Maybe it's enough to wrap Vec<OrderByExpr> as a struct and add an extra flag to indicate the ALL

pub struct OrderByExprs {
    exprs: Vec<OrderByExpr>,
    all: bool
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires converting a datafusion_expr::Expr to a SQLExpr, but I'm not sure if there's a good way to do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this enum seems to only be used in the sql planner, so I don't think it is needed in datafusion-expr -- maybe we can just move this into the datafusion-sql crate and make it pub(crate)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I’ll try to move it this weekend

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @PokIsemaine -- this is a pretty cool feature

@PokIsemaine PokIsemaine marked this pull request as draft May 3, 2025 15:59
@github-actions github-actions bot removed the logical-expr Logical plan and expressions label May 4, 2025
@PokIsemaine PokIsemaine marked this pull request as ready for review May 4, 2025 06:49
Comment on lines 60 to 62
if order_by_exprs.is_empty() {
return Ok(vec![]);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why put the early return here? IMO, the earlier we do the check, the more useless computation we can avoid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I moved it by accident.

return Ok(vec![]);
}

let mut sort_expr_vec = vec![];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I usually incline to give it a pre-allocation

options: order_by_options.clone(),
with_fill: None,
}),
_ => None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @PokIsemaine, the PR seems good, but I have one question: Are we going to exclude select expressions which are not column? Is this the same behavior of duckDB?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently, it is directly filtered out. DuckDB is capable of handling other types of expressions. Perhaps I can add a TODO comment and use not_impl_err! to notify the user, then gradually add support for other types of SELECT expressions in future updates.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the user gives something like select foo as bar from ... order by all, it doesn't order anything, am I wrong? if so, we should return a not_impl_err!

PokIsemaine and others added 7 commits May 5, 2025 14:03
…ta columns (apache#15935)

* fix query results for predicates referencing partition columns and data columns

* fmt

* add e2e test

* newline
Bumps [substrait](https://github.com/substrait-io/substrait-rs) from 0.55.0 to 0.55.1.
- [Release notes](https://github.com/substrait-io/substrait-rs/releases)
- [Changelog](https://github.com/substrait-io/substrait-rs/blob/main/CHANGELOG.md)
- [Commits](substrait-io/substrait-rs@v0.55.0...v0.55.1)

---
updated-dependencies:
- dependency-name: substrait
  dependency-version: 0.55.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* feat: create helpers to set the max_temp_directory_size

Signed-off-by: Jérémie Drouet <[email protected]>

* refactor: use helper in cli

Signed-off-by: Jérémie Drouet <[email protected]>

* refactor: update error message

Signed-off-by: Jérémie Drouet <[email protected]>

* refactor: use setter in tests

Signed-off-by: Jérémie Drouet <[email protected]>

---------

Signed-off-by: Jérémie Drouet <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
* refactor filter pushdown apis

* remove commented out code

* fix tests

* fail to fix bug

* fix

* add/fix docs

* lint

* add some docstrings, some minimal cleaup

* review suggestions

* add more comments

* fix doc links

* fmt

* add comments

* make test deterministic

* add bench

* fix bench

* register bench

* fix bench

* cargo fmt

---------

Co-authored-by: berkaysynnada <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>
@github-actions github-actions bot added optimizer Optimizer rules core Core DataFusion crate execution Related to the execution crate datasource Changes to the datasource crate labels May 5, 2025
@github-actions github-actions bot removed optimizer Optimizer rules core Core DataFusion crate execution Related to the execution crate datasource Changes to the datasource crate labels May 5, 2025
Copy link
Contributor

@berkaysynnada berkaysynnada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks ready now, thanks @PokIsemaine. I'll merge it after a few hours unless someone else brings any suggestion here

@berkaysynnada berkaysynnada merged commit 2d1551d into apache:main May 6, 2025
27 checks passed
@alamb
Copy link
Contributor

alamb commented May 6, 2025

This looks ready now, thanks @PokIsemaine. I'll merge it after a few hours unless someone else brings any suggestion here

Thanks @PokIsemaine and @berkaysynnada - I also added this to the list of things we could add to a blog about DataFusion 48 in #15771

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sql SQL Planner sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants