fix: nested window function #15033

Merged 14 commits on Apr 5, 2025

Conversation

chenkovsky
Contributor

@chenkovsky chenkovsky commented Mar 5, 2025

Which issue does this PR close?

Rationale for this change

The current implementation doesn't support nested window functions in projections.

What changes are included in this PR?

Use an expression visitor to find nested window functions.
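As a rough illustration of the idea (not the code in this PR), the sketch below walks a toy expression tree and collects every named window it references, including ones nested inside a larger expression such as `SUM(t1.v1) OVER w + 1`. The `Expr` enum, `collect_window_names`, and `sample_projection` are hypothetical names invented for this example; they are not sqlparser's AST or DataFusion's planner API.

```rust
/// Toy expression tree for illustration only; this is NOT sqlparser's AST.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    /// A function call with an optional named window, e.g. `SUM(v1) OVER w`.
    Function {
        name: String,
        args: Vec<Expr>,
        over: Option<String>,
    },
    BinaryOp {
        left: Box<Expr>,
        op: char,
        right: Box<Expr>,
    },
}

/// Collect the name of every window referenced anywhere inside `expr`,
/// not only at the top level of the projection.
fn collect_window_names(expr: &Expr, out: &mut Vec<String>) {
    match expr {
        Expr::Column(_) | Expr::Literal(_) => {}
        Expr::Function { args, over, .. } => {
            if let Some(window) = over {
                out.push(window.clone());
            }
            for arg in args {
                collect_window_names(arg, out);
            }
        }
        Expr::BinaryOp { left, right, .. } => {
            collect_window_names(left, out);
            collect_window_names(right, out);
        }
    }
}

/// Builds the toy equivalent of `SUM(t1.v1) OVER w + 1`.
fn sample_projection() -> Expr {
    Expr::BinaryOp {
        left: Box::new(Expr::Function {
            name: "SUM".to_string(),
            args: vec![Expr::Column("t1.v1".to_string())],
            over: Some("w".to_string()),
        }),
        op: '+',
        right: Box::new(Expr::Literal(1)),
    }
}
```

A check that only inspects the top-level node would see a `BinaryOp` here and miss the window reference; the walk finds `w` one level down.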

Are these changes tested?

Yes, with unit tests.

Are there any user-facing changes?

No

@chenkovsky
Contributor Author

sqllogictest reports a stack overflow because I added a recursive visitor. If I increase the stack size, it works, so I don't think it's a real stack overflow. Let me check how to refine it.

@chenkovsky
Contributor Author

@alamb I think I have to add a BFS visitor in the sqlparser crate. How do you feel about it?
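For context, a breadth-first traversal keeps its pending nodes in a heap-allocated queue instead of on the call stack, so traversal depth no longer translates into stack frames. The sketch below shows that general technique on a toy tree; `Node`, `left_deep_chain`, and `count_leaves_bfs` are hypothetical names for this example only, and this is not the visitor that sqlparser provides or the approach this PR ended up using.

```rust
use std::collections::VecDeque;

/// Toy binary expression tree for illustration only.
#[allow(dead_code)]
enum Node {
    Leaf(i64),
    Add(Box<Node>, Box<Node>),
}

/// Build a left-deep chain `((0 + 1) + 2) + ... + depth`.
fn left_deep_chain(depth: usize) -> Node {
    let mut node = Node::Leaf(0);
    for i in 1..=depth {
        node = Node::Add(Box::new(node), Box::new(Node::Leaf(i as i64)));
    }
    node
}

/// Count leaves breadth-first. Pending nodes live in a `VecDeque` on the
/// heap, so the traversal itself uses a constant number of stack frames.
fn count_leaves_bfs(root: &Node) -> usize {
    let mut queue: VecDeque<&Node> = VecDeque::new();
    queue.push_back(root);
    let mut leaves = 0;
    while let Some(node) = queue.pop_front() {
        match node {
            Node::Leaf(_) => leaves += 1,
            Node::Add(left, right) => {
                queue.push_back(&**left);
                queue.push_back(&**right);
            }
        }
    }
    leaves
}
```

A chain built with `left_deep_chain(n)` has `n + 1` leaves, which makes the function easy to check on nested input.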

@alamb
Contributor

alamb commented Mar 6, 2025

> @alamb I think I have to add a BFS visitor in the sqlparser crate. How do you feel about it?

I think that would mean it takes at least another month to fix this issue, as we would need a new sqlparser release and would then have to integrate it into DataFusion.

If you can find another way, that would likely be faster to get in.

@chenkovsky
Contributor Author

chenkovsky commented Mar 6, 2025

> > @alamb I think I have to add a BFS visitor in the sqlparser crate. How do you feel about it?
>
> I think that would mean it takes at least another month to fix this issue, as we would need a new sqlparser release and would then have to integrate it into DataFusion.
>
> If you can find another way, that would likely be faster to get in.

I increased the stack size in the test.

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Mar 6, 2025
@chenkovsky
Contributor Author

I'm working on rewriting the visit logic, but I found this PR: apache/datafusion-sqlparser-rs#1522. Does it mean that PR doesn't take effect in the test? 🤔

@chenkovsky
Contributor Author

By the way, if I change tokio to single-threaded, there's also no stack overflow.

@github-actions github-actions bot removed the sqllogictest SQL Logic Tests (.slt) label Mar 12, 2025
@chenkovsky
Contributor Author

@alamb could you please review it again? I have found a solution that doesn't need to touch the stack size.

SUM(t1.v1) OVER w + 1
FROM
generate_series(1, 10000) AS t1(v1)
WINDOW
Contributor

I think we should at least have a .slt test that shows this query running and producing the same result as postgres, perhaps with a smaller number of series:

postgres=# SELECT
  t1.v1,
  SUM(t1.v1) OVER w
FROM
  generate_series(1, 5) AS t1(v1)
WINDOW
  w AS (ORDER BY t1.v1 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW);
 v1 | sum
----+-----
  1 |   1
  2 |   3
  3 |   6
  4 |  10
  5 |  15
(5 rows)
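To make the expected values easy to verify, here is a small sketch of what `SUM(...) OVER (ORDER BY v1 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)` computes: a running total over the ordered input. The function name `running_sum` is made up for this example, and it assumes the input slice is already sorted by `v1`.

```rust
/// Running total over `values`, assumed to be already ordered.
/// Mirrors a frame of ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.
fn running_sum(values: &[i64]) -> Vec<i64> {
    let mut total = 0;
    values
        .iter()
        .map(|v| {
            // Each output row is the sum of all rows up to and including this one.
            total += v;
            total
        })
        .collect()
}
```

For the input `1..=5` this yields `1, 3, 6, 10, 15`, matching the `sum` column in the Postgres output above.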

NamedWindowExpr::WindowSpec(spec) => {
WindowType::WindowSpec(spec.clone())
let mut err = None;
visit_expressions_mut(expr, |expr| {
Contributor

I am sorry @chenkovsky and @2010YOUY01

I don't know what

SELECT
  t1.v1,
  SUM(t1.v1) OVER w + 1
FROM
  generate_series(1, 10) AS t1(v1)
WINDOW
  w AS (ORDER BY t1.v1);

is supposed to be computing (what does adding one to a window definition, as in w + 1, represent?)

Contributor

DuckDB interprets it as (SUM(t1.v1) OVER w) + 1. Writing ... OVER (w + 1) is not valid:

D SELECT
    t1.v1,
    SUM(t1.v1) OVER (w + 1)
  FROM
    generate_series(1, 10) AS t1(v1)
  WINDOW
    w AS (ORDER BY t1.v1);
Parser Error: syntax error at or near "+"
LINE 3:   SUM(t1.v1) OVER (w + 1)

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Mar 16, 2025
@chenkovsky chenkovsky force-pushed the fix/window_function branch from bb373ca to ffa9124 Compare April 4, 2025 14:34
@chenkovsky
Contributor Author

Could anyone help review this PR?

@alamb alamb requested a review from jonahgao April 4, 2025 19:17
@alamb
Contributor

alamb commented Apr 4, 2025

@jonahgao would you possibly have time to help review this PR?

Member

@jonahgao jonahgao left a comment

LGTM, thank you @chenkovsky

named_windows
{
if let Some(WindowType::NamedWindow(ident)) = &f.over {
if ident.eq(window_ident) {
Member

Unrelated to the current fix, we should compare them using normalized names to support queries where the reference and the definition differ only in case:

SELECT
  t1.v1,
  SUM(t1.v1) OVER W + 1
FROM
  generate_series(1, 5) AS t1(v1)
WINDOW
  w AS (ORDER BY t1.v1);
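A minimal sketch of what such a comparison could look like, assuming the usual SQL rule that unquoted identifiers are case-insensitive while quoted ones keep their case. The `Ident` struct here is a toy stand-in with the same two fields as sqlparser's `Ident`, and `normalize` and `same_window` are hypothetical helpers, not DataFusion functions.

```rust
/// Toy identifier: the raw text plus the quote character, if any.
/// A stand-in for illustration, not sqlparser's own `Ident` type.
#[derive(Debug, Clone)]
struct Ident {
    value: String,
    quote_style: Option<char>,
}

/// Assumed normalization rule: lowercase unquoted identifiers,
/// leave quoted identifiers untouched.
fn normalize(ident: &Ident) -> String {
    match ident.quote_style {
        Some(_) => ident.value.clone(),
        None => ident.value.to_ascii_lowercase(),
    }
}

/// Compare a window reference (`OVER W`) with a definition (`WINDOW w AS ...`)
/// by normalized name rather than by raw text.
fn same_window(reference: &Ident, definition: &Ident) -> bool {
    normalize(reference) == normalize(definition)
}
```

Under this rule `OVER W` matches `WINDOW w AS (...)`, while a quoted `OVER "W"` does not.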

@jonahgao jonahgao merged commit c0b1fbc into apache:main Apr 5, 2025
27 checks passed
nirnayroy pushed a commit to nirnayroy/datafusion that referenced this pull request May 2, 2025
* fix: nested window function

* prevent stackoverflow

* Update select.rs

* Update sqllogictests.rs

* Update sql_api.rs

* Update select.rs

* Update sqllogictests.rs

* update slt

* update slt

* clippy
Labels
core (Core DataFusion crate), sql (SQL Planner), sqllogictest (SQL Logic Tests (.slt))
Development

Successfully merging this pull request may close these issues.

Planning error for compound expressions involving window functions
4 participants