Skip to content

Preserve input order of window functions when planning window operations #25589

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

RishiSajay
Copy link
Member

@RishiSajay RishiSajay commented Apr 15, 2025

Description

Fixes a bug where initial plans for the exact same query string are non-deterministic when the query includes window functions with different resolved windows over the same window. (Resolved window takes into account partition by, order by, etc)

Additional context and related issues

Window functions with the same window are planned in the same operation (see commit). QueryPlanner::planWindowFunctions takes in a list of windowFunctions and converts it into a map to be processed. However, HashMaps do not guarantee input order, leading to different initial query plans for the same query string.

This PR replaces the default HashMap with a LinkedHashMap to preserve the order of resolved windows (and by extension their associated window functions) when planning window functions, therefore guaranteeing the same initial plan every time the same query is run.

Minimal repro and associated diff of initial query plans from 2 different runs before this PR:

WITH data AS (
    SELECT *
    FROM (VALUES 
        ('A', 1, 100),
        ('A', 2, 200),
        ('A', 1, 150),
        ('B', 1, 300),
        ('B', 2, 100),
        ('B', 1, 200)
    ) AS t(category, subcategory, value)
)
SELECT 
    RANK() OVER (
        PARTITION BY category 
        ORDER BY value DESC
    ) AS rank_by_category,
    RANK() OVER (
        PARTITION BY category, subcategory 
        ORDER BY value DESC
    ) AS rank_by_category_subcategory,
    RANK() OVER (
        PARTITION BY category, subcategory, value
        ORDER BY value DESC
    ) AS rank_by_category_subcategory_value
FROM data;

image

Release notes

(x) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`issuenumber`)

Copy link

cla-bot bot commented Apr 15, 2025

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Rishi Sajay.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@raunaqmorarka raunaqmorarka added the bug Something isn't working label Apr 16, 2025
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 1 out of 1 changed files in this pull request and generated no comments.

Comments suppressed due to low confidence (1)

core/trino-main/src/main/java/io/trino/sql/planner/QueryPlanner.java:1436

  • Changing the groupingBy collector to use a LinkedHashMap is the desired fix to preserve the input order; ensure that subsequent usage of this map leverages the order preservation appropriately.
.collect(Collectors.groupingBy(analysis::getWindow, LinkedHashMap::new, Collectors.toList()));

Copy link
Member

@kasiafi kasiafi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved % comments.

@RishiSajay RishiSajay force-pushed the rishisajay/make-window-planning-deterministic branch from 602ba82 to a6cd182 Compare April 17, 2025 00:54
Copy link

cla-bot bot commented Apr 17, 2025

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Rishi Sajay.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

Map<ResolvedWindow, List<io.trino.sql.tree.FunctionCall>> functions = scopeAwareDistinct(subPlan, windowFunctions)
.stream()
.collect(Collectors.groupingBy(analysis::getWindow));
.collect(Collectors.groupingBy(analysis::getWindow, LinkedHashMap::new, Collectors.toUnmodifiableList()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: static import toUnmodifiableList

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

PARTITION BY category, subcategory, value, subvalue
) AS rank_by_category_subcategory_value_subvalue
FROM data
""";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the indentation

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@RishiSajay RishiSajay force-pushed the rishisajay/make-window-planning-deterministic branch from a6cd182 to 924b219 Compare April 17, 2025 18:16
Copy link

cla-bot bot commented Apr 17, 2025

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Rishi Sajay.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@RishiSajay RishiSajay force-pushed the rishisajay/make-window-planning-deterministic branch from 924b219 to 107f4fc Compare April 17, 2025 18:21
Copy link

cla-bot bot commented Apr 17, 2025

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Rishi Sajay.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

@raunaqmorarka
Copy link
Member

@RishiSajay please submit the CLA as described in https://github.com/trinodb/cla
We need that to start merging PRs from any contributor in Trino

@martint
Copy link
Member

martint commented Apr 18, 2025

@cla-bot check

Copy link

cla-bot bot commented Apr 18, 2025

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Rishi Sajay.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

Copy link

cla-bot bot commented Apr 18, 2025

The cla-bot has been summoned, and re-checked this pull request!

@RishiSajay
Copy link
Member Author

@cla-bot check

Copy link

cla-bot bot commented Apr 18, 2025

Thank you for your pull request and welcome to our community. We could not parse the GitHub identity of the following contributors: Rishi Sajay.
This is most likely caused by a git client misconfiguration; please make sure to:

  1. check if your git client is configured with an email to sign commits git config --list | grep email
  2. If not, set it up using git config --global user.email [email protected]
  3. Make sure that the git commit email is configured in your GitHub account settings, see https://github.com/settings/emails

Copy link

cla-bot bot commented Apr 18, 2025

The cla-bot has been summoned, and re-checked this pull request!

@RishiSajay RishiSajay force-pushed the rishisajay/make-window-planning-deterministic branch from 107f4fc to 97cfcbd Compare April 18, 2025 14:49
@cla-bot cla-bot bot added the cla-signed label Apr 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working cla-signed
Development

Successfully merging this pull request may close these issues.

4 participants