Skip to content

fix: add an "expr_planners" method to SessionState #15119

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
May 12, 2025

Conversation

niebayes
Copy link
Contributor

@niebayes niebayes commented Mar 10, 2025

Which issue does this PR close?

Rationale for this change

This method makes it convenient for users to retrieve expression planners from SessionState. The details are explained in the corresponding issue.

What changes are included in this PR?

Added an expr_planners to SessionState.

Are these changes tested?

Covered by existing test cases.

Are there any user-facing changes?

Yes, users now can use the expr_planners method to get registered expression planners, including default expression planners, from SessionState.

@github-actions github-actions bot added the core Core DataFusion crate label Mar 10, 2025
@niebayes niebayes changed the title chore: add "expr_planners api chore: add a "expr_planners" method to SessionState Mar 10, 2025
@niebayes niebayes changed the title chore: add a "expr_planners" method to SessionState chore: add an "expr_planners" method to SessionState Mar 10, 2025
@alamb alamb marked this pull request as draft March 10, 2025 17:13
@alamb
Copy link
Contributor

alamb commented Mar 10, 2025

CI appears to be failing so marking as draft

@@ -1632,7 +1632,7 @@ impl FunctionRegistry for SessionContext {
}

fn expr_planners(&self) -> Vec<Arc<dyn ExprPlanner>> {
self.state.read().expr_planners()
self.state.read().expr_planners().to_vec()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, it's the FunctionRegistry::expr_planners method gets called and a vector of expression planners is returned.
Now, the SessionState::expr_planners takes precedence and a slice of expression planners is returned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alamb Fixed CI. PTAL

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how the code in this PR is any different than what is in on main

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, both the FunctionRegistry and SessionState provide an expr_planners method, but they return different data type. FunctionRegistry::expr_planners returns Vec<dyn ExprPlanner, and SessionState::expr_planners returns &[dyn ExprPlanner].

Now, the SessionState::expr_planners takes precedence and a slice of expression planners is returned. Therefore, we have add a to_vec.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's ContextProvider that uses a slice vs Vec. Bit annoying the types are not aligned. SessionState uses expr_planners: Vec<Arc<dyn ExprPlanner>>,

Copy link
Contributor Author

@niebayes niebayes Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Omega359 I agree, it's better to align the types.

I tried to align the types but ran into some difficulties. If both functions return a slice, SessionContext::expr_planners would encounter the error Cannot return a reference of temporary values.

If both return a Vec, then SessionState's implementation of ContextProvider::get_expr_planners would also face the same issue. The only way to resolve this would be to change the return type of ContextProvider::get_expr_planners to Vec, but that would be a breaking change.

@niebayes
Copy link
Contributor Author

@alamb I wonder if we can remove the register_expr_planners and expr_planners from the FunctionRegistry trait. I have checked the codebase and they're only used by a test. And it's kind of weird as these methods don’t seem to align with the core responsibility of the trait.

@niebayes niebayes marked this pull request as ready for review March 11, 2025 02:55
@alamb
Copy link
Contributor

alamb commented Mar 11, 2025

@alamb I wonder if we can remove the register_expr_planners and expr_planners from the FunctionRegistry trait. I have checked the codebase and they're only used by a test. And it's kind of weird as these methods don’t seem to align with the core responsibility of the trait.

It seem reasonable to me -- maybe we could make a PR to deprecate them first ?

Actually, I thought the idea was that SessionState implemented FunctionRegistry or something like that

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @niebayes -- can you please add a test a test for this fix

@niebayes
Copy link
Contributor Author

@alamb Sure, I maybe write an example to demonstrate how why the newly added expr_planners method is helpful

@alamb
Copy link
Contributor

alamb commented Mar 12, 2025

@alamb Sure, I maybe write an example to demonstrate how why the newly added expr_planners method is helpful

What I am thinking is "how will we ensure that we don't accidentally break this feature during future refactors"

As it stands now, we could potentially break the feature without any tests failing

Ideally we would have some test that fails so we would know to fix it

@alamb alamb changed the title chore: add an "expr_planners" method to SessionState fix: add an "expr_planners" method to SessionState Mar 13, 2025
@alamb
Copy link
Contributor

alamb commented Mar 13, 2025

I plan to make a 46.0.1 release tomorrow

Is this something we should include? Or is there a workaround?

@niebayes
Copy link
Contributor Author

niebayes commented Mar 17, 2025

@alamb Sorry for late response, Some emergent personal matters have caused a delay. I will complete this in the next few hours.

@niebayes niebayes force-pushed the chore/add_expr_planners_api branch from 3bbff38 to 40a717b Compare March 17, 2025 11:01
@niebayes
Copy link
Contributor Author

@alamb Hi, I've written a test to demonstrate why it's more convenient and somewhat necessary to provide an expr_planners method for SessionState.

I'm not sure if this test clearly conveyed my thoughts. Appreciate any feedback.

@niebayes niebayes requested a review from alamb March 19, 2025 02:04
},
physical_plan::ExecutionPlan,
sql::{planner::ContextProvider, ResolvedTableReference, TableReference},
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may as well pull in the other 2 crate lines above if you are going to use this import style.

Copy link
Contributor Author

@niebayes niebayes Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Omega359 I have flattened all imports.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @niebayes -- I am sorry this PR got lost. It seems like a reasonable change to me

@alamb alamb merged commit 3910073 into apache:main May 12, 2025
27 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Regression: count(*) does not work unless using default expression planners
3 participants