Skip to content

Regression: count(*) does not work unless using default expression planners #15114

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Tracked by #15151
niebayes opened this issue Mar 10, 2025 · 7 comments · Fixed by #15119
Closed
Tracked by #15151

Regression: count(*) does not work unless using default expression planners #15114

niebayes opened this issue Mar 10, 2025 · 7 comments · Fixed by #15119
Assignees

Comments

@niebayes
Copy link
Contributor

niebayes commented Mar 10, 2025

Previously, the count wildcard is expanded by the CountWildCard analyzer rule which however was removed in the latest release 46.0. And the functionality is now moved to the ExprPlanner for planning aggregate functions, see #14689.

This functionality assumes users could get expression planners from the ContextProvider. See:

for planner in self.context_provider.get_expr_planners().iter() {
match planner.plan_aggregate(aggregate_expr)? {
PlannerResult::Planned(expr) => return Ok(expr),
PlannerResult::Original(expr) => aggregate_expr = expr,
}
}

It's ok for the DataFusion native context provider SessionContextProvider since it's able to retrieve expression planners from the SessionState. See:

struct SessionContextProvider<'a> {
state: &'a SessionState,
tables: HashMap<ResolvedTableReference, Arc<dyn TableSource>>,
}
impl ContextProvider for SessionContextProvider<'_> {
fn get_expr_planners(&self) -> &[Arc<dyn ExprPlanner>] {
&self.state.expr_planners
}

Unfortunately, user defined ContextProvider is unable to get expression planners from the SessionState, since the expr_planners field is not exposed.

As a result, users who wish to utilize the default expression planners must import the FunctionRegistry trait, which provides an expr_planners method, and notably, SessionState implements this trait.

To address this issue, I propose adding a method to SessionState that returns a reference of the registered expression planners. This would provide a more straightforward and consistent way for user-defined ContextProvider implementations to access these planners without requiring importing the additional FunctionRegistry trait.

@jayzhan211
Copy link
Contributor

expr_planners can be a method of trait Session similar to scalar_functions, aggregate_functions.

@niebayes
Copy link
Contributor Author

I'm not sure which one is better, adding an expr_planners method to Session, or to SessionState. I’d appreciate if someone more familiar with the codebase can guide this decision.

@jayzhan211
Copy link
Contributor

The difference is that if you add expr_planners to SessionState only, you need SessionState which is only accessible in core crate, if you add it to Session trait then you can call it in catalog crate or others that import Session

@niebayes
Copy link
Contributor Author

niebayes commented Mar 10, 2025

I think adding expr_planners to SessionState suffices to this issue. I don't know any other usages that may access expression planners from Session.

@jayzhan211
Copy link
Contributor

I think adding expr_planners to SessionState suffices to this issue. I don't know any other usages that may access expression planners from Session.

Then we can add it to SessionState first, adding it to Session is easy if we need to

@niebayes
Copy link
Contributor Author

take

@alamb
Copy link
Contributor

alamb commented Mar 11, 2025

BTW I think we should consider fixing this in a 46.0.0 release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants