Attach Diagnostic to "more than one column in subquery" error #14438
Comments
I think this is a good first issue, as the need is clear and the tests in https://github.com/apache/datafusion/blob/85fbde2661bdb462fc498dc18f055c44f229604c/datafusion/sql/tests/cases/diagnostic.rs are well structured for extension.
take
Hey @irenjj, how is it going with this ticket? :) Can I help with anything?
Hi @eliaperantoni, sorry for not updating my status for a long time. I was working on other tasks the past few days, but I will address this issue in the next few days.❤️
Hi @irenjj,
Hi @changsun20, feel free to take over this task, thank you!❤️
Hi @eliaperantoni,

After investigating this issue, here are my initial thoughts on implementation. The most straightforward approach would be to add a new span field to the Subquery struct:

```rust
pub struct Subquery {
    /// The subquery
    pub subquery: Arc<LogicalPlan>,
    /// The outer references used in the subquery
    pub outer_ref_columns: Vec<Expr>,
    /// Location of the subquery in the original SQL text, if known
    pub span: Option<Span>,
}
```

The span would be extracted in parse_scalar_subquery:

```rust
pub(super) fn parse_scalar_subquery(
    &self,
    subquery: Query,
    // other params...
) -> Result<Expr> {
    // other logic...
    // Placeholder: the exact way to obtain the span from the query is still open.
    let span = Span::try_from_sqlparser_span(subquery.some_way_to_index_the_span_from_the_query());
    Ok(Expr::ScalarSubquery(Subquery {
        subquery: Arc::new(sub_plan),
        outer_ref_columns,
        span,
    }))
}
```

This span would then be accessible when generating error messages, allowing us to add diagnostic information at the check that currently raises the error:

```rust
// This is the original implementation
if subquery.subquery.schema().fields().len() > 1 {
    return plan_err!(
        "Scalar subquery should only return one column, but found {}: {}",
        subquery.subquery.schema().fields().len(),
        subquery.subquery.schema().field_names().join(", ")
    );
}
```

However, a drawback of this approach is that modifying the …

Please let me know if this is on the right track or if you have any suggestions. Thank you!
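To make the proposal above concrete, here is a minimal, self-contained sketch of the idea: the subquery carries an optional span, and the column-count check copies that span into a diagnostic attached to the error. Every type here (`Span`, `Diagnostic`, `PlanError`, `LogicalPlan`, `Subquery`) is a simplified stand-in written for illustration and is not DataFusion's actual API; the function name `check_scalar_subquery` and the diagnostic wording are also assumptions.

```rust
use std::sync::Arc;

/// Stand-in for a source range: (line, column) pairs, 1-based.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: (u32, u32),
    pub end: (u32, u32),
}

/// Stand-in for a diagnostic: a message plus an optional location.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Diagnostic {
    pub message: String,
    pub span: Option<Span>,
}

/// Stand-in for a planning error that may carry a diagnostic.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PlanError {
    pub message: String,
    pub diagnostic: Option<Diagnostic>,
}

/// Stand-in for a logical plan: only the output column names matter here.
#[derive(Debug)]
pub struct LogicalPlan {
    pub field_names: Vec<String>,
}

/// Mirrors the proposed shape: the plan plus the span it was parsed from.
#[derive(Debug, Clone)]
pub struct Subquery {
    pub subquery: Arc<LogicalPlan>,
    pub span: Option<Span>,
}

/// Hypothetical check: rejects scalar subqueries with more than one column
/// and attaches a diagnostic that points at the subquery's span.
pub fn check_scalar_subquery(subquery: &Subquery) -> Result<(), PlanError> {
    let names = &subquery.subquery.field_names;
    if names.len() > 1 {
        let message = format!(
            "Scalar subquery should only return one column, but found {}: {}",
            names.len(),
            names.join(", ")
        );
        // Illustrative wording only; the real diagnostic text is up to the PR.
        let diagnostic = Diagnostic {
            message: "too many columns returned by scalar subquery".to_string(),
            span: subquery.span,
        };
        return Err(PlanError {
            message,
            diagnostic: Some(diagnostic),
        });
    }
    Ok(())
}
```

Keeping the span as an `Option` means plans that are built programmatically, without any SQL text, can still construct a `Subquery` and simply produce an error without a location.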
Thank you so much @changsun20 for the incredible work, that's a very nice and comprehensive plan! I think it looks awesome and I definitely like it. Just one minor thing: it seems like there are scalar subqueries and then "in" subqueries. e.g.
Sure, no problem! I'll also address that in my PR.
That's amazing, thank you ❤️
Is your feature request related to a problem or challenge?
For a query like:
The only message that the end user of an application built on top of DataFusion sees is:
We want to provide a richer message that references and highlights locations in the original SQL query, and contextualises and helps the user understand the error. In the end, it would be possible to display errors in a fashion akin to what was enabled by #13664 for some errors:
See #14429 for more information.
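As a rough illustration of what "highlights locations in the original SQL query" could look like, the sketch below underlines a byte range of a single-line query and appends a note. This is not DataFusion's rendering code; the function `underline`, the sample query in the usage note, and the note text are all hypothetical, and the input is assumed to be single-line ASCII.

```rust
/// Renders `sql` followed by a caret underline beneath the byte range
/// `start..end`, with `note` after the carets.
/// Assumes a single-line, ASCII-only query so byte offsets equal columns.
pub fn underline(sql: &str, start: usize, end: usize, note: &str) -> String {
    // Clamp the range so out-of-bounds offsets cannot panic.
    let end = end.min(sql.len());
    let start = start.min(end);
    // Always draw at least one caret, even for an empty range.
    let width = (end - start).max(1);
    format!(
        "{}\n{}{} {}",
        sql,
        " ".repeat(start),
        "^".repeat(width),
        note
    )
}
```

For example, `underline("SELECT (SELECT a, b FROM t)", 7, 27, "returns 2 columns")` places the carets under the parenthesised subquery, which is the kind of location a span on the subquery would supply.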
Describe the solution you'd like
Attach a well-crafted Diagnostic to the DataFusionError, building on top of the foundations laid in #13664. See #14429 for more information.
Describe alternatives you've considered
No response
Additional context
No response