Skip to content

Add related source code locations to errors #13664

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 39 commits into from
Feb 1, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
39 commits
Select commit Hold shift + click to select a range
13ad9db
Reset branch to no changes other than SqlRel
eliaperantoni Jan 16, 2025
4603b35
feat: improve `Diagnostic` ergonomics
eliaperantoni Jan 16, 2025
86e1722
feat: 'table not found' diagnostic in `SqlToRel`
eliaperantoni Jan 16, 2025
c3532c1
feat: unresolved fields
eliaperantoni Jan 17, 2025
91f4e3c
feat: union with different number of columns
eliaperantoni Jan 17, 2025
85db8af
feat: `Spans`
eliaperantoni Jan 23, 2025
3f4ddd2
feat: adjust field not found error
eliaperantoni Jan 23, 2025
b4df859
feat: incompatible types in binary expr
eliaperantoni Jan 23, 2025
e4652d7
fix: tests not compiling
eliaperantoni Jan 23, 2025
d2e70ad
feat: ambiguous column
eliaperantoni Jan 23, 2025
e44cb9f
feat: possible references
eliaperantoni Jan 23, 2025
31772ed
feat: group by
eliaperantoni Jan 23, 2025
c15d8f5
chore: fmt
eliaperantoni Jan 23, 2025
09d7065
feat: relay `Spans` in `create_col_from_scalar_expr`
eliaperantoni Jan 23, 2025
1d5c647
chore: cargo check
eliaperantoni Jan 23, 2025
5fdc10e
chore: cargo fmt
eliaperantoni Jan 23, 2025
f63a493
Merge branch 'main' into eper/diagnostics
eliaperantoni Jan 23, 2025
6f6722d
feat: diagnostic can only be either error or warning
eliaperantoni Jan 24, 2025
9a8ef78
feat: simplify to one `Diagnostic` per error
eliaperantoni Jan 24, 2025
fd46dce
test: add tests
eliaperantoni Jan 24, 2025
360ac20
chore: add license headers
eliaperantoni Jan 24, 2025
3dd66fc
chore: update CLI lockfile
eliaperantoni Jan 24, 2025
13c4c9d
chore: taplo format
eliaperantoni Jan 24, 2025
fe74dce
fix: doctest
eliaperantoni Jan 24, 2025
829430a
chore: configs.md
eliaperantoni Jan 24, 2025
2772c6b
fix: test
eliaperantoni Jan 24, 2025
1b35207
fix: clippy, by removing smallvec
eliaperantoni Jan 24, 2025
243b788
fix: update slt
eliaperantoni Jan 24, 2025
e98694b
Merge branch 'main' into eper/diagnostics
eliaperantoni Jan 24, 2025
e847477
feat: support all binary expressions
eliaperantoni Jan 24, 2025
60a9f8b
refactor: move diagnostic tests to datafusion-sql
eliaperantoni Jan 27, 2025
f178db0
chore: format
eliaperantoni Jan 27, 2025
1308b85
Merge branch 'main' into eper/diagnostics
eliaperantoni Jan 27, 2025
51dd141
fix: tests
eliaperantoni Jan 27, 2025
b5e326c
feat: pr feedback
eliaperantoni Jan 28, 2025
1cd0f3e
fix: ci checks
eliaperantoni Jan 28, 2025
20e9f61
Merge remote-tracking branch 'apache/main' into eper/diagnostics
alamb Jan 29, 2025
e1e6ac5
Merge remote-tracking branch 'apache/main' into eper/diagnostics
alamb Feb 1, 2025
7265bd2
Merge remote-tracking branch 'apache/main' into eper/diagnostics
alamb Feb 1, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
79 changes: 29 additions & 50 deletions datafusion-cli/Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

81 changes: 75 additions & 6 deletions datafusion/common/src/column.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,23 +17,33 @@

//! Column

use arrow_schema::{Field, FieldRef};

use crate::error::_schema_err;
use crate::utils::{parse_identifiers_normalized, quote_identifier};
use crate::{DFSchema, Result, SchemaError, TableReference};
use crate::{DFSchema, Diagnostic, Result, SchemaError, Spans, TableReference};
use arrow_schema::{Field, FieldRef};
use std::collections::HashSet;
use std::convert::Infallible;
use std::fmt;
use std::str::FromStr;

/// A named reference to a qualified field in a schema.
#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
#[derive(Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct Column {
/// relation/table reference.
pub relation: Option<TableReference>,
/// field/column name.
pub name: String,
/// Original source code location, if known
pub spans: Spans,
}

impl fmt::Debug for Column {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
f.debug_struct("Column")
.field("relation", &self.relation)
.field("name", &self.name)
.finish()
}
}

impl Column {
Expand All @@ -50,6 +60,7 @@ impl Column {
Self {
relation: relation.map(|r| r.into()),
name: name.into(),
spans: Spans::new(),
}
}

Expand All @@ -58,6 +69,7 @@ impl Column {
Self {
relation: None,
name: name.into(),
spans: Spans::new(),
}
}

Expand All @@ -68,6 +80,7 @@ impl Column {
Self {
relation: None,
name: name.into(),
spans: Spans::new(),
}
}

Expand Down Expand Up @@ -99,7 +112,11 @@ impl Column {
// identifiers will be treated as an unqualified column name
_ => return None,
};
Some(Self { relation, name })
Some(Self {
relation,
name,
spans: Spans::new(),
})
}

/// Deserialize a fully qualified name string into a column
Expand All @@ -113,6 +130,7 @@ impl Column {
Self {
relation: None,
name: flat_name,
spans: Spans::new(),
},
)
}
Expand All @@ -124,6 +142,7 @@ impl Column {
Self {
relation: None,
name: flat_name,
spans: Spans::new(),
},
)
}
Expand Down Expand Up @@ -239,7 +258,30 @@ impl Column {

// If not due to USING columns then due to ambiguous column name
return _schema_err!(SchemaError::AmbiguousReference {
field: Column::new_unqualified(self.name),
field: Column::new_unqualified(&self.name),
})
.map_err(|err| {
let mut diagnostic = Diagnostic::new_error(
format!("column '{}' is ambiguous", &self.name),
self.spans().first(),
);
// TODO If [`DFSchema`] had spans, we could show the
// user which columns are candidates, or which table
// they come from. For now, let's list the table names
// only.
for qualified_field in qualified_fields {
let (Some(table), _) = qualified_field else {
continue;
};
diagnostic.add_note(
format!(
"possible reference to '{}' in table '{}'",
&self.name, table
),
None,
);
}
err.with_diagnostic(diagnostic)
});
}
}
Expand All @@ -254,6 +296,33 @@ impl Column {
.collect(),
})
}

/// Returns a reference to the set of locations in the SQL query where this
/// column appears, if known.
pub fn spans(&self) -> &Spans {
&self.spans
}

/// Returns a mutable reference to the set of locations in the SQL query
/// where this column appears, if known.
pub fn spans_mut(&mut self) -> &mut Spans {
&mut self.spans
}

/// Replaces the set of locations in the SQL query where this column
/// appears, if known.
pub fn with_spans(mut self, spans: Spans) -> Self {
self.spans = spans;
self
}

/// Qualifies the column with the given table reference.
pub fn with_relation(&self, relation: TableReference) -> Self {
Self {
relation: Some(relation),
..self.clone()
}
}
}

impl From<&str> for Column {
Expand Down
5 changes: 5 additions & 0 deletions datafusion/common/src/config.rs
Original file line number Diff line number Diff line change
Expand Up @@ -250,6 +250,11 @@ config_namespace! {
/// specified. The Arrow type system does not have a notion of maximum
/// string length and thus DataFusion can not enforce such limits.
pub support_varchar_with_length: bool, default = true

/// When set to true, the source locations relative to the original SQL
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the idea that we will keep this off by default at first as we evaluate the impact it has on performance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah and also until we produce diagnostics for more of the most common errors, or so I thought. Right now it's very few errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And that's okay I think, I wanted this PR to be more digestible and a PoC for what can be achieved with Diagnostic. Rolling it out to more places will come later.

/// query (i.e. [`Span`](sqlparser::tokenizer::Span)) will be collected
/// and recorded in the logical plan nodes.
pub collect_spans: bool, default = false
}
}

Expand Down
5 changes: 1 addition & 4 deletions datafusion/common/src/dfschema.rs
Original file line number Diff line number Diff line change
Expand Up @@ -502,10 +502,7 @@ impl DFSchema {
Ok((fields_without_qualifier[0].0, fields_without_qualifier[0].1))
} else {
_schema_err!(SchemaError::AmbiguousReference {
field: Column {
relation: None,
name: name.to_string(),
},
field: Column::new_unqualified(name.to_string(),),
})
}
}
Expand Down
Loading