use 'lit' as the field name for literal values #16498
This makes sense to me -- thank you @adriangb
FYI @timsaucer
This sounds reasonable, especially since the final schema will not use the field name here. If we change that at some point I could see it causing a problem.
Thanks folks. One more thought: I wonder if this improves performance by some small fraction given that I could see
Anyway, I plan to leave this open for review for another day or so, then I'll merge it.
@@ -66,8 +66,7 @@ impl Literal {
         value: ScalarValue,
         metadata: Option<FieldMetadata>,
     ) -> Self {
-        let mut field =
-            Field::new(format!("{value}"), value.data_type(), value.is_null());
+        let mut field = Field::new("lit".to_string(), value.data_type(), value.is_null());
If we see string allocations on the hot path, perhaps we could do something like cache the FieldRef for each datatype or something 🤔
(As a follow-on PR, to be clear)
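For reference, the caching idea floated above might look roughly like the sketch below. It is only an illustration under stated assumptions: `DataType`, `Field`, `FieldRef` and `cached_lit_field` are hypothetical stand-ins defined here so the snippet compiles with the standard library alone, not the real arrow or DataFusion types, and it ignores field metadata entirely.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

// Hypothetical stand-ins for arrow's `DataType` and `Field`, defined locally
// so that this sketch is self-contained. They are NOT the real types.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum DataType {
    Int64,
    Utf8,
}

#[derive(Debug, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

type FieldRef = Arc<Field>;

/// Returns a shared "lit" field for the given type and nullability
/// (hypothetical helper). The field, including its name `String`, is
/// allocated only the first time a given (type, nullable) pair is seen.
fn cached_lit_field(data_type: &DataType, nullable: bool) -> FieldRef {
    static CACHE: OnceLock<Mutex<HashMap<(DataType, bool), FieldRef>>> = OnceLock::new();

    let cache = CACHE.get_or_init(|| Mutex::new(HashMap::new()));
    let mut guard = cache.lock().unwrap();

    // Look up the entry, building the field only when it is missing, then
    // hand out another reference-counted pointer to the same allocation.
    let field = guard
        .entry((data_type.clone(), nullable))
        .or_insert_with(|| {
            Arc::new(Field {
                name: "lit".to_string(),
                data_type: data_type.clone(),
                nullable,
            })
        });
    Arc::clone(field)
}
```

Two calls with the same type and nullability return pointers to the same allocation. The trade-off is a lock and a hash lookup per call, so this would only be worth doing if profiling actually shows the string allocation on the hot path.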
As per #16491 (comment) I think it's a bit strange that we try to create a field name from the repr of the value.
Consider this example: https://datafusion.apache.org/user-guide/sql/scalar_functions.html#id273
For cases of an array with hundreds of elements it will blow up and make a mess!
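A toy illustration of the size problem, using only the standard library: `name_from_repr` and `fixed_name` are hypothetical helpers, and a plain `&[i64]` rendered with `Debug` stands in for an array literal. The real code formats a `ScalarValue`, so the exact text would differ, but the growth pattern is the point.

```rust
/// Hypothetical helper: derive a field name from the printed form of the
/// value, mirroring the `format!("{value}")` approach. A slice of integers
/// stands in for an array literal here.
fn name_from_repr(values: &[i64]) -> String {
    format!("{values:?}")
}

/// Hypothetical helper: a fixed name whose size does not depend on the value.
fn fixed_name() -> &'static str {
    "lit"
}
```

With three elements the derived name is short, but it grows linearly with the number of elements, while the fixed name stays at three bytes regardless of the literal.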
Could we use a fixed constant like 'lit' or 'field' instead?
The main issue I could see happening is name collisions, e.g. select 1, 2, 3 will cause an error, which is unfortunate. I'm not sure how to resolve that, but the current behavior isn't much better:
FWIW Postgres seems to have the concept of an "un-named" column:
But I'm not sure we want to introduce an "unnamed" field.