Skip to content

deprecating return_type in favor of return_type_from_args #15662

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
rluvaton opened this issue Apr 9, 2025 · 1 comment
Open

deprecating return_type in favor of return_type_from_args #15662

rluvaton opened this issue Apr 9, 2025 · 1 comment

Comments

@rluvaton
Copy link
Contributor

rluvaton commented Apr 9, 2025

Having return_type does not really reflect the UDF actual nullability.

the following problems exists when UDF wrongly report their nullability:

Disallow optimization based on the expressions nullability.

For example lets assume I have 2 columns:

  1. col1 which is i64 without nulls
  2. col2 which is i64 with nulls

and the expression ceil does not does not implement return_type_from_args, so by default the nullability is true

can't optimize the following query and

this means that I can can't optimize this query

select coalesce(ceil('col1'), 'col2') from tbl;

to be this:

select ceil('col1') from tbl;

because ceil says that the nullability is true even though the input expression is not nullable

the same can be for not able to remove the expression array_remove_all(array, null) if the array expression report that it is nullable while it is not

Wrong types

creating the wrong type:

> select arrow_typeof(make_array(1));
+------------------------------------------------------------------------------------------------------------------+
| arrow_typeof(make_array(Int64(1)))                                                                               |
+------------------------------------------------------------------------------------------------------------------+
| List(Field { name: "item", data_type: Int64, nullable: true, dict_id: 0, dict_is_ordered: false, metadata: {} }) |
+------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.002 seconds.

should be nullable false but because make_array implement return_type it does not get that information

@rluvaton
Copy link
Contributor Author

rluvaton commented Apr 9, 2025

@alamb take:

We can start return_type deprecation after this PR

I recommend we do not deprecate return_type - after this PR I think we have a nice API. Namely most simple uses can use return_type and ones that need more control (like returning nullability, or getting scalar args) can use return_type_from_args

Originally posted by @alamb in #14094 (comment)

Which I disagree as I can no longer do the optimization and more

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant