Refactor regexplike signature #13394

Merged
merged 6 commits into from
Dec 8, 2024

Conversation

jiashenC
Contributor

Which issue does this PR close?

Close some tasks in #13301.

What changes are included in this PR?

Simplify function signature.

Are these changes tested?

Added more tests for implicit casting.

Are there any user-facing changes?

No

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels Nov 13, 2024
@jiashenC
Contributor Author

@jayzhan211, I've made some initial progress on the RegExpLike function. In the original issue, you mentioned that the signature can be replaced with Signature::string(2, Volatility::Immutable), but this function can also accept 3 arguments, so that signature fails tests like dataframe::dataframe_functions::test_fn_regexp_like. Did I misunderstand something here?

@jayzhan211
Contributor

jayzhan211 commented Nov 13, 2024

@jayzhan211, I've made some initial progress on the RegExpLike function. In the original issue, you mentioned that the signature can be replaced with Signature::string(2, Volatility::Immutable), but this function can also accept 3 arguments, so that signature fails tests like dataframe::dataframe_functions::test_fn_regexp_like. Did I misunderstand something here?

You can use one_of

signature: Signature::one_of(
    vec![
        TypeSignature::String(2),
        TypeSignature::String(3),
    ],
    Volatility::Immutable,
),
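To illustrate why one_of addresses the 2-versus-3 argument problem, here is a minimal, self-contained sketch. It is a toy model only: `ToySignature`, `accepts`, and `regexp_like_signature` are hypothetical names invented for this example and are not DataFusion's types or matching logic. It assumes that a "one of" signature matches when any of its candidates matches.

```rust
// Toy model of arity-based signature matching. NOT DataFusion's actual types.
#[derive(Debug, Clone, PartialEq)]
enum ToySignature {
    /// Accepts exactly `n` string arguments.
    String(usize),
    /// Accepts the call if any inner candidate accepts it.
    OneOf(Vec<ToySignature>),
}

impl ToySignature {
    /// Returns true if a call with `arg_count` string arguments matches.
    fn accepts(&self, arg_count: usize) -> bool {
        match self {
            ToySignature::String(n) => *n == arg_count,
            ToySignature::OneOf(candidates) => {
                candidates.iter().any(|c| c.accepts(arg_count))
            }
        }
    }
}

/// Hypothetical helper mirroring the shape of the suggested signature.
fn regexp_like_signature() -> ToySignature {
    ToySignature::OneOf(vec![ToySignature::String(2), ToySignature::String(3)])
}

fn main() {
    let sig = regexp_like_signature();
    for n in 1..=4 {
        println!("{n} args accepted: {}", sig.accepts(n));
    }
}
```

In this model a single `String(2)` candidate rejects the 3-argument form, which matches the test failure described above, while the combined signature accepts both the 2-argument and 3-argument forms.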

@jiashenC jiashenC marked this pull request as ready for review November 14, 2024 00:25
@jiashenC
Contributor Author

@jayzhan211 thanks for the reply! I've marked this as ready for review.

@jayzhan211
Contributor

@jayzhan211 thanks for the reply! I've marked this as ready for review.

You can run cargo test --test sqllogictests and ./dev/rust_lint.sh to clean up CI.

@github-actions github-actions bot removed the sqllogictest SQL Logic Tests (.slt) label Nov 15, 2024
@jiashenC jiashenC changed the title Refactor regexplife signature Refactor regexplike signature Nov 20, 2024
@jiashenC jiashenC force-pushed the refactor_regexplife_sig branch from 2968fa3 to 3b0c412 Compare November 20, 2024 19:29
@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Nov 20, 2024
@alamb
Contributor

alamb commented Nov 23, 2024

There appears to be a CI test failure in the examples where some queries no longer work.

@alamb
Contributor

alamb commented Nov 23, 2024

Thank you for this contribution @jiashenC

@Omega359
Contributor

I think the CI failures are a type casting issue. Two rows in the example have 4000 as their value, and I think those values are not being coerced to strings as they should be.

 Error: Plan("Internal error: Failed to match any signature, errors: Error during planning: The signature expected 3 arguments but received 2,Error during planning: The signature expected NativeType::String but received NativeType::Int64.\nThis was likely caused by a bug in DataFusion's code and we would welcome that you file an bug report in our issue tracker No function matches the given name and argument types 'regexp_like(Utf8, Int64, Utf8)'. You might need to add explicit type casts.\n\tCandidate functions:\n\tregexp_like(String(2))\n\tregexp_like(String(3))")

Input data for the example: https://github.com/apache/datafusion/blob/main/datafusion/physical-expr/tests/data/regex.csv
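The failure above can be pictured with a small toy model. This is a sketch under stated assumptions, not DataFusion's implementation: `ToyType`, `check_string_sig`, and `check_one_of` are hypothetical names, and the error strings are invented for the example rather than copied from DataFusion. It assumes each candidate checks the argument count first and then requires every argument to be a string, with no implicit coercion from integers.

```rust
// Toy model of matching argument types against string-only signatures.
// NOT DataFusion's actual types, coercion rules, or error messages.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyType {
    Utf8,
    Int64,
}

/// Checks `args` against a toy "exactly `n` strings" signature.
fn check_string_sig(n: usize, args: &[ToyType]) -> Result<(), String> {
    if args.len() != n {
        return Err(format!("expected {n} arguments but received {}", args.len()));
    }
    // Report the first non-string argument, if any.
    match args.iter().find(|t| **t != ToyType::Utf8) {
        Some(other) => Err(format!("expected a string but received {other:?}")),
        None => Ok(()),
    }
}

/// Tries each arity in turn; if none matches, returns every candidate's error.
fn check_one_of(arities: &[usize], args: &[ToyType]) -> Result<(), Vec<String>> {
    let mut errors = Vec::new();
    for &n in arities {
        match check_string_sig(n, args) {
            Ok(()) => return Ok(()),
            Err(e) => errors.push(e),
        }
    }
    Err(errors)
}

fn main() {
    use ToyType::{Int64, Utf8};
    // Mirrors the shape of the failing call: regexp_like(Utf8, Int64, Utf8).
    println!("{:?}", check_one_of(&[2, 3], &[Utf8, Int64, Utf8]));
    // The same call once the integer argument is a string.
    println!("{:?}", check_one_of(&[2, 3], &[Utf8, Utf8, Utf8]));
}
```

In this model the mixed call fails both candidates, one on arity and one on the integer argument, so two errors are reported together. Once the integer argument is a string, whether through an explicit cast as the error hint suggests or through implicit coercion, the 3-argument candidate matches.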

@alamb
Contributor

alamb commented Nov 27, 2024

Marking as draft as I think this PR is no longer waiting on feedback. Please mark it as ready for review when it is ready for another look

I am trying to make the review backlog easier to understand

@alamb alamb closed this Nov 27, 2024
@alamb alamb reopened this Nov 27, 2024
@alamb alamb marked this pull request as draft November 27, 2024 19:23
@alamb
Contributor

alamb commented Nov 27, 2024

(sorry, I accidentally closed the PR)

@jiashenC jiashenC force-pushed the refactor_regexplife_sig branch from 3b0c412 to 1c18eff Compare November 30, 2024 01:27
@jiashenC jiashenC force-pushed the refactor_regexplife_sig branch from 1c18eff to 2e30074 Compare November 30, 2024 02:18
@jiashenC jiashenC marked this pull request as ready for review November 30, 2024 03:43
@jayzhan211 jayzhan211 merged commit f2de2c4 into apache:main Dec 8, 2024
25 checks passed
@jayzhan211
Contributor

Thanks @jiashenC @alamb @Omega359

zhuliquan pushed a commit to zhuliquan/datafusion that referenced this pull request Dec 11, 2024
* update

* update

* update

* clean up errors

* fix flags types

* fix failed example
zhuliquan pushed a commit to zhuliquan/datafusion that referenced this pull request Dec 15, 2024
* update

* update

* update

* clean up errors

* fix flags types

* fix failed example
Labels
functions Changes to functions implementation sqllogictest SQL Logic Tests (.slt)
4 participants