Skip to content

Improve sqllogictest error reporting #15905

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
May 5, 2025

Conversation

gabotechs
Copy link
Contributor

@gabotechs gabotechs commented Apr 30, 2025

Which issue does this PR close?

  • Closes #.

Rationale for this change

Improve the way errors get shown to the developer while running sqllogictests.

What changes are included in this PR?

  • Canonicalizes the path of the file that contains an error in a sqllogictests, so that when reporting the error in the following format: "{path}:{line_number}" its clickable from IDEs, pointing directly to the exact line containing the failing test (verified on RustRover)
  • Runs all the sqllogictests in a single file even if some of them fail, reporting the first 10 failures, instead of just the first one.

Example output after intentionally introducing 3 failures in aggregate.slt:
Before:

External error: query result mismatch:
[SQL] select array_sort(c1), array_sort(c2) from (
  select array_agg(distinct column1) as c1, array_agg(distinct column2) ignore nulls as c2 from array_agg_distinct_list_table
);
[Diff] (-expected|+actual)
-   [NULL, b, w] [[0, 1], [0, 1]]
+   [NULL, b, w] [[0, 1], [1, 0]]
at test_files/aggregate.slt:313

After:

External error: 3 errors in file /Users/foo/github/datafusion/datafusion/sqllogictest/test_files/aggregate.slt

1. query result mismatch:
[SQL] select array_sort(c1), array_sort(c2) from (
  select array_agg(distinct column1) as c1, array_agg(distinct column2) ignore nulls as c2 from array_agg_distinct_list_table
);
[Diff] (-expected|+actual)
-   [NULL, b, w] [[0, 1], [0, 1]]
+   [NULL, b, w] [[0, 1], [1, 0]]
at /Users/foo/github/datafusion/datafusion/sqllogictest/test_files/aggregate.slt:313


2. query result mismatch:
[SQL] SELECT bit_or(c5), bit_or(c6), bit_or(c7), bit_or(c8), bit_or(c9) FROM aggregate_test_100
[Diff] (-expected|+actual)
-   -1 -1 255 65534 4294967295
+   -1 -1 255 65535 4294967295
at /Users/foo/github/datafusion/datafusion/sqllogictest/test_files/aggregate.slt:416


3. statement is expected to fail with error:
	(regex) DataFusion error: Random failure
but got error:
	DataFusion error: This feature is not implemented: VAR_POP(DISTINCT) aggregations are not available
[SQL] SELECT var_pop(c2), var_pop(distinct c2) FROM aggregate_test_100
at /Users/foo/github/datafusion/datafusion/sqllogictest/test_files/aggregate.slt:584

Are these changes tested?

These are changes to how the tests are reported, so I suppose?

Are there any user-facing changes?

No

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Apr 30, 2025
@Omega359
Copy link
Contributor

Runs all the sqllogictests in a single file even if some of them fail, reporting all failures, instead of just the first one.

I would prefer it to be limited to say the first 10. Otherwise this looks good.

@gabotechs
Copy link
Contributor Author

I would prefer it to be limited to say the first 10. Otherwise this looks good.

👍 no strong opinion here, I imagine that if more than 10 tests fail, it means that something went really wrong

@alamb
Copy link
Contributor

alamb commented May 1, 2025

I would prefer it to be limited to say the first 10. Otherwise this looks good.

👍 no strong opinion here, I imagine that if more than 10 tests fail, it means that something went really wrong

yeah I agree limiting the error messages to some reasonable limit would make for a better experience

@gabotechs gabotechs force-pushed the improve-sqllogictest-failure-reporting branch 3 times, most recently from c619d6c to a435880 Compare May 1, 2025 08:39
@gabotechs gabotechs marked this pull request as ready for review May 1, 2025 08:40
@gabotechs
Copy link
Contributor Author

👍 Done! also added some failure indexes to help our eyes parse the different errors quickly

@gabotechs gabotechs force-pushed the improve-sqllogictest-failure-reporting branch from a435880 to 1ad97d2 Compare May 1, 2025 17:24
@gabotechs gabotechs force-pushed the improve-sqllogictest-failure-reporting branch from 1ad97d2 to 5b3ef92 Compare May 2, 2025 08:37
@gabotechs
Copy link
Contributor Author

Just so that people have a bit more context about why is this PR coming suddenly out of the blue, I'm exploring this idea that implies adding a new sqllogictest runner that, to the surprise of no one, results in a massive amount of failures.

I'm using the contents of this PR to better triage the errors that I see there, but I'm keeping this as a separate PR as I imagine this can be useful regardless of whether my experiment turns out to be a good idea or not.

Copy link
Contributor

@2010YOUY01 2010YOUY01 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you. I think it's good to go after fixing the CI.

.run_file_async(path)
.await
.map_err(|e| DataFusionError::External(Box::new(e)));
async fn run_file_in_runner<D: AsyncDB, M: MakeConnection<Conn = D>>(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
async fn run_file_in_runner<D: AsyncDB, M: MakeConnection<Conn = D>>(
/// Run the test file specified by `path` using the provided runner.
/// Only first `ERRS_PER_FILE_LIMIT` errors will be displayed.
async fn run_file_in_runner<D: AsyncDB, M: MakeConnection<Conn = D>>(

@alamb alamb merged commit 90fbb98 into apache:main May 5, 2025
27 checks passed
@alamb
Copy link
Contributor

alamb commented May 5, 2025

🚀

PokIsemaine pushed a commit to PokIsemaine/datafusion that referenced this pull request May 5, 2025
berkaysynnada added a commit that referenced this pull request May 6, 2025
* feat: ORDER BY ALL

* refactor: orderyby all

* refactor: order_by_to_sort_expr

* refactor: TODO comment

* fix query results for predicates referencing partition columns and data columns (#15935)

* fix query results for predicates referencing partition columns and data columns

* fmt

* add e2e test

* newline

* chore(deps): bump substrait from 0.55.0 to 0.55.1 (#15941)

Bumps [substrait](https://github.com/substrait-io/substrait-rs) from 0.55.0 to 0.55.1.
- [Release notes](https://github.com/substrait-io/substrait-rs/releases)
- [Changelog](https://github.com/substrait-io/substrait-rs/blob/main/CHANGELOG.md)
- [Commits](substrait-io/substrait-rs@v0.55.0...v0.55.1)

---
updated-dependencies:
- dependency-name: substrait
  dependency-version: 0.55.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* feat: create helpers to set the max_temp_directory_size (#15919)

* feat: create helpers to set the max_temp_directory_size

Signed-off-by: Jérémie Drouet <[email protected]>

* refactor: use helper in cli

Signed-off-by: Jérémie Drouet <[email protected]>

* refactor: update error message

Signed-off-by: Jérémie Drouet <[email protected]>

* refactor: use setter in tests

Signed-off-by: Jérémie Drouet <[email protected]>

---------

Signed-off-by: Jérémie Drouet <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>

* Fix main CI (#15942)

* Improve sqllogictest error reporting (#15905)

* refactor filter pushdown apis (#15801)

* refactor filter pushdown apis

* remove commented out code

* fix tests

* fail to fix bug

* fix

* add/fix docs

* lint

* add some docstrings, some minimal cleaup

* review suggestions

* add more comments

* fix doc links

* fmt

* add comments

* make test deterministic

* add bench

* fix bench

* register bench

* fix bench

* cargo fmt

---------

Co-authored-by: berkaysynnada <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Jérémie Drouet <[email protected]>
Co-authored-by: silezhou <[email protected]>
Co-authored-by: Adrian Garcia Badaracco <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Jérémie Drouet <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
Co-authored-by: xudong.w <[email protected]>
Co-authored-by: Gabriel <[email protected]>
Co-authored-by: berkaysynnada <[email protected]>
Co-authored-by: Berkay Şahin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sqllogictest SQL Logic Tests (.slt)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants