Is your feature request related to a problem or challenge?
It is clear that the current statistics code is lacking tests. For example, I was about to delete code (#8172), but thankfully @berkaysynnada pointed out the code was actually different, yet no tests failed.
I spent some time auditing the codebase for tests, and here is what I found:
Places that I think could do with some additional coverage
These places do have tests, but we should review them to ensure that the coverage is adequate:
- LocalLimitExec: https://github.com/apache/arrow-datafusion/blob/fdf3f6c3304956cd56131d8783d7cb38a2242a9f/datafusion/physical-plan/src/limit.rs#L833
- UnionExec: https://github.com/apache/arrow-datafusion/blob/c2e768052c43e4bab6705ee76befc19de383c2cb/datafusion/physical-plan/src/union.rs#L696
- FilterExec: https://github.com/apache/arrow-datafusion/blob/e1c2f9583015db326b3439897376f14f6b83a99a/datafusion/physical-plan/src/filter.rs#L455
Places that appear to lack coverage entirely
Here are impl ExecutionPlan that implement statistics but I didn't find any tests (though I could have missed them):
- get_statistics_with_limit: https://github.com/apache/arrow-datafusion/blob/e54894c39202815b14d9e7eae58f64d3a269c165/datafusion/core/src/datasource/statistics.rs#L34-L33
- Join statistics: https://github.com/apache/arrow-datafusion/blob/e642cc2a94f38518d765d25c8113523aedc29198/datafusion/physical-plan/src/joins/utils.rs#L455-L454
- HashAggregateExec: https://github.com/apache/arrow-datafusion/blob/67d66faa829ea2fe102384a7534f86e66a3027b7/datafusion/physical-plan/src/aggregates/mod.rs#L888-L887
- WindowExec: https://github.com/apache/arrow-datafusion/blob/c2e768052c43e4bab6705ee76befc19de383c2cb/datafusion/physical-plan/src/windows/window_agg_exec.rs#L250
- BoundedWindowExec: https://github.com/apache/arrow-datafusion/blob/c2e768052c43e4bab6705ee76befc19de383c2cb/datafusion/physical-plan/src/windows/bounded_window_agg_exec.rs#L311
Describe the solution you'd like
Review the locations above and add coverage as necessary.
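To illustrate the kind of test that would catch a silent change in statistics logic, here is a rough, self-contained sketch. It does not use DataFusion's real types or functions: `RowCount` and `limit_row_count` are hypothetical stand-ins for a row-count estimate and for the way a limit operator might adjust it, and the rule inside `limit_row_count` is a toy model chosen for the example, not the actual implementation.

```rust
// Hypothetical, simplified stand-in for a row-count estimate.
// This is NOT DataFusion's actual statistics type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum RowCount {
    Exact(usize),
    Inexact(usize),
    Absent,
}

// Hypothetical helper (assumption for this sketch): the row-count estimate
// after applying a limit that returns at most `fetch` rows.
fn limit_row_count(input: RowCount, fetch: usize) -> RowCount {
    match input {
        // A known count can only shrink to the limit.
        RowCount::Exact(n) => RowCount::Exact(n.min(fetch)),
        // An estimate stays an estimate, capped at the limit.
        RowCount::Inexact(n) => RowCount::Inexact(n.min(fetch)),
        // With no input information, the limit is the only thing we know.
        RowCount::Absent => RowCount::Inexact(fetch),
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn exact_input_is_capped_and_stays_exact() {
        assert_eq!(limit_row_count(RowCount::Exact(100), 10), RowCount::Exact(10));
        assert_eq!(limit_row_count(RowCount::Exact(3), 10), RowCount::Exact(3));
    }

    #[test]
    fn inexact_input_is_capped_and_stays_inexact() {
        assert_eq!(limit_row_count(RowCount::Inexact(100), 10), RowCount::Inexact(10));
    }

    #[test]
    fn absent_input_falls_back_to_the_limit() {
        assert_eq!(limit_row_count(RowCount::Absent, 10), RowCount::Inexact(10));
    }
}
```

The point of the sketch is the shape of the tests: each case pins down one input precision and the expected output, so swapping one code path for a subtly different one would make at least one assertion fail.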
Describe alternatives you've considered
No response
Additional context
No response