Skip to content

Introducing mutation testing #14590

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 4 commits into from
Closed

Conversation

edmondop
Copy link
Contributor

Adds an ad-hoc pipeline for running cargo mutants as discussed in #14589

@github-actions github-actions bot added the development-process Related to development process of DataFusion label Feb 10, 2025
@comphead
Copy link
Contributor

Thanks @edmondop I cannot see this flow in the list of PR checks, how long it takes?

@edmondop
Copy link
Contributor Author

edmondop commented Feb 11, 2025

Thanks @edmondop I cannot see this flow in the list of PR checks, how long it takes?

On my fork it didn't appear either, you need to merge the PR. Maybe you can try to merge on your own fork?
Note that at the moment this checks can only be run manually (until we figure out what to do with the information in it, which is useful imho)

See here, it took 54 minutes https://github.com/edmondop/arrow-datafusion/actions/runs/13248520757 and it crashed before it could run all 22844 mutants

Found 22844 mutants to test
 WARN An explicit test timeout is recommended when using --baseline=skip; using 300 seconds by default
MISSED   datafusion/functions/src/math/log.rs:222:61: replace == with != in <impl ScalarUDFImpl for LogFunc>::simplify in 84.2s build + 6.9s test
MISSED   datafusion/functions/src/lib.rs:179:5: replace register_all -> Result<()> with Ok(()) in 18.1s build + 5.7s test
MISSED   datafusion/physical-optimizer/src/join_selection.rs:227:9: replace || with && in try_collect_left in 77.1s build + 0.4s test
MISSED   datafusion/physical-plan/src/aggregates/mod.rs:1278:27: replace <= with > in group_id_array in 90.6s build + 41.3s test
MISSED   datafusion/physical-optimizer/src/limit_pushdown.rs:77:9: replace <impl PhysicalOptimizerRule for LimitPushdown>::name -> &str with "" in 14.8s build + 0.3s test
MISSED   datafusion/functions-aggregate/src/array_agg.rs:383:13: replace - with + in <impl Accumulator for DistinctArrayAggAccumulator>::size in 70.4s build + 1.7s test
MISSED   datafusion/functions-aggregate/src/sum.rs:469:9: replace <impl Accumulator for DistinctSumAccumulator<T>>::size -> usize with 1 in 35.2s build + 1.6s test
MISSED   datafusion/functions-nested/src/concat.rs:292:35: delete ! in <impl ScalarUDFImpl for ArrayConcat>::return_type in 40.2s build + 5.2s test
MISSED   datafusion/proto-common/src/to_proto/mod.rs:247:9: replace <impl TryFrom for protobuf::Schema>::try_from -> Result<Self, Self::Error> with Ok(Default::default()) in 14.9s build + 2.7s test
MISSED   datafusion/optimizer/src/eliminate_duplicated_expr.rs:45:9: replace <impl PartialEq for SortExprWrapper>::eq -> bool with true in 46.4s build + 11.5s test
MISSED   datafusion/functions-aggregate/src/string_agg.rs:176:9: replace <impl Accumulator for StringAggAccumulator>::size -> usize with 0 in 41.0s build + 1.6s test
MISSED   datafusion/physical-plan/src/aggregates/group_values/single_group_by/bytes.rs:77:9: replace <impl GroupValues for GroupValuesByes<O>>::size -> usize with 0 in 23.8s build + 39.4s test
MISSED   datafusion/catalog/src/memory.rs:166:9: replace <impl SchemaProvider for MemorySchemaProvider>::table_names -> Vec<String> with vec!["xyzzy".into()] in 25.3s build + 0.6s test
MISSED   datafusion/expr-common/src/type_coercion/aggregates.rs:276:5: replace is_covariance_support_arg_type -> bool with false in 8.7s build + 5.7s test
MISSED   datafusion/physical-expr/src/window/window_expr.rs:294:20: delete ! in AggregateWindowExpr::get_result_column in 38.3s build + 9.0s test
TIMEOUT  datafusion/physical-plan/src/aggregates/row_hash.rs:1059:9: replace GroupedHashAggregateStream::update_merged_stream -> Result<()> with Ok(()) in 53.6s build + 300.0s test
MISSED   datafusion/physical-expr-common/src/sort_expr.rs:429:9: replace LexOrdering::from_lex_requirement -> LexOrdering with Default::default() in 7.7s build + 10.2s test
MISSED   datafusion/sql/src/statement.rs:1607:69: replace - with / in SqlToRel<'_, S>::show_variable_to_plan in 45.2s build + 9.4s test

@comphead
Copy link
Contributor

I'm wondering should we have such pipelines optional just before the release, it would be pretty much expensive to run it per each PR... 🤔

@edmondop
Copy link
Contributor Author

Agreed! For now it requires manual execution

@2010YOUY01
Copy link
Contributor

Thanks @edmondop I cannot see this flow in the list of PR checks, how long it takes?

On my fork it didn't appear either, you need to merge the PR. Maybe you can try to merge on your own fork? Note that at the moment this checks can only be run manually (until we figure out what to do with the information in it, which is useful imho)

See here, it took 54 minutes https://github.com/edmondop/arrow-datafusion/actions/runs/13248520757 and it crashed before it could run all 22844 mutants

Found 22844 mutants to test
 WARN An explicit test timeout is recommended when using --baseline=skip; using 300 seconds by default
MISSED   datafusion/functions/src/math/log.rs:222:61: replace == with != in <impl ScalarUDFImpl for LogFunc>::simplify in 84.2s build + 6.9s test
MISSED   datafusion/functions/src/lib.rs:179:5: replace register_all -> Result<()> with Ok(()) in 18.1s build + 5.7s test
MISSED   datafusion/physical-optimizer/src/join_selection.rs:227:9: replace || with && in try_collect_left in 77.1s build + 0.4s test
MISSED   datafusion/physical-plan/src/aggregates/mod.rs:1278:27: replace <= with > in group_id_array in 90.6s build + 41.3s test
MISSED   datafusion/physical-optimizer/src/limit_pushdown.rs:77:9: replace <impl PhysicalOptimizerRule for LimitPushdown>::name -> &str with "" in 14.8s build + 0.3s test
MISSED   datafusion/functions-aggregate/src/array_agg.rs:383:13: replace - with + in <impl Accumulator for DistinctArrayAggAccumulator>::size in 70.4s build + 1.7s test
MISSED   datafusion/functions-aggregate/src/sum.rs:469:9: replace <impl Accumulator for DistinctSumAccumulator<T>>::size -> usize with 1 in 35.2s build + 1.6s test
MISSED   datafusion/functions-nested/src/concat.rs:292:35: delete ! in <impl ScalarUDFImpl for ArrayConcat>::return_type in 40.2s build + 5.2s test
MISSED   datafusion/proto-common/src/to_proto/mod.rs:247:9: replace <impl TryFrom for protobuf::Schema>::try_from -> Result<Self, Self::Error> with Ok(Default::default()) in 14.9s build + 2.7s test
MISSED   datafusion/optimizer/src/eliminate_duplicated_expr.rs:45:9: replace <impl PartialEq for SortExprWrapper>::eq -> bool with true in 46.4s build + 11.5s test
MISSED   datafusion/functions-aggregate/src/string_agg.rs:176:9: replace <impl Accumulator for StringAggAccumulator>::size -> usize with 0 in 41.0s build + 1.6s test
MISSED   datafusion/physical-plan/src/aggregates/group_values/single_group_by/bytes.rs:77:9: replace <impl GroupValues for GroupValuesByes<O>>::size -> usize with 0 in 23.8s build + 39.4s test
MISSED   datafusion/catalog/src/memory.rs:166:9: replace <impl SchemaProvider for MemorySchemaProvider>::table_names -> Vec<String> with vec!["xyzzy".into()] in 25.3s build + 0.6s test
MISSED   datafusion/expr-common/src/type_coercion/aggregates.rs:276:5: replace is_covariance_support_arg_type -> bool with false in 8.7s build + 5.7s test
MISSED   datafusion/physical-expr/src/window/window_expr.rs:294:20: delete ! in AggregateWindowExpr::get_result_column in 38.3s build + 9.0s test
TIMEOUT  datafusion/physical-plan/src/aggregates/row_hash.rs:1059:9: replace GroupedHashAggregateStream::update_merged_stream -> Result<()> with Ok(()) in 53.6s build + 300.0s test
MISSED   datafusion/physical-expr-common/src/sort_expr.rs:429:9: replace LexOrdering::from_lex_requirement -> LexOrdering with Default::default() in 7.7s build + 10.2s test
MISSED   datafusion/sql/src/statement.rs:1607:69: replace - with / in SqlToRel<'_, S>::show_variable_to_plan in 45.2s build + 9.4s test

Thank you for the experiments, I think mutation tests definitely can improve software quality

I'm wondering how many mutations have this CI run executed in 53 minutes? Looks like each takes 1 minute, and it's only 50/22844 ran and so many errors showed up.
And each one I believe is expensive to fix and get reviewed. I tried to fix the first one and this took some time, I think they all require us to figure out how an unfamiliar code snippet works. (then it's impossible to fix them all)

An alternative can be for each PR, only run mutation tests for modified code, it would be much easier to fix test coverage issues since the PR authors just wrote the code

@Omega359
Copy link
Contributor

Omega359 commented Mar 9, 2025

I am wondering if this test is just too strenuous for the ci runners. If it could be narrowed down to just the modified code I could see this working, otherwise perhaps it'll either have to be run manually on hardware with more cores, etc.

Copy link

github-actions bot commented May 8, 2025

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale PR has not had any activity for some time label May 8, 2025
@github-actions github-actions bot closed this May 18, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
development-process Related to development process of DataFusion Stale PR has not had any activity for some time
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants