Use the chaining methods on PartialOrd for slices too #138881

scottmcm · 2025-03-24T07:52:11Z

#138135 added these doc-hidden trait methods to improve the tuple codegen. This PR adds more implementations and callers so that the codegen for slice (and array) comparisons also improves.

scottmcm · 2025-03-24T07:54:40Z

tests/codegen/array-cmp.rs

+
+// CHECK-LABEL: @array_of_tuple_le
+#[no_mangle]
+pub fn array_of_tuple_le(a: &[(i16, u16); 2], b: &[(i16, u16); 2]) -> bool {


Nightly, note the setl+setg from the cmp methods:

array_of_tuple_le: movzwl (%rdi), %eax movzwl (%rsi), %ecx cmpw %cx, %ax setl %cl setg %al jne .LBB0_1 movzwl 2(%rdi), %eax cmpw 2(%rsi), %ax seta %al sbbb $0, %al testb %al, %al jne .LBB0_7 jmp .LBB0_4 .LBB0_1: subb %cl, %al testb %al, %al jne .LBB0_7 .LBB0_4: movzwl 4(%rdi), %eax movzwl 4(%rsi), %ecx cmpw %cx, %ax setl %cl setg %al jne .LBB0_5 movzwl 6(%rdi), %eax cmpw 6(%rsi), %ax seta %al sbbb $0, %al jmp .LBB0_7 .LBB0_5: subb %cl, %al .LBB0_7: testb %al, %al setle %al retq

With this PR:

array_of_tuple_le: movzwl (%rcx), %eax movzwl (%rdx), %r8d cmpw %r8w, %ax jne .LBB1_1 movzwl 2(%rcx), %r8d movzwl 2(%rdx), %r9d cmpw %r9w, %r8w jne .LBB1_2 movzwl 4(%rcx), %eax movzwl 4(%rdx), %r8d cmpw %r8w, %ax jne .LBB1_1 movzwl 6(%rcx), %r8d movzwl 6(%rdx), %r9d movb $1, %al cmpw %r9w, %r8w jne .LBB1_2 retq .LBB1_1: cmpw %r8w, %ax setle %al retq .LBB1_2: cmpw %r9w, %r8w setb %al retq

That's so nice -- straight line with no jumps taken if they're equal, otherwise jumps to the check it needs (dedup'd by consistent register choices), saves that, and returns.

Note that even with #133984 landed, we still don't optimize away the [su]cmps in nightly today: https://rust.godbolt.org/z/ss5E7q6PT.

scottmcm · 2025-03-24T07:56:48Z

library/core/src/slice/cmp.rs

+        // This is certainly not the obvious way to implement these methods.
+        // Unfortunately, using anything that looks at the discriminant means that
+        // LLVM sees a check for `2` (aka `ControlFlow<bool>::Continue(())`) and
+        // gets very distracted by that, ending up generating extraneous code.
+        // This should be changed to something simpler once either LLVM is smarter,
+        // see <https://github.com/llvm/llvm-project/issues/132678>, or we generate
+        // niche discriminant checks in a way that doesn't trigger it.


So there's a backlink: llvm/llvm-project#132678

scottmcm · 2025-03-24T23:54:04Z

Oh, I'm glad this passes in LLVM18 too.

As the one with context from #138135,
r? @Mark-Simulacrum

bors · 2025-03-26T07:36:38Z

☔ The latest upstream changes (presumably #138956) made this pull request unmergeable. Please resolve the merge conflicts.

Mark-Simulacrum · 2025-04-12T14:18:33Z

@bors r+

bors · 2025-04-12T14:18:36Z

📌 Commit cf991d9 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

…-Simulacrum Use the chaining methods on PartialOrd for slices too rust-lang#138135 added these doc-hidden trait methods to improve the tuple codegen. This PR adds more implementations and callers so that the codegen for slice (and array) comparisons also improves.

…enton Rollup of 9 pull requests Successful merges: - rust-lang#137494 (libstd: init(): dup() subsequent /dev/nulls instead of opening them again) - rust-lang#138881 (Use the chaining methods on PartialOrd for slices too) - rust-lang#138972 (std: Fix build for NuttX targets) - rust-lang#139107 (std: make `cmath` functions safe) - rust-lang#139607 (Add regression test for rust-lang#127424) - rust-lang#139691 (Document that `opt-dist` requires metrics to be enabled) - rust-lang#139707 (Fix comment in bootstrap) - rust-lang#139708 (Fix name of field in doc comment) - rust-lang#139709 (bootstrap: fix typo in doc string) r? `@ghost` `@rustbot` modify labels: rollup

…-Simulacrum Use the chaining methods on PartialOrd for slices too rust-lang#138135 added these doc-hidden trait methods to improve the tuple codegen. This PR adds more implementations and callers so that the codegen for slice (and array) comparisons also improves.

…enton Rollup of 8 pull requests Successful merges: - rust-lang#138881 (Use the chaining methods on PartialOrd for slices too) - rust-lang#138972 (std: Fix build for NuttX targets) - rust-lang#139107 (std: make `cmath` functions safe) - rust-lang#139607 (Add regression test for rust-lang#127424) - rust-lang#139691 (Document that `opt-dist` requires metrics to be enabled) - rust-lang#139707 (Fix comment in bootstrap) - rust-lang#139708 (Fix name of field in doc comment) - rust-lang#139709 (bootstrap: fix typo in doc string) r? `@ghost` `@rustbot` modify labels: rollup

ChrisDenton · 2025-04-13T01:06:16Z

Looks like this failed in rollup #139723 (comment)

@bors r-

scottmcm · 2025-04-13T05:12:17Z

Doh, I hard-coded i64 in GEP without realizing it's not a 64-bit-only test. Fixed by just accepting i32|i64 there.

scottmcm · 2025-04-13T06:17:40Z

@bors r=Mark-Simulacrum rollup=iffy (I think I fixed the codegen test, but who knows if there's something else I missed)

bors · 2025-04-13T06:17:43Z

📌 Commit 756670f has been approved by Mark-Simulacrum

It is now in the queue for this repository.

bors · 2025-04-13T11:47:26Z

⌛ Testing commit 756670f with merge 53d4476...

bors · 2025-04-13T14:55:37Z

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 53d4476 to master...

github-actions · 2025-04-13T14:57:19Z

What is this?

This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 65fa0ab (parent) -> 53d4476 (this PR)

Test differences

Show 2 test diffs

Stage 1

[codegen] tests/codegen/array-cmp.rs: pass -> ignore (ignored when randomizing layouts ((checks depend on tuple layout))) (J1)

Stage 2

[codegen] tests/codegen/array-cmp.rs: pass -> ignore (ignored when randomizing layouts ((checks depend on tuple layout))) (J0)

Job group index

J0: x86_64-gnu-llvm-19-1, x86_64-gnu-llvm-19-2
J1: x86_64-gnu-llvm-19-3

Job duration changes

dist-x86_64-apple: 7777.8s -> 9709.6s (24.8%)
dist-riscv64-linux: 4855.1s -> 5246.0s (8.1%)
x86_64-rust-for-linux: 2603.4s -> 2789.6s (7.2%)
dist-various-2: 3351.8s -> 3564.8s (6.4%)
dist-x86_64-freebsd: 5100.2s -> 4813.9s (-5.6%)
dist-loongarch64-linux: 6252.1s -> 6552.8s (4.8%)
x86_64-gnu-llvm-19-1: 5196.3s -> 5438.3s (4.7%)
x86_64-apple-2: 6367.8s -> 6619.8s (4.0%)
aarch64-gnu-debug: 4088.9s -> 4226.1s (3.4%)
dist-x86_64-mingw: 7564.4s -> 7809.2s (3.2%)

How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

rust-timer · 2025-04-13T16:13:22Z

Finished benchmarking commit (53d4476): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary 2.8%, secondary 8.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	2.8%	[2.8%, 2.8%]	1
Regressions ❌ (secondary)	8.3%	[8.3%, 8.3%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	2.8%	[2.8%, 2.8%]	1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary -0.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	0.1%	[0.1%, 0.1%]	2
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-0.2%	[-0.5%, -0.1%]	7
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-0.2%	[-0.5%, 0.1%]	9

Bootstrap: 779.489s -> 780.32s (0.11%)
Artifact size: 365.53 MiB -> 365.51 MiB (-0.01%)

…imulacrum Use the chaining methods on PartialOrd for slices too rust-lang#138135 added these doc-hidden trait methods to improve the tuple codegen. This PR adds more implementations and callers so that the codegen for slice (and array) comparisons also improves.

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 24, 2025

scottmcm commented Mar 24, 2025

View reviewed changes

rustbot assigned Mark-Simulacrum Mar 24, 2025

scottmcm marked this pull request as ready for review March 24, 2025 23:54

scottmcm removed the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Mar 25, 2025

codegen test for non-memcmp array comparison

19648ce

scottmcm force-pushed the more-chaining-ord branch from 17ab2d3 to 877f8de Compare March 31, 2025 07:21

This comment has been minimized.

Sign in to view

scottmcm force-pushed the more-chaining-ord branch from 877f8de to cf991d9 Compare March 31, 2025 07:53

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 12, 2025

ChrisDenton mentioned this pull request Apr 12, 2025

Rollup of 9 pull requests #139717

Closed

ChrisDenton mentioned this pull request Apr 12, 2025

Rollup of 8 pull requests #139723

Closed

bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 13, 2025

Extend the chaining logic to slices too

756670f

scottmcm force-pushed the more-chaining-ord branch from cf991d9 to 756670f Compare April 13, 2025 05:10

thaliaarchi mentioned this pull request Apr 13, 2025

std: Fix build for NuttX targets #138972

Merged

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 13, 2025

bors added the merged-by-bors This PR was explicitly merged by bors. label Apr 13, 2025

bors merged commit 53d4476 into rust-lang:master Apr 13, 2025
7 checks passed

rustbot added this to the 1.88.0 milestone Apr 13, 2025

bors mentioned this pull request Apr 13, 2025

Implement Random for tuple #136801

Open

scottmcm deleted the more-chaining-ord branch April 13, 2025 17:25

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use the chaining methods on PartialOrd for slices too #138881

Use the chaining methods on PartialOrd for slices too #138881

scottmcm commented Mar 24, 2025 •

edited

Loading

scottmcm Mar 24, 2025 •

edited

Loading

scottmcm Mar 31, 2025

scottmcm Mar 24, 2025

scottmcm commented Mar 24, 2025

bors commented Mar 26, 2025

This comment has been minimized.

Mark-Simulacrum commented Apr 12, 2025

bors commented Apr 12, 2025

ChrisDenton commented Apr 13, 2025

scottmcm commented Apr 13, 2025 •

edited

Loading

scottmcm commented Apr 13, 2025

bors commented Apr 13, 2025

bors commented Apr 13, 2025

bors commented Apr 13, 2025

github-actions bot commented Apr 13, 2025

Stage 1

Stage 2

rust-timer commented Apr 13, 2025

Use the chaining methods on PartialOrd for slices too #138881

Use the chaining methods on PartialOrd for slices too #138881

Conversation

scottmcm commented Mar 24, 2025 • edited Loading

scottmcm Mar 24, 2025 • edited Loading

Choose a reason for hiding this comment

scottmcm Mar 31, 2025

Choose a reason for hiding this comment

scottmcm Mar 24, 2025

Choose a reason for hiding this comment

scottmcm commented Mar 24, 2025

bors commented Mar 26, 2025

This comment has been minimized.

Mark-Simulacrum commented Apr 12, 2025

bors commented Apr 12, 2025

ChrisDenton commented Apr 13, 2025

scottmcm commented Apr 13, 2025 • edited Loading

scottmcm commented Apr 13, 2025

bors commented Apr 13, 2025

bors commented Apr 13, 2025

bors commented Apr 13, 2025

github-actions bot commented Apr 13, 2025

Test differences

Stage 1

Stage 2

Job duration changes

rust-timer commented Apr 13, 2025

Overall result: no relevant changes - no action needed

Instruction count

Max RSS (memory usage)

Cycles

Binary size

scottmcm commented Mar 24, 2025 •

edited

Loading

scottmcm Mar 24, 2025 •

edited

Loading

scottmcm commented Apr 13, 2025 •

edited

Loading