Skip to content

Use the chaining methods on PartialOrd for slices too #138881

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Apr 13, 2025

Conversation

scottmcm
Copy link
Member

@scottmcm scottmcm commented Mar 24, 2025

#138135 added these doc-hidden trait methods to improve the tuple codegen. This PR adds more implementations and callers so that the codegen for slice (and array) comparisons also improves.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Mar 24, 2025

// CHECK-LABEL: @array_of_tuple_le
#[no_mangle]
pub fn array_of_tuple_le(a: &[(i16, u16); 2], b: &[(i16, u16); 2]) -> bool {
Copy link
Member Author

@scottmcm scottmcm Mar 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nightly, note the setl+setg from the cmp methods:

array_of_tuple_le:
        movzwl  (%rdi), %eax
        movzwl  (%rsi), %ecx
        cmpw    %cx, %ax
        setl    %cl
        setg    %al
        jne     .LBB0_1
        movzwl  2(%rdi), %eax
        cmpw    2(%rsi), %ax
        seta    %al
        sbbb    $0, %al
        testb   %al, %al
        jne     .LBB0_7
        jmp     .LBB0_4
.LBB0_1:
        subb    %cl, %al
        testb   %al, %al
        jne     .LBB0_7
.LBB0_4:
        movzwl  4(%rdi), %eax
        movzwl  4(%rsi), %ecx
        cmpw    %cx, %ax
        setl    %cl
        setg    %al
        jne     .LBB0_5
        movzwl  6(%rdi), %eax
        cmpw    6(%rsi), %ax
        seta    %al
        sbbb    $0, %al
        jmp     .LBB0_7
.LBB0_5:
        subb    %cl, %al
.LBB0_7:
        testb   %al, %al
        setle   %al
        retq

With this PR:

array_of_tuple_le:
	movzwl	(%rcx), %eax
	movzwl	(%rdx), %r8d
	cmpw	%r8w, %ax
	jne	.LBB1_1
	movzwl	2(%rcx), %r8d
	movzwl	2(%rdx), %r9d
	cmpw	%r9w, %r8w
	jne	.LBB1_2
	movzwl	4(%rcx), %eax
	movzwl	4(%rdx), %r8d
	cmpw	%r8w, %ax
	jne	.LBB1_1
	movzwl	6(%rcx), %r8d
	movzwl	6(%rdx), %r9d
	movb	$1, %al
	cmpw	%r9w, %r8w
	jne	.LBB1_2
	retq
.LBB1_1:
	cmpw	%r8w, %ax
	setle	%al
	retq
.LBB1_2:
	cmpw	%r9w, %r8w
	setb	%al
	retq

That's so nice -- straight line with no jumps taken if they're equal, otherwise jumps to the check it needs (dedup'd by consistent register choices), saves that, and returns.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that even with #133984 landed, we still don't optimize away the [su]cmps in nightly today: https://rust.godbolt.org/z/ss5E7q6PT.

Comment on lines +55 to +61
// This is certainly not the obvious way to implement these methods.
// Unfortunately, using anything that looks at the discriminant means that
// LLVM sees a check for `2` (aka `ControlFlow<bool>::Continue(())`) and
// gets very distracted by that, ending up generating extraneous code.
// This should be changed to something simpler once either LLVM is smarter,
// see <https://github.com/llvm/llvm-project/issues/132678>, or we generate
// niche discriminant checks in a way that doesn't trigger it.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So there's a backlink: llvm/llvm-project#132678

@scottmcm
Copy link
Member Author

Oh, I'm glad this passes in LLVM18 too.

As the one with context from #138135,
r? @Mark-Simulacrum

@scottmcm scottmcm marked this pull request as ready for review March 24, 2025 23:54
@scottmcm scottmcm removed the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Mar 25, 2025
@bors
Copy link
Collaborator

bors commented Mar 26, 2025

☔ The latest upstream changes (presumably #138956) made this pull request unmergeable. Please resolve the merge conflicts.

@scottmcm scottmcm force-pushed the more-chaining-ord branch from 17ab2d3 to 877f8de Compare March 31, 2025 07:21
@rust-log-analyzer

This comment has been minimized.

@scottmcm scottmcm force-pushed the more-chaining-ord branch from 877f8de to cf991d9 Compare March 31, 2025 07:53
@Mark-Simulacrum
Copy link
Member

@bors r+

@bors
Copy link
Collaborator

bors commented Apr 12, 2025

📌 Commit cf991d9 has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Apr 12, 2025
ChrisDenton added a commit to ChrisDenton/rust that referenced this pull request Apr 12, 2025
…-Simulacrum

Use the chaining methods on PartialOrd for slices too

rust-lang#138135 added these doc-hidden trait methods to improve the tuple codegen.  This PR adds more implementations and callers so that the codegen for slice (and array) comparisons also improves.
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 12, 2025
…enton

Rollup of 9 pull requests

Successful merges:

 - rust-lang#137494 (libstd: init(): dup() subsequent /dev/nulls instead of opening them again)
 - rust-lang#138881 (Use the chaining methods on PartialOrd for slices too)
 - rust-lang#138972 (std: Fix build for NuttX targets)
 - rust-lang#139107 (std: make `cmath` functions safe)
 - rust-lang#139607 (Add regression test for rust-lang#127424)
 - rust-lang#139691 (Document that `opt-dist` requires metrics to be enabled)
 - rust-lang#139707 (Fix comment in bootstrap)
 - rust-lang#139708 (Fix name of field in doc comment)
 - rust-lang#139709 (bootstrap: fix typo in doc string)

r? `@ghost`
`@rustbot` modify labels: rollup
ChrisDenton added a commit to ChrisDenton/rust that referenced this pull request Apr 12, 2025
…-Simulacrum

Use the chaining methods on PartialOrd for slices too

rust-lang#138135 added these doc-hidden trait methods to improve the tuple codegen.  This PR adds more implementations and callers so that the codegen for slice (and array) comparisons also improves.
bors added a commit to rust-lang-ci/rust that referenced this pull request Apr 13, 2025
…enton

Rollup of 8 pull requests

Successful merges:

 - rust-lang#138881 (Use the chaining methods on PartialOrd for slices too)
 - rust-lang#138972 (std: Fix build for NuttX targets)
 - rust-lang#139107 (std: make `cmath` functions safe)
 - rust-lang#139607 (Add regression test for rust-lang#127424)
 - rust-lang#139691 (Document that `opt-dist` requires metrics to be enabled)
 - rust-lang#139707 (Fix comment in bootstrap)
 - rust-lang#139708 (Fix name of field in doc comment)
 - rust-lang#139709 (bootstrap: fix typo in doc string)

r? `@ghost`
`@rustbot` modify labels: rollup
@ChrisDenton
Copy link
Member

Looks like this failed in rollup #139723 (comment)

@bors r-

@bors bors added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. labels Apr 13, 2025
@scottmcm scottmcm force-pushed the more-chaining-ord branch from cf991d9 to 756670f Compare April 13, 2025 05:10
@scottmcm
Copy link
Member Author

scottmcm commented Apr 13, 2025

Doh, I hard-coded i64 in GEP without realizing it's not a 64-bit-only test. Fixed by just accepting i32|i64 there.

@scottmcm
Copy link
Member Author

@bors r=Mark-Simulacrum rollup=iffy (I think I fixed the codegen test, but who knows if there's something else I missed)

@bors
Copy link
Collaborator

bors commented Apr 13, 2025

📌 Commit 756670f has been approved by Mark-Simulacrum

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Apr 13, 2025
@bors
Copy link
Collaborator

bors commented Apr 13, 2025

⌛ Testing commit 756670f with merge 53d4476...

@bors
Copy link
Collaborator

bors commented Apr 13, 2025

☀️ Test successful - checks-actions
Approved by: Mark-Simulacrum
Pushing 53d4476 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Apr 13, 2025
@bors bors merged commit 53d4476 into rust-lang:master Apr 13, 2025
7 checks passed
@rustbot rustbot added this to the 1.88.0 milestone Apr 13, 2025
Copy link

What is this? This is an experimental post-merge analysis report that shows differences in test outcomes between the merged PR and its parent PR.

Comparing 65fa0ab (parent) -> 53d4476 (this PR)

Test differences

Show 2 test diffs

Stage 1

  • [codegen] tests/codegen/array-cmp.rs: pass -> ignore (ignored when randomizing layouts ((checks depend on tuple layout))) (J1)

Stage 2

  • [codegen] tests/codegen/array-cmp.rs: pass -> ignore (ignored when randomizing layouts ((checks depend on tuple layout))) (J0)

Job group index

Job duration changes

  1. dist-x86_64-apple: 7777.8s -> 9709.6s (24.8%)
  2. dist-riscv64-linux: 4855.1s -> 5246.0s (8.1%)
  3. x86_64-rust-for-linux: 2603.4s -> 2789.6s (7.2%)
  4. dist-various-2: 3351.8s -> 3564.8s (6.4%)
  5. dist-x86_64-freebsd: 5100.2s -> 4813.9s (-5.6%)
  6. dist-loongarch64-linux: 6252.1s -> 6552.8s (4.8%)
  7. x86_64-gnu-llvm-19-1: 5196.3s -> 5438.3s (4.7%)
  8. x86_64-apple-2: 6367.8s -> 6619.8s (4.0%)
  9. aarch64-gnu-debug: 4088.9s -> 4226.1s (3.4%)
  10. dist-x86_64-mingw: 7564.4s -> 7809.2s (3.2%)
How to interpret the job duration changes?

Job durations can vary a lot, based on the actual runner instance
that executed the job, system noise, invalidated caches, etc. The table above is provided
mostly for t-infra members, for simpler debugging of potential CI slow-downs.

@rust-timer
Copy link
Collaborator

Finished benchmarking commit (53d4476): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary 2.8%, secondary 8.3%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
2.8% [2.8%, 2.8%] 1
Regressions ❌
(secondary)
8.3% [8.3%, 8.3%] 1
Improvements ✅
(primary)
- - 0
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) 2.8% [2.8%, 2.8%] 1

Cycles

This benchmark run did not return any relevant results for this metric.

Binary size

Results (primary -0.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

mean range count
Regressions ❌
(primary)
0.1% [0.1%, 0.1%] 2
Regressions ❌
(secondary)
- - 0
Improvements ✅
(primary)
-0.2% [-0.5%, -0.1%] 7
Improvements ✅
(secondary)
- - 0
All ❌✅ (primary) -0.2% [-0.5%, 0.1%] 9

Bootstrap: 779.489s -> 780.32s (0.11%)
Artifact size: 365.53 MiB -> 365.51 MiB (-0.01%)

@scottmcm scottmcm deleted the more-chaining-ord branch April 13, 2025 17:25
github-actions bot pushed a commit to model-checking/verify-rust-std that referenced this pull request Apr 19, 2025
…imulacrum

Use the chaining methods on PartialOrd for slices too

rust-lang#138135 added these doc-hidden trait methods to improve the tuple codegen.  This PR adds more implementations and callers so that the codegen for slice (and array) comparisons also improves.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants