Skip to content

Add CallModifier.never_intrinsify #21900

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

ParfenovIgor
Copy link

issue
I attempted to implement this. The string attribute no-builtins is passed to llvm. (I checked it explicitly in this place.) However, there is still recursion in generated code. I doubt, if this attribute is implemented in llvm.

Also, I don't much understand, what should I do in Sema.zig, and I don't know how to correctly bootstrap (note, that the field never_intrinsify is the last in the enum).

@alexrp
Copy link
Member

alexrp commented Nov 3, 2024

Thanks for taking a stab at this.

Note that the linked issue is a proposal. This PR can't be merged until the proposal is accepted.

Also, the attribute you want for this purpose is nobuiltin.

@alexrp alexrp linked an issue Nov 3, 2024 that may be closed by this pull request
@alexrp
Copy link
Member

alexrp commented Nov 3, 2024

@mlugg maybe you can give some pointers for the Sema changes?

@ParfenovIgor
Copy link
Author

I was a bit confused by this message
I've just tried to use nobuiltin instead of no-builtins, but this didn't change the llvm's output.

@alexrp
Copy link
Member

alexrp commented Nov 3, 2024

Have you verified that the resulting LLVM IR in fact contains nobuiltin on the call site?

Also compare pre-optimization (--verbose-llvm-ir) vs post-optimization (-femit-llvm-ir).

@ParfenovIgor
Copy link
Author

ParfenovIgor commented Nov 4, 2024

Yes

; Function Attrs: noimplicitfloat noredzone nosanitize_coverage nounwind skipprofile uwtable
define weak_odr dso_local void @__aeabi_memcpy(ptr nonnull align 1 %0, ptr nonnull align 1 %1, i32 %2) #1 {
...
  %13 = load i32, ptr %12, align 4
  %14 = call ptr @memcpy(ptr noalias align 1 %9, ptr noalias readonly align 1 %11, i32 %13) #9
  ret void
}
...
attributes #9 = { nobuiltin }

This is -femit-llvm-ir.

If I understand correctly, there is no way to use --verbose-llvm-ir in zig build. I used this line: ../build/stage4/bin/zig build-obj main.zig --verbose-llvm-ir=program.ll -target arm-freestanding-eabi -fcompiler-rt -fdata-sections -ffunction-sections. The behavior is the same (nobuiltin exists). But with -femit-llvm-ir the code became bigger, so probably I converted to line incorrectly.

@alexrp
Copy link
Member

alexrp commented Nov 4, 2024

I don't see anything wrong in that LLVM IR. It's retaining the memcpy call rather than turning it into an intrinsic call, and that's the behavior we want. If you compile it to machine code, you should hopefully now see bl memcpy in the assembly for __aeabi_memcpy.

@ParfenovIgor
Copy link
Author

But actually, I get this after disassembly.
llvm-objdump -D zig-out/compiler_rt.o

00000000 <__aeabi_memcpy>:
       0: 00 48 2d e9  	push	{r11, lr}
       4: 0d b0 a0 e1  	mov	r11, sp
       8: 10 d0 4d e2  	sub	sp, sp, #16
       c: 04 00 8d e5  	str	r0, [sp, #4]
      10: 08 10 8d e5  	str	r1, [sp, #8]
      14: 0c 20 8d e5  	str	r2, [sp, #12]
      18: 04 00 9d e5  	ldr	r0, [sp, #4]
      1c: 08 10 9d e5  	ldr	r1, [sp, #8]
      20: 0c 20 9d e5  	ldr	r2, [sp, #12]
      24: fe ff ff eb  	bl	0x24 <__aeabi_memcpy+0x24> @ imm = #-8
      28: 0b d0 a0 e1  	mov	sp, r11
      2c: 00 88 bd e8  	pop	{r11, pc}

(Those str and ldr instructions appear, once I wrap the call into @call, independently of attribute used.)

@alexrp
Copy link
Member

alexrp commented Nov 4, 2024

24: fe ff ff eb bl 0x24 <__aeabi_memcpy+0x24> @ imm = #-8

This looks like a call with a relocation. Try also passing -r to llvm-objdump to see what it's for.

@ParfenovIgor
Copy link
Author

I compared disassembly with -r between with @call(.never_intrinsify...) and without.

<        8: fe ff ff eb  	bl	0x8 <__aeabi_memcpy4+0x8> @ imm = #-8
< 			00000008:  R_ARM_CALL	memcpy
<        c: 00 88 bd e8  	pop	{r11, pc}
---
>        8: 10 d0 4d e2  	sub	sp, sp, #16
>        c: 04 00 8d e5  	str	r0, [sp, #4]
>       10: 08 10 8d e5  	str	r1, [sp, #8]
>       14: 0c 20 8d e5  	str	r2, [sp, #12]
>       18: 04 00 9d e5  	ldr	r0, [sp, #4]
>       1c: 08 10 9d e5  	ldr	r1, [sp, #8]
>       20: 0c 20 9d e5  	ldr	r2, [sp, #12]
>       24: fe ff ff eb  	bl	0x24 <__aeabi_memcpy4+0x24> @ imm = #-8
> 			00000024:  R_ARM_CALL	memcpy
>       28: 0b d0 a0 e1  	mov	sp, r11
>       2c: 00 88 bd e8  	pop	{r11, pc}

Moreover if I compare @call(.never_intrinsify...) and @call(.auto...), the disassembly of the function __aeabi_*** is identical.

Also in readelf in all cases:

Relocation section '.rel.text.__aeabi_memcpy' at offset 0x7c28c contains 1 entry:
 Offset     Info    Type            Sym.Value  Sym. Name
00000024  000ac21c R_ARM_CALL        00000000   memcpy

I don't much understand, how the relocation works, but I think, originally there were no recursive call.

@alexrp
Copy link
Member

alexrp commented Nov 4, 2024

I think reproducing the original bug is a bit involved. For some reason, I was only able to do it with the build.zig and instructions provided in #21831.

@ParfenovIgor
Copy link
Author

Oh, yes, I forgot about --release=safe.

With this when I don't use .never_intrinsify, then I see 00000000: R_ARM_JUMP24 __aeabi_memcpy in all functions.

When I used it, I see 00000000: R_ARM_JUMP24 __aeabi_memcpy in __aeabi_memcpy4 and __aeabi_memcpy8 and inlined memset implementation in __aeabi_memcpy.

@alexrp
Copy link
Member

alexrp commented Nov 5, 2024

Sounds like the change is working as intended then, if I understand your comment correctly.

In that case, we just need to wait for the proposal to be considered by Andrew.

@@ -5585,6 +5586,7 @@ pub const FuncGen = struct {
switch (modifier) {
.auto, .never_tail, .always_tail => {},
.never_inline => try attributes.addFnAttr(.@"noinline", &o.builder),
.never_intrinsify => try attributes.addFnAttr(.nobuiltin, &o.builder),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the handling of never_intrinsify will need to be combined with the code I added here: https://github.com/ziglang/zig/pull/21920/files#diff-4b7d5a9d78d51e4f41a41d80317e1ca360c16103d76b09c3dcb73ab3e4cc2d71R5579-R5581

@andrewrk
Copy link
Member

closing in favor of #22110

@andrewrk andrewrk closed this Feb 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Proposal: Add never_intrinsify to std.builtin.CallModifier
3 participants