Skip to content

fix: sigsev in valgrind due to incorrect optimization #2

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

not-matthias
Copy link
Member

@not-matthias not-matthias commented May 13, 2025

See linear issue for details. The main issue was having zeroStats and startInstrumentation in one function (with both being inlined). This seemed to confuse the compiler. Exporting one of the functions (startInstrumentation in this case) fixes the issue.

Also exported stopInstrumentation since we might want to call it manually later on to reduce the overhead (if there is one).

I'll add a CI step tomorrow:

  • Add a CI test to check for this (use just test-c)

Copy link
Member

@art049 art049 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@not-matthias is this the only way? Not inlining the instrumentation calls is an issue for accuracy of the measurement as we discussed already

@@ -48,6 +48,6 @@ pub inline fn startInstrumentation() void {
/// Use this to bypass Callgrind aggregation for uninteresting code parts.
/// To start Callgrind in this mode to ignore the setup phase, use
/// the option "--instr-atstart=no".
pub inline fn stopInstrumentation() void {
pub export fn callgrind_stop_instrumentation() void {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, not being able to inline this is kindof an issue

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I chose to export this was the following:

  1. The export will add 1 additional call to this function, which is not that much considering that we have a few instructions to setup/clean the stack before/afterwards.
  2. We can call this exported function directly from within the benchmark. We can't really control how the compiler lowers the function to assembly.

The callgrind_start_benchmark function doesn't cause any overhead since we directly return to the benchmark:
image


stop_benchmark when using export:
image

stop_benchmark when using inline:
image

As seen on the two examples above, we will execute at least 10+ other instructions to setup the stack, which will cause overhead. The only real way to ensure that we will always call it at the start would be using inline assembly which has an additional maintenance burden.


So my proposed solution is as follows:

  1. Keep the exports for starting and stopping valgrind, and call them in the root.
  2. If we are using valgrind for interpreted languages such as Python, the overhead of 10 instructions will be negligible. If we want to use valgrind for Rust or C++, we can call the callgrind_start/stop_benchmark functions directly.

@not-matthias not-matthias force-pushed the cod-868-sigsev-when-running-valgrind-with-zig-library branch 6 times, most recently from a671f32 to 0fb92ee Compare May 14, 2025 10:26
@not-matthias
Copy link
Member Author

not-matthias commented May 14, 2025

Looks like it fails to compile in CI with export, even though it works locally. Very weird 🤔

I'll invest a bit more time to try to figure out what's going on and to see if we can fix it, if that doesn't work I think it's best to just revert back to wrapping the valgrind header. There's little benefit in transpiling C -> Zig -> C.

EDIT: Turns out the issue is wrong inline assembly in the Zig std library. Using the inline assembly as defined in the valgrind header works.

     .x86_64 => asm volatile (
            \\ rolq  $3,    %%rdi ; rolq $13, %%rdi
            \\ rolq  $61,   %%rdi ; rolq $51, %%rdi
            \\ xchgq %%rbx, %%rbx
            : [_] "={rdx}" (-> usize),
            : [_] "a" (args),
              [_] "0" (default),
            : "cc", "memory"
        ),

However, this doesn't work in Zig for some reason:

zig build test --summary all
test
└─ run test
   └─ zig test Debug native failure
error: error: couldn't allocate input reg for constraint 'a'

I think the best way to deal with this is to just use the header from valgrind, then we don't have to deal with the inline assembly and differences between clang/gcc/zig. Inline assembly is currently also still unstable with major work going on (ziglang/zig#5241) so it's just safer to not yet rely on it.

@not-matthias
Copy link
Member Author

To summarize the final results:

  • We're now using the original valgrind header with wrapper C functions. This way we don't introduce any errors when translating the inline assembly to Zig and then back to C. Inline assembly is currently marked as unstable, so we shouldn't rely on it.
  • If we want absolute precision, we should either implement the callgrind marker in the target language or call the callgrind_start/stop_instrumentation wrapper functions directly. We can't control how the compiler arranges our generic start_benchmark function and it doesn't make sense doing so.

@not-matthias not-matthias requested a review from art049 May 14, 2025 15:24
@not-matthias not-matthias force-pushed the cod-868-sigsev-when-running-valgrind-with-zig-library branch from 9d8502e to 709cb61 Compare May 14, 2025 16:50
Copy link
Member

@art049 art049 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, thanks for the detailed analysis

@not-matthias not-matthias force-pushed the cod-868-sigsev-when-running-valgrind-with-zig-library branch from 709cb61 to acc213d Compare May 14, 2025 17:17
@not-matthias not-matthias merged commit acc213d into main May 14, 2025
5 checks passed
@not-matthias not-matthias deleted the cod-868-sigsev-when-running-valgrind-with-zig-library branch May 14, 2025 17:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants