perf: Suppress telemetry using ContextFlags(usize) instead of bool #2861

bantonsson · 2025-03-25T14:43:41Z

Changes

This PR aligns the `Context` struct and replaces the `bool` with a flags field (`ContextFlags(usize)`). It aims to mitigate the performance impact of #2821 on context attach/detach operations.
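As a rough illustration of the idea, a flags field can carry the suppression state in one bit of a machine word. The sketch below is not the code from this PR: only the name `ContextFlags(usize)` comes from the title, while the bit constant and the method names are assumptions made for the example.

```rust
/// Hypothetical flag word, named after the PR title.
/// The bit layout and method names are assumptions, not the PR's code.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct ContextFlags(usize);

impl ContextFlags {
    /// Assumed bit used to mark "telemetry suppressed".
    const TELEMETRY_SUPPRESSED: usize = 1 << 0;

    /// Returns true if the suppression bit is set.
    fn is_telemetry_suppressed(self) -> bool {
        (self.0 & Self::TELEMETRY_SUPPRESSED) != 0
    }

    /// Returns a copy of the flags with the suppression bit set.
    fn with_telemetry_suppressed(self) -> Self {
        ContextFlags(self.0 | Self::TELEMETRY_SUPPRESSED)
    }
}
```

With this shape, `ContextFlags::default()` starts with no bits set, and further flags could be added later as additional bits without changing the size of the field.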

Merge requirement checklist

CONTRIBUTING guidelines followed
Unit tests added/updated (if applicable)
Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
Changes in public API reviewed (if applicable)

codecov · 2025-03-25T14:47:40Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 81.3%. Comparing base (bc82d4f) to head (2f628ee).

Additional details and impacted files

@@          Coverage Diff          @@
##            main   #2861   +/-   ##
=====================================
  Coverage   81.3%   81.3%           
=====================================
  Files        126     126           
  Lines      24156   24162    +6     
=====================================
+ Hits       19650   19656    +6     
  Misses      4506    4506


scottgerring · 2025-03-26T12:06:59Z

👷 build bot output from this run:

name	baseDuration	changesDuration	difference
context_attach/nested_cx/empty_cx	31.7±0.14ns	19.1±0.27ns	-40
context_attach/nested_cx/single_value_cx	33.7±0.22ns	19.4±0.44ns	-42
context_attach/nested_cx/span_cx	33.0±0.29ns	19.4±0.30ns	-41
context_attach/out_of_order_cx_drop/empty_cx	38.7±0.18ns	19.8±0.19ns	-49
context_attach/out_of_order_cx_drop/single_value_cx	40.1±0.13ns	20.8±0.48ns	-48
context_attach/out_of_order_cx_drop/span_cx	39.7±0.22ns	20.8±0.32ns	-48
context_attach/single_cx/empty_cx	18.1±0.22ns	12.5±0.31ns	-31
context_attach/single_cx/single_value_cx	17.6±0.08ns	12.5±0.13ns	-29
context_attach/single_cx/span_cx	17.3±0.13ns	12.5±0.25ns	-28
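The `difference` column in this table appears to be the relative change from `baseDuration` to `changesDuration` in percent, rounded to a whole number. That reading is an assumption based on the displayed values, not something the bot output states. A minimal sketch of the arithmetic, with a hypothetical helper name:

```rust
/// Relative change from `base_ns` to `changed_ns`, in percent.
/// Negative values mean the changed code is faster.
/// Assumption: this mirrors how the table's `difference` column is derived.
fn percent_change(base_ns: f64, changed_ns: f64) -> f64 {
    (changed_ns - base_ns) / base_ns * 100.0
}
```

For example, the first row goes from 31.7 ns to 19.1 ns, which is (19.1 - 31.7) / 31.7, or roughly -39.7%, and the table lists it as -40.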

bantonsson · 2025-03-27T10:25:19Z

~~These are the benchmark numbers from this run:~~

These numbers are not relevant since the code has changed completely.

bantonsson · 2025-03-28T14:59:46Z

This is still a draft until #2870 has been merged and all benchmarks are run properly.

bantonsson · 2025-04-04T07:39:40Z

New performance numbers from new approach in this run:

name	baseDuration	changesDuration	difference
`context/has_active_span/in-cx/alt`	`8.4±0.03ns`	`8.4±0.05ns`	`0.0`
`context/has_active_span/in-cx/spec`	`5.2±0.17ns`	`5.0±0.16ns`	`-2.9`
`context/has_active_span/no-cx/alt`	`8.4±0.03ns`	`8.4±0.03ns`	`0.0`
`context/has_active_span/no-cx/spec`	`5.0±0.17ns`	`4.7±0.19ns`	`-6.5`
`context/has_active_span/no-sdk/alt`	`8.4±0.02ns`	`8.4±0.02ns`	`0.0`
`context/has_active_span/no-sdk/spec`	`5.0±0.18ns`	`4.7±0.17ns`	`-6.5`
`context/is_recording/in-cx/alt`	`4.7±0.19ns`	`4.7±0.14ns`	`0.0`
`context/is_recording/in-cx/spec`	`7.5±0.22ns`	`7.5±0.23ns`	`0.0`
`context/is_recording/no-cx/alt`	`4.7±0.14ns`	`4.7±0.17ns`	`0.0`
`context/is_recording/no-cx/spec`	`7.2±0.36ns`	`7.2±0.39ns`	`+1.0`
`context/is_recording/no-sdk/alt`	`4.7±0.13ns`	`4.7±0.14ns`	`0.0`
`context/is_recording/no-sdk/spec`	`7.2±0.21ns`	`7.2±0.24ns`	`0.0`
`context/is_sampled/in-cx/alt`	`8.7±0.03ns`	`8.7±0.04ns`	`0.0`
`context/is_sampled/in-cx/spec`	`5.4±0.15ns`	`5.3±0.18ns`	`-0.99`
`context/is_sampled/no-cx/alt`	`8.7±0.03ns`	`8.7±0.04ns`	`0.0`
`context/is_sampled/no-cx/spec`	`5.1±0.53ns`	`5.0±0.17ns`	`-0.99`
`context/is_sampled/no-sdk/alt`	`8.7±0.03ns`	`8.7±0.05ns`	`0.0`
`context/is_sampled/no-sdk/spec`	`5.0±0.06ns`	`5.0±0.19ns`	`0.0`
`context_attach/nested_cx/empty_cx`	`47.4±1.16ns`	`39.0±1.06ns`	`-18`
`context_attach/nested_cx/single_value_cx`	`48.7±1.19ns`	`42.8±1.00ns`	`-12`
`context_attach/nested_cx/span_cx`	`48.5±0.44ns`	`43.0±4.56ns`	`-12`
`context_attach/out_of_order_cx_drop/empty_cx`	`41.7±0.94ns`	`40.6±1.20ns`	`-2.9`
`context_attach/out_of_order_cx_drop/single_value_cx`	`42.3±2.58ns`	`42.0±0.89ns`	`-0.99`
`context_attach/out_of_order_cx_drop/span_cx`	`42.5±0.87ns`	`42.0±0.85ns`	`-0.99`
`context_attach/single_cx/empty_cx`	`24.0±0.59ns`	`19.4±0.36ns`	`-19`
`context_attach/single_cx/single_value_cx`	`23.4±0.52ns`	`23.4±0.44ns`	`0.0`
`context_attach/single_cx/span_cx`	`23.4±0.65ns`	`23.1±0.51ns`	`-2.0`
`exporter_disabled_concurrent_processor`	`2.9±0.29ns`	`3.1±0.08ns`	`+7.0`
`exporter_disabled_simple_processor`	`6.0±0.26ns`	`6.2±0.18ns`	`+5.0`
`telemetry_suppression/enter_telemetry_suppressed_scope`	`27.1±0.18ns`	`25.2±0.46ns`	`-7.4`
`telemetry_suppression/is_current_telemetry_suppressed_false`	`1.4±0.02ns`	`1.3±0.03ns`	`-6.5`
`telemetry_suppression/is_current_telemetry_suppressed_true`	`1.4±0.05ns`	`1.3±0.03ns`	`-6.5`
`telemetry_suppression/normal_attach`	`30.4±0.62ns`	`28.0±0.86ns`	`-7.4`

cijothomas · 2025-04-04T15:25:33Z

@bantonsson can you run the bench on your machine and see if you observe the same? I am seeing a regression on my laptop. There are improvements to the attach benchmarks anyway, so we should still proceed with this PR, but I am curious how much we can trust the bench results from the CI machines!

telemetry_suppression/enter_telemetry_suppressed_scope
time: [10.170 ns 10.198 ns 10.224 ns]
change: [+8.3229% +8.8793% +9.4188%] (p = 0.00 < 0.05)
Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
2 (2.00%) low severe
3 (3.00%) low mild
1 (1.00%) high severe
telemetry_suppression/normal_attach
time: [11.386 ns 11.440 ns 11.495 ns]
change: [+9.0850% +9.5688% +10.094%] (p = 0.00 < 0.05)
Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
7 (7.00%) low mild
2 (2.00%) high mild
telemetry_suppression/is_current_telemetry_suppressed_false
time: [729.10 ps 731.32 ps 733.54 ps]
change: [-2.4232% -1.9738% -1.5230%] (p = 0.00 < 0.05)
Performance has improved.
Found 20 outliers among 100 measurements (20.00%)
4 (4.00%) low severe
8 (8.00%) low mild
4 (4.00%) high mild
4 (4.00%) high severe
telemetry_suppression/is_current_telemetry_suppressed_true
time: [730.30 ps 732.13 ps 734.16 ps]
change: [-1.7477% -1.2922% -0.7619%] (p = 0.00 < 0.05)
Change within noise threshold.
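The last two results above have similar change intervals but different verdicts. A sketch of how that can happen, assuming Criterion compares both ends of the change confidence interval against a noise threshold (1% by default, to the best of my knowledge) once the p-value check has passed. The enum and function names here are hypothetical and only model that comparison step:

```rust
#[derive(Debug, PartialEq)]
enum Verdict {
    Improved,
    Regressed,
    WithinNoise,
}

/// Classifies a change confidence interval given as fractions
/// (for example -0.0242 for -2.42%).
/// Assumption: a verdict other than `WithinNoise` requires BOTH bounds
/// to lie beyond the noise threshold on the same side.
fn classify(lower: f64, upper: f64, noise: f64) -> Verdict {
    if lower < -noise && upper < -noise {
        Verdict::Improved
    } else if lower > noise && upper > noise {
        Verdict::Regressed
    } else {
        Verdict::WithinNoise
    }
}
```

Under that assumption, the interval [-2.4232%, -1.5230%] lies entirely below -1% and counts as an improvement, while [-1.7477%, -0.7619%] crosses -1% and is reported as within the noise threshold.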

bantonsson · 2025-04-07T07:44:21Z

@cijothomas These are the numbers from my M1 Max laptop, where I see a slight regression in the checks and large improvements in entering.

Benchmarking telemetry_suppression/enter_telemetry_suppressed_scope: Collecting 100 samples in estimated 2.0001 s (135M i
telemetry_suppression/enter_telemetry_suppressed_scope
                        time:   [14.479 ns 14.511 ns 14.549 ns]
                        change: [-21.316% -20.968% -20.611%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 11 outliers among 100 measurements (11.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  7 (7.00%) high severe
telemetry_suppression/normal_attach
                        time:   [15.431 ns 15.483 ns 15.550 ns]
                        change: [-18.631% -18.300% -17.951%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  6 (6.00%) low mild
  1 (1.00%) high mild
  5 (5.00%) high severe
Benchmarking telemetry_suppression/is_current_telemetry_suppressed_false: Collecting 100 samples in estimated 2.0000 s (1
telemetry_suppression/is_current_telemetry_suppressed_false
                        time:   [1.1357 ns 1.1436 ns 1.1545 ns]
                        change: [+2.1229% +2.7044% +3.3306%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe
Benchmarking telemetry_suppression/is_current_telemetry_suppressed_true: Collecting 100 samples in estimated 2.0000 s (1.
telemetry_suppression/is_current_telemetry_suppressed_true
                        time:   [1.1311 ns 1.1367 ns 1.1438 ns]
                        change: [+1.1998% +1.6296% +2.0577%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 23 outliers among 100 measurements (23.00%)
  2 (2.00%) low severe
  3 (3.00%) low mild
  4 (4.00%) high mild
  14 (14.00%) high severe

And these are the numbers from my AMD Ryzen 5 3600 2.2GHz box, where I see a slight improvement in checks and large improvements in entering.

Benchmarking telemetry_suppression/enter_telemetry_suppressed_scope: Collecting 100 samples in estimated 2.0001 s (85M iteratio
telemetry_suppression/enter_telemetry_suppressed_scope
                        time:   [23.078 ns 23.081 ns 23.083 ns]
                        change: [-16.623% -16.588% -16.558%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe
telemetry_suppression/normal_attach
                        time:   [24.008 ns 24.024 ns 24.039 ns]
                        change: [-17.823% -17.631% -17.412%] (p = 0.00 < 0.05)
                        Performance has improved.
Benchmarking telemetry_suppression/is_current_telemetry_suppressed_false: Collecting 100 samples in estimated 2.0000 s (2.8B it
telemetry_suppression/is_current_telemetry_suppressed_false
                        time:   [715.65 ps 715.73 ps 715.82 ps]
                        change: [-1.7774% -1.7447% -1.7127%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  3 (3.00%) high severe
Benchmarking telemetry_suppression/is_current_telemetry_suppressed_true: Collecting 100 samples in estimated 2.0000 s (2.8B ite
telemetry_suppression/is_current_telemetry_suppressed_true
                        time:   [715.54 ps 715.63 ps 715.73 ps]
                        change: [-3.6408% -3.0627% -2.5393%] (p = 0.00 < 0.05)
                        Performance has improved.

The code seems to be highly sensitive to alignment, so use a bitfield instead of a boolean.
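One way to look at the layout side of this is to compare the size and alignment of a struct that ends in a `bool` with one that ends in a `usize`. The structs below are hypothetical stand-ins, not the real `Context` definition, and the field names and types are assumptions. The sketch only shows that the `bool` variant is padded up to the same size, so a word-sized flags field costs no extra memory. It does not by itself explain the benchmark differences.

```rust
use std::mem::{align_of, size_of};
use std::sync::Arc;

/// Hypothetical stand-in for a context that stores suppression as a bool.
#[allow(dead_code)]
struct WithBool {
    span: Option<Arc<u64>>,
    entries: Option<Arc<u64>>,
    suppress_telemetry: bool,
}

/// Hypothetical stand-in for a context that stores a word-sized flags field.
#[allow(dead_code)]
struct WithFlags {
    span: Option<Arc<u64>>,
    entries: Option<Arc<u64>>,
    flags: usize,
}

/// Formats the size and alignment of both variants for inspection.
fn layout_report() -> String {
    format!(
        "bool: size={} align={}; flags: size={} align={}",
        size_of::<WithBool>(),
        align_of::<WithBool>(),
        size_of::<WithFlags>(),
        align_of::<WithFlags>()
    )
}
```

Printing `layout_report()` on a given target shows both variants side by side, which makes it easy to check whether the trailing `bool` leaves padding bytes that the flags field would otherwise fill.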

bantonsson force-pushed the ban/context-suppression branch 3 times, most recently from d79a782 to dc038f9 Compare March 25, 2025 15:36

scottgerring added the performance label Mar 26, 2025

scottgerring closed this Mar 26, 2025

scottgerring reopened this Mar 26, 2025

bantonsson force-pushed the ban/context-suppression branch from dc038f9 to 06eb4a1 Compare March 27, 2025 08:10

bantonsson marked this pull request as ready for review March 27, 2025 10:28

bantonsson requested a review from a team as a code owner March 27, 2025 10:28

bantonsson force-pushed the ban/context-suppression branch 4 times, most recently from e3aca49 to ad3ad40 Compare March 28, 2025 09:14

bantonsson marked this pull request as draft March 28, 2025 14:58

bantonsson force-pushed the ban/context-suppression branch 6 times, most recently from baad2fe to 01874de Compare April 3, 2025 15:41

bantonsson changed the title ~~perf: Optimize cloning of Context since it is immutable~~ perf: Suppress telemetry using ContextFlags(usize) instead of bool Apr 4, 2025

bantonsson force-pushed the ban/context-suppression branch from 01874de to 3416528 Compare April 4, 2025 07:40

bantonsson marked this pull request as ready for review April 4, 2025 07:42

perf: Suppress telemetry using ContextFlags(usize) instead of bool

2f628ee

The code seems to be highly sensitive to alignment, so use a bitfield instead of a boolean.

bantonsson force-pushed the ban/context-suppression branch from fe37e71 to 2f628ee Compare April 7, 2025 07:54

bantonsson mentioned this pull request Apr 9, 2025

REQUEST: New membership for @bantonsson open-telemetry/community#2649

Closed

6 tasks
