8359110: Log accumulated GC and process CPU time upon VM exit #25779

JonasNorlinder · 2025-06-12T11:28:14Z

Add support to log CPU cost for GC during VM exit with -Xlog:gc.

[1.500s][info ][gc] GC CPU cost: 1.75%

Additionally, detailed information may be retrieved with -Xlog:gc=trace

[1.500s][trace][gc] Process CPU time: 4.945370s
[1.500s][trace][gc] GC CPU time: 0.086382s
[1.500s][info ][gc] GC CPU cost: 1.75%
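
As a sanity check, the percentage in the example above follows from the two trace values: 0.086382 s of GC CPU time out of 4.945370 s of process CPU time is roughly 1.75%. A minimal sketch of that arithmetic in standalone C++ is shown here; `gc_cpu_cost_line` is a hypothetical helper made up for illustration, not a HotSpot function.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of HotSpot): formats the GC share of the
// process CPU time in the same style as the example log line.
std::string gc_cpu_cost_line(double gc_cpu_seconds, double process_cpu_seconds) {
  // Guard against division by zero for a process that has used no CPU time.
  double percent = process_cpu_seconds > 0.0
                       ? 100.0 * gc_cpu_seconds / process_cpu_seconds
                       : 0.0;
  char buf[64];
  std::snprintf(buf, sizeof(buf), "GC CPU cost: %.2f%%", percent);
  return std::string(buf);
}
```

Calling `gc_cpu_cost_line(0.086382, 4.945370)` reproduces the message text of the info-level line.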

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8359110: Log accumulated GC and process CPU time upon VM exit (Enhancement - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25779/head:pull/25779
$ git checkout pull/25779

Update a local copy of the PR:
$ git checkout pull/25779
$ git pull https://git.openjdk.org/jdk.git pull/25779/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 25779

View PR using the GUI difftool:
$ git pr show -t 25779

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25779.diff

Using Webrev

Link to Webrev Comment

JonasNorlinder · 2025-06-12T11:28:44Z

/label add hotspot-gc hotspot-runtime

bridgekeeper · 2025-06-12T11:29:01Z

👋 Welcome back JonasNorlinder! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

openjdk · 2025-06-12T11:29:49Z

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

openjdk · 2025-06-12T11:30:20Z

@JonasNorlinder
The hotspot-gc label was successfully added.

The hotspot-runtime label was successfully added.

JonasNorlinder · 2025-06-12T12:28:59Z

This PR is ready for review

mlbridge · 2025-06-12T12:32:46Z

Webrevs

JonasNorlinder · 2025-06-12T13:30:58Z

Refactored code per @kstefanj suggestions

tschatzl · 2025-06-12T13:39:56Z

Fwiw, I would prefer to have one message containing all the information, and add the exit tag. This decreases clutter (timestamp and tags), and allows direct selection of that message.

src/hotspot/share/gc/g1/g1CollectedHeap.cpp

tschatzl · 2025-06-12T13:53:29Z

Fwiw, I would prefer to have one message containing all the information, and add the exit tag. This decreases clutter (timestamp and tags), and allows direct selection of that message.

Also reduces the amount of parsing needed in scripts etc. (I.e. three regexps vs. one). These three values are not really too much to digest for human readers.

Another problem seems to be the large amount of digits after the comma for the times; maybe use a different time scale (ms/us).
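
To illustrate the parsing argument, here is a rough sketch of how a script-side consumer could read a single combined message with one regular expression. The message layout (`GC CPU cost: 1.75% (GC: 86.382ms, process: 4945.370ms)`) is an assumption made up for this example, as are the names `GcCpuSummary` and `parse_gc_cpu_summary`; the PR does not define this format.

```cpp
#include <regex>
#include <string>

// Values carried by the hypothetical combined message.
struct GcCpuSummary {
  double percent;
  double gc_ms;
  double process_ms;
};

// Parses a line containing the assumed combined format, e.g.
//   "GC CPU cost: 1.75% (GC: 86.382ms, process: 4945.370ms)"
// Returns true and fills `out` on success, false if the line does not match.
bool parse_gc_cpu_summary(const std::string& line, GcCpuSummary& out) {
  static const std::regex pattern(
      R"re(GC CPU cost: ([0-9]+(?:\.[0-9]+)?)% )re"
      R"re(\(GC: ([0-9]+(?:\.[0-9]+)?)ms, process: ([0-9]+(?:\.[0-9]+)?)ms\))re");
  std::smatch m;
  if (!std::regex_search(line, m, pattern)) {
    return false;
  }
  out.percent = std::stod(m[1].str());
  out.gc_ms = std::stod(m[2].str());
  out.process_ms = std::stod(m[3].str());
  return true;
}
```

With three separate messages, the same consumer would need one pattern per line and would have to correlate the matches itself. Reporting milliseconds (86.382 ms instead of 0.086382 s) also keeps the number of digits after the decimal point small.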

src/hotspot/share/gc/shared/collectedHeap.cpp

JonasNorlinder · 2025-06-12T14:38:14Z

Fwiw, I would prefer to have one message containing all the information, and add the exit tag. This decreases clutter (timestamp and tags), and allows direct selection of that message.

Thank you for sharing your concern. I'm OK with putting the CPU times currently in trace into the exit tag, but I strongly believe we should keep the
[1.500s][info ][gc] GC CPU cost: 1.75% line as is. I discussed the exit tag option with @kstefanj and thought that hiding the nominal values at the trace level would suffice, but I can change that if we think it is preferable. I think that exposing this number at the -Xlog:gc level may be an important tool for users to understand that they may be running with too small a heap. We expose a lot of ways for users to understand the memory footprint, but not much about the CPU footprint. Putting this number behind exit may increase the risk that the typical user will not discover it. Adding one line at the end, while already logging information about each GC cycle, does not clutter the message log IMO.

Another option could be to not log the nominal values at all. If one has the percentage and measures process CPU time with e.g. perf, one could calculate them oneself anyway. What do you think about that?

Another problem seems to be the large amount of digits after the comma for the times; maybe use a different time scale (ms/us).

Thanks for pointing that out, I will fix that.

JonasNorlinder · 2025-06-12T15:28:25Z

FYI; I removed nominal logging

tschatzl

The string deduplication thread, which is to some degree also a GC helper thread, is not considered here. Not sure if it should, slightly tending to add it.

src/hotspot/share/gc/shared/collectedHeap.cpp

src/hotspot/share/runtime/vmThread.cpp

src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp

src/hotspot/share/gc/serial/serialHeap.cpp

tschatzl · 2025-06-13T10:04:18Z

src/hotspot/share/gc/z/zCollectedHeap.cpp

+    if (thread->is_ConcurrentGC_thread() ||
+        strstr(thread->name(), "ZWorker") != nullptr) {
+      Atomic::add(&_vtime, os::thread_cpu_time(thread));
+    }
+  }


Why does this exclude threads like the ZDirector and other ZGC background threads? That thread seems to clearly be relevant to ZGC operation, doing so would make the measurement incomplete.
The change does not exclude e.g. some random G1 "director" threads either, even if they do not contribute much to the result.

I believe it does not exclude ZDirector, etc. Adding printf("%s\n", thread->name()); to prove my point results in:

ZDirector ZDriverMajor ZDriverMinor ZStat ZUncommitter#0 ZWorkerYoung#0 ZWorkerYoung#1 ZWorkerYoung#2 ZWorkerYoung#3 ZWorkerOld#0 ZWorkerOld#1 ZWorkerOld#2 ZWorkerOld#3

This code works around the fact that ZCollectedHeap::gc_threads_do also calls _runtime_workers.threads_do, and I believe those workers do not participate in GC-related work.
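
The effect of the filter in the diff can be sketched outside HotSpot as follows. `FakeThread` is a simplified stand-in invented for this example, the `is_concurrent_gc_thread` flag is set by hand, and a runtime worker named `RuntimeWorker#0` is an assumed name used only to show a thread that is filtered out.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified stand-in for a HotSpot thread; only the properties the filter
// looks at are modelled.
struct FakeThread {
  const char* name;
  bool is_concurrent_gc_thread;  // assumed per thread for this sketch
  int64_t cpu_time_ns;           // stand-in for os::thread_cpu_time(thread)
};

// Mirrors the condition in the diff: concurrent GC threads always count,
// other threads count only when their name contains "ZWorker".
bool counts_towards_gc_cpu(const FakeThread& t) {
  return t.is_concurrent_gc_thread ||
         std::strstr(t.name, "ZWorker") != nullptr;
}

// Sums the CPU time of all threads accepted by the filter.
int64_t sum_gc_cpu_ns(const std::vector<FakeThread>& threads) {
  int64_t total = 0;
  for (const FakeThread& t : threads) {
    if (counts_towards_gc_cpu(t)) {
      total += t.cpu_time_ns;
    }
  }
  return total;
}
```

Under these assumptions, ZDirector passes through the first half of the condition, the ZWorkerYoung/ZWorkerOld threads pass through the name match, and a worker whose name does not contain "ZWorker" is skipped.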

Both Parallel and G1 reuse the same gc-worker threads in safepoint_workers() for non-gc work, just fyi.

FWIW; I confirmed with @stefank that _runtime_workers should not be accounted for in GC CPU time for ZGC.

But then runtime tasks performed by the GC worker threads, when they are used for runtime work, are counted differently for those GCs that do this sharing/shadowing.

I looked a bit at what they are doing: after JDK-8329488 they are only used by heap inspection and heap dumping, which seem to be solely GC-related tasks, so I kind of think they should be counted against GC.
At least make the accounting uniform across collectors.

So one option is duplicating these workers in G1/Parallel too, and fix https://bugs.openjdk.org/browse/JDK-8277394. Since we can't share GC workers and these "runtime workers" any more due to this change, the safepoint workers management should probably be moved to CollectedHeap, and they shouldn't be advertised as general purpose workers everyone can hook into.

Or just let ZGC's _runtime_workers also count towards GC time. After all both of these VM operations are VM_GC_Operations.

src/hotspot/share/gc/z/zCollectedHeap.cpp

src/hotspot/share/runtime/vmOperation.hpp

tschatzl · 2025-06-13T10:25:22Z

FYI; I removed nominal logging

Okay, these can be re-added if needed. I also see your point that this is just one message at VM exit, so we do not need the "exit" label. I would prefer if it had one, but I won't insist on it given that others do not mind either. It would fit the purpose of the exit label perfectly though.

The argument that "the user could forget specifying it", is somewhat weak imo - in that case one could argue why there are those labels, and I kind of doubt that GC cpu usage at the end only is that important to have for everyone every time.

I.e. if there is need to monitor it, only printing it at the end seems insufficient, as that kind of monitoring is continuous. It helps benchmarking though.

JonasNorlinder · 2025-06-13T10:45:15Z

The argument that "the user could forget specifying it", is somewhat weak imo - in that case one could argue why there are those labels, and I kind of doubt that GC cpu usage at the end only is that important to have for everyone every time.

I disagree; it is just as important as reporting pre- and post-compaction heap usage like we do now with -Xlog:gc. Users who are not experts in GC may underestimate the CPU cost of GC at a given max heap size. Even experts in academia tend to run with too small a heap. I maintain my position that adding it at the end is crucial.

JonasNorlinder · 2025-06-13T10:48:21Z

Additionally, if we want to we can also add capabilities to track it continuously with JFR and/or MXBeans. But that may introduce a performance penalty as sampling may not be free so I want to still keep logging it at the end as a base case.

tschatzl · 2025-06-13T11:18:48Z

The argument that "the user could forget specifying it", is somewhat weak imo - in that case one could argue why there are those labels, and I kind of doubt that GC cpu usage at the end only is that important to have for everyone every time.

I disagree; it is just as important as reporting pre- and post-compaction heap usage like we do now with -Xlog:gc. Users who are not experts in GC may underestimate the CPU cost of GC at a given max heap size. Even experts in academia tend to run with too small a heap. I maintain my position that adding it at the end is crucial.

I do not disagree about the usefulness of the message (it is - I even liked the nominal output), I only somewhat disagree about making a message purposefully printed at VM exit, to state the cpu usage at exit, not having the "exit" label.
(Regarding the nominal logging: my points were that I thought it was a waste to use three messages, and the presentation format)

We do not need to agree on everything 100%.

Additionally, if we want to we can also add capabilities to track it continuously with JFR and/or MXBeans. But that may introduce a performance penalty as sampling may not be free so I want to still keep logging it at the end as a base case.

That's fine.

Fwiw, there has even been interest from me (https://bugs.openjdk.org/browse/JDK-8349476) about regularly printing these statistics at even higher detail.

However as far as I understand, there are jstat/perf counters already for that, and they are in use (in industry). There is the jcmd command (even before that) that prints per-thread cpu usage for some time now - one could filter out the interesting threads manually...

JonasNorlinder · 2025-06-13T17:10:54Z

The string deduplication thread, which is to some degree also a GC helper thread, is not considered here. Not sure if it should, slightly tending to add it.

I can see arguments both ways, but do not have a strong opinion. Unless someone objects I will add it.

dholmes-ora

This seems reasonable though I'm unclear on some details. Could you give a high-level description of what times we capture for what threads and when, and the calculations involved. Thanks.

Some minor style nits.

src/hotspot/share/gc/shared/vtimeScope.hpp

src/hotspot/share/gc/shared/vtimeScope.inline.hpp

src/hotspot/share/gc/parallel/parallelScavengeHeap.cpp

src/hotspot/share/gc/serial/serialHeap.cpp

tschatzl · 2025-06-16T08:23:47Z

src/hotspot/share/gc/z/zCollectedHeap.cpp

+    if (thread->is_ConcurrentGC_thread() ||
+        strstr(thread->name(), "ZWorker") != nullptr) {
+      Atomic::add(&_vtime, os::thread_cpu_time(thread));
+    }
+  }


But then runtime tasks performed by the GC worker threads, when they are used for runtime work, are counted differently for those GCs that do this sharing/shadowing.

I looked a bit at what they are doing: after JDK-8329488 they are only used by heap inspection and heap dumping, which seem to be solely GC-related tasks, so I kind of think they should be counted against GC.
At least make the accounting uniform across collectors.

So one option is duplicating these workers in G1/Parallel too, and fix https://bugs.openjdk.org/browse/JDK-8277394. Since we can't share GC workers and these "runtime workers" any more due to this change, the safepoint workers management should probably be moved to CollectedHeap, and they shouldn't be advertised as general purpose workers everyone can hook into.

Or just let ZGC's _runtime_workers also count towards GC time. After all both of these VM operations are VM_GC_Operations.

src/hotspot/share/gc/shared/collectedHeap.hpp

src/hotspot/share/runtime/vmThread.cpp

JonasNorlinder · 2025-06-16T11:49:56Z

This seems reasonable though I'm unclear on some details. Could you give a
high-level description of what times we capture for what threads and when, and
the calculations involved. Thanks.

What threads:

- All GC threads, e.g. director thread, sample threads, worker threads, driver
  threads (except for ZGC, where I currently exclude the "runtime" threads, which
  we may decide to change; see @tschatzl's comments above)
- The VM thread, but only include VM operations related to GC
- The string deduplication thread

What times:

- CPU time for all the threads above
- CPU time for the entire process

When:

VM thread:
- If the VM operation is a GC operation and -Xlog:gc or more is enabled, sample start and end CPU time
  for the GC operation. Add end - start to gc_vm_vtime.
GC threads:
- If -Xlog:gc or more is enabled, sample all GC threads and store the CPU
  time sum to gc_threads_vtime during VM exit.
String deduplication thread:
- If string deduplication is enabled and -Xlog:gc or more is enabled,
  sample CPU time during VM exit for that thread and store it to string_deduplication_vtime.
Process:
- If -Xlog:gc or more is enabled, sample the CPU time for the entire process.
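
The VM thread part of the list above (sample at start, sample at end, add the difference) maps naturally onto a stack object. The following is a minimal standalone sketch of that pattern, not the code in vtimeScope.hpp: `GcVtimeScope`, `current_thread_cpu_ns` and the two globals are hypothetical names, and the CPU clock is faked so the behaviour is easy to follow.

```cpp
#include <cstdint>

// Hypothetical stand-in for a per-thread CPU clock such as
// os::thread_cpu_time(); a plain variable so the example is deterministic.
static int64_t fake_thread_cpu_ns = 0;
int64_t current_thread_cpu_ns() { return fake_thread_cpu_ns; }

// Accumulated CPU time of GC-related VM operations (gc_vm_vtime in the text).
static int64_t gc_vm_vtime_ns = 0;

// Stack object: samples the CPU clock on construction and adds end - start
// to the accumulator on destruction, but only for GC operations.
class GcVtimeScope {
 public:
  explicit GcVtimeScope(bool is_gc_operation)
      : _active(is_gc_operation),
        _start_ns(is_gc_operation ? current_thread_cpu_ns() : 0) {}

  ~GcVtimeScope() {
    if (_active) {
      gc_vm_vtime_ns += current_thread_cpu_ns() - _start_ns;
    }
  }

  GcVtimeScope(const GcVtimeScope&) = delete;
  GcVtimeScope& operator=(const GcVtimeScope&) = delete;

 private:
  bool _active;
  int64_t _start_ns;
};
```

Placing such an object at the top of the function that evaluates a VM operation means every exit path contributes its elapsed CPU time without explicit bookkeeping.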

Log accumulated CPU time gc_threads_vtime + gc_vm_vtime + string_deduplication_vtime
as a percentage of the total process CPU time. We sample the total process CPU
time with the new method os::elapsed_process_vtime.

It should be noted from the above that calling elapsed_gc_vtime assumes that
-Xlog:gc or more is enabled. If one breaks this assumption, the GC CPU time will
not include the VM thread and may be horribly wrong (e.g. if one runs Serial).
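
The calculation described above can be sketched as follows. The field names follow the variables named in the text, while the `GcCpuSample` struct, the `_ns` suffix and both function names are assumptions made for this standalone example.

```cpp
#include <cstdint>

// One set of samples taken at VM exit, in nanoseconds (unit is an assumption).
struct GcCpuSample {
  int64_t gc_threads_vtime_ns;            // sum over all GC threads
  int64_t gc_vm_vtime_ns;                 // VM thread, GC operations only
  int64_t string_deduplication_vtime_ns;  // 0 if string deduplication is off
  int64_t process_vtime_ns;               // whole process
};

// gc_threads_vtime + gc_vm_vtime + string_deduplication_vtime
int64_t elapsed_gc_vtime_ns(const GcCpuSample& s) {
  return s.gc_threads_vtime_ns + s.gc_vm_vtime_ns +
         s.string_deduplication_vtime_ns;
}

// GC CPU time as a percentage of the total process CPU time.
double gc_cpu_percent(const GcCpuSample& s) {
  if (s.process_vtime_ns <= 0) {
    return 0.0;  // nothing measured; avoid dividing by zero
  }
  return 100.0 * static_cast<double>(elapsed_gc_vtime_ns(s)) /
         static_cast<double>(s.process_vtime_ns);
}
```

For example, a made-up split of 80,000,000 ns (GC threads), 6,000,000 ns (VM thread) and 382,000 ns (string deduplication) against 4,945,370,000 ns of process CPU time gives a little under 1.75%. If the VM thread component were missing, as in the caveat above, a collector that does most of its work in the VM thread would report far too low a value.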

JonasNorlinder · 2025-06-16T21:13:07Z

After all both of these VM operations are VM_GC_Operations.

@tschatzl Thanks for pointing that out, that's an excellent point!

Your comment made me realize that I currently treat every operation that inherits from VM_GC_Sync_Operation as a GC operation. However, the following should probably not be strictly counted as GC activity: VM_GC_HeapInspection, VM_PopulateDynamicDumpSharedSpace, VM_Verify, VM_PopulateDumpSharedSpace. I will update the PR.
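
The classification proposed in this comment could look roughly like the sketch below. It only illustrates the idea and is not the PR's implementation: `FakeVMOperation` and the name-based lookup are assumptions made for the example, and only the four excluded operation names come from the comment.

```cpp
#include <cstring>

// Operations the comment proposes not to count as GC activity even though
// they inherit from VM_GC_Sync_Operation.
static const char* const kNotCountedAsGc[] = {
    "VM_GC_HeapInspection", "VM_PopulateDynamicDumpSharedSpace",
    "VM_Verify", "VM_PopulateDumpSharedSpace"};

// Simplified stand-in for a VM operation (hypothetical type).
struct FakeVMOperation {
  const char* name;
  bool inherits_gc_sync_operation;
};

// True if the operation's CPU time should be added to the GC total.
bool is_gc_operation(const FakeVMOperation& op) {
  if (!op.inherits_gc_sync_operation) {
    return false;
  }
  for (const char* excluded : kNotCountedAsGc) {
    if (std::strcmp(op.name, excluded) == 0) {
      return false;
    }
  }
  return true;
}
```

A real implementation would more likely dispatch on the operation type than compare strings; the string comparison here just keeps the sketch self-contained.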

So one option is duplicating these workers in G1/Parallel too, and fix https://bugs.openjdk.org/browse/JDK-8277394.

Thanks for the suggestion, I agree that we should do that to fix the root issue.

dholmes-ora · 2025-06-17T05:54:24Z

What threads: ...

Thanks for that. Once the details have stabilized I suggest adding this summary of operation to the JBS issue as well.

I will let GC folk do the formal reviews/approvals here, but this looks quite reasonable to me.

dholmes-ora · 2025-06-17T05:56:46Z

src/hotspot/share/gc/shared/vtimeScope.inline.hpp

+#include "gc/shared/collectedHeap.inline.hpp"
+#include "logging/log.hpp"
+#include "memory/universe.hpp"
+#include "runtime/vmThread.hpp"


Nit: if you put this in the .hpp file you can do away with the forward decl for VMThread that is in there.

openjdk · 2025-06-23T13:08:54Z

@JonasNorlinder Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

JonasNorlinder · 2025-06-23T13:20:34Z

Changes:

Fixes after feedback from @stefank
Re-based on master so that the integration of JDK-8360024 is included (apologies for the force-push)
Removed conditional sampling per @tschatzl request

While CPU time may be slightly incorrect if heap dumping or heap inspection occurs, the risk that the number would be skewed in a significant way is small, and we have a plan for how to mitigate that going forward. I therefore suggest we go ahead and integrate this now.

albertnetymk · 2025-06-23T14:07:56Z

src/hotspot/share/gc/shared/collectedHeap.hpp

@@ -240,12 +242,15 @@ class CollectedHeap : public CHeapObj<mtGC> {
  virtual void post_initialize();

  // Stop any onging concurrent work and prepare for exit.
-  virtual void stop() {}
+  virtual void stop();


I find it a bit odd to call log_gc_vtime inside this method, given the comment and its name. I wonder if print_tracing_info can be used instead, which is invoked before exit as well.

I think it makes sense. We want to log the total GC CPU time before exiting. The latest point we can do that is right before we terminate threads, which we do when we call ConcurrentGCThread::stop. I am open to any suggestions for renaming log_gc_vtime in case that would help.

openjdk bot added the hotspot-gc and hotspot-runtime labels Jun 12, 2025

JonasNorlinder changed the title ~~8359110: Log GC CPU cost upon VM exit~~ 8359110: Log accumulated GC and process CPU time upon VM exit Jun 12, 2025

JonasNorlinder marked this pull request as ready for review June 12, 2025 12:28

openjdk bot added the rfr Pull request is ready for review label Jun 12, 2025

tschatzl suggested changes Jun 12, 2025

src/hotspot/share/gc/g1/g1CollectedHeap.cpp

tschatzl reviewed Jun 12, 2025

src/hotspot/share/gc/shared/collectedHeap.cpp

JonasNorlinder requested a review from tschatzl June 12, 2025 15:21

tschatzl suggested changes Jun 13, 2025

dholmes-ora reviewed Jun 16, 2025

src/hotspot/share/gc/shared/vtimeScope.hpp

src/hotspot/share/gc/shared/vtimeScope.inline.hpp

src/hotspot/share/gc/shared/vtimeScope.inline.hpp

tschatzl suggested changes Jun 16, 2025

JonasNorlinder commented Jun 16, 2025

src/hotspot/share/runtime/vmThread.cpp

dholmes-ora reviewed Jun 17, 2025

JonasNorlinder and others added 16 commits June 23, 2025 14:55

Log GC vtime on VM exit

9ebd28a

Add elapsed_process_vtime

b600901

Refactor vtime logic in evaluate_operation into a stack object and call CPUTimeCounters without indirection

3f91b87

Remove unused bool

aafb91a

Remove unnecessary assert

13cdd8b

Refactor shared logic into CollectedHeap, remove nominal logging and cost->usage

2a35786

Add bug fix after refactor and fixes for review

77e3be0

Replace calls to log_gc_vtime with super-class method calls

b275c9e

Add CPU time tracking for string deduplication to log_gc_vtime

81db6d8

Remove explicit super call and minor fixes

21f2844

Only sample if needed

6e56027

operation_is_gc -> is_gc_operation per @stefank suggestion

a47ca98

Remove extra whitespace

571b00d

Fixes after feedback from @stefank

1c65416

More clean up, remove virtual

3095e70

Remove incorrect is_gc_operation call after rebase

553edc4

JonasNorlinder force-pushed the gc_cpu_time branch from b20c974 to 553edc4 Compare June 23, 2025 13:08

JonasNorlinder requested a review from tschatzl June 23, 2025 13:20

albertnetymk reviewed Jun 23, 2025
