-
Notifications
You must be signed in to change notification settings - Fork 6.1k
8359110: Log accumulated GC and process CPU time upon VM exit #25779
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
/label add hotspot-gc hotspot-runtime |
👋 Welcome back JonasNorlinder! A progress list of the required criteria for merging this PR into |
❗ This change is not yet ready to be integrated. |
@JonasNorlinder The |
This PR is ready for review |
Webrevs
|
Refactored code per @kstefanj suggestions |
Fwiw, I would prefer to have one message containing all the information, and add the |
Also reduces the amount of parsing needed in scripts etc. (I.e. three regexps vs. one). These three values are not really too much to digest for human readers. Another problem seems to be the large amount of digits after the comma for the times; maybe use a different time scale (ms/us). |
Thank you for sharing your concern. I'm OK with putting the CPU times currently in trace into the Another option could be to not log the nominal values at all. If one have the percentage and measure process CPU time with e.g. perf one could calculate it yourself anyways. What do you think about that?
Thanks for pointing that out, I will fix that. |
FYI; I removed nominal logging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The string deduplication thread, which is to some degree also a GC helper thread, is not considered here. Not sure if it should, slightly tending to add it.
if (thread->is_ConcurrentGC_thread() || | ||
strstr(thread->name(), "ZWorker") != nullptr) { | ||
Atomic::add(&_vtime, os::thread_cpu_time(thread)); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this exclude threads like the ZDirector
and other ZGC background threads? That thread seems to clearly be relevant to ZGC operation, doing so would make the measurement incomplete.
The change does not exclude e.g. some random G1 "director" threads either, even if they do not contribute much to the result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it does not exclude ZDirector, etc. Adding printf("%s\n", thread->name());
to prove my point results in:
ZDirector
ZDriverMajor
ZDriverMinor
ZStat
ZUncommitter#0
ZWorkerYoung#0
ZWorkerYoung#1
ZWorkerYoung#2
ZWorkerYoung#3
ZWorkerOld#0
ZWorkerOld#1
ZWorkerOld#2
ZWorkerOld#3
This code is working around the fact that ZCollectedHeap::gc_threads_do
also calls _runtime_workers.threads_do
which I believe do not participate in GC related work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both Parallel and G1 reuse the same gc-worker threads in safepoint_workers()
for non-gc work, just fyi.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW; I confirmed with @stefank that _runtime_workers
shoud not be accounted for GC CPU time for ZGC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then runtime tasks performed by the GC worker threads when using them for runtime work is differently counted towards those GCs that do this sharing/shadowing.
I looked a bit what they are doing, after JDK-8329488 they are only used by heap inspection and heap dumping. Which seems to be solely GC related task, so I kind of think they should be counted against GC.
At least make the accounting uniform across collectors.
So one option is duplicating these workers in G1/Parallel too, and fix https://bugs.openjdk.org/browse/JDK-8277394. Since we can't share GC workers and these "runtime workers" any more due to this change, the safepoint workers management should probably be moved to CollectedHeap
, and they shouldn't be advertised as general purpose workers everyone can hook into.
Or just let ZGC's _runtime_threads
also count towards GC time. After all both of these VM operations are VM_GC_Operations
.
Okay, these can be re-added if needed. I also see your point about that this is just one message at VM exit, so we do not need the "exit" label. I would prefer if it had, so I won't insist on it given that others do not mind either. It would fit the purpose of the The argument that "the user could forget specifying it", is somewhat weak imo - in that case one could argue why there are those labels, and I kind of doubt that GC cpu usage at the end only is that important to have for everyone every time. I.e. if there is need to monitor it, only printing it at the end seems insufficient, as that kind of monitoring is continuous. It helps benchmarking though. |
I disagree, it is equally important as reporting pre and post compaction heap usage like we do now with -Xlog:gc. Users who are not experts in GC may underestimate the CPU cost of GC at a given heap max. Even experts in academia tend to run with too small heap. I maintain my position that adding it at the end is crucial. |
Additionally, if we want to we can also add capabilities to track it continuously with JFR and/or MXBeans. But that may introduce a performance penalty as sampling may not be free so I want to still keep logging it at the end as a base case. |
I do not disagree about the usefulness of the message (it is - I even liked the nominal output), I only somewhat disagree about making a message purposefully printed at VM exit, to state the cpu usage at exit, not having the "exit" label. We do not need to agree on everything 100%.
That's fine. Fwiw, there has even been interest from me (https://bugs.openjdk.org/browse/JDK-8349476) about regularly printing these statistics at even higher detail. However as far as I understand, there are jstat/perf counters already for that, and they are in use (in industry). There is the jcmd command (even before that) that prints per-thread cpu usage for some time now - one could filter out the interesting threads manually... |
I can see arguments for both ways, but do not have a strong opinion. Unless someone object I will add it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems reasonable though I'm unclear on some details. Could you give a high-level description of what times we capture for what threads and when, and the calculations involved. Thanks.
Some minor style nits.
if (thread->is_ConcurrentGC_thread() || | ||
strstr(thread->name(), "ZWorker") != nullptr) { | ||
Atomic::add(&_vtime, os::thread_cpu_time(thread)); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But then runtime tasks performed by the GC worker threads when using them for runtime work is differently counted towards those GCs that do this sharing/shadowing.
I looked a bit what they are doing, after JDK-8329488 they are only used by heap inspection and heap dumping. Which seems to be solely GC related task, so I kind of think they should be counted against GC.
At least make the accounting uniform across collectors.
So one option is duplicating these workers in G1/Parallel too, and fix https://bugs.openjdk.org/browse/JDK-8277394. Since we can't share GC workers and these "runtime workers" any more due to this change, the safepoint workers management should probably be moved to CollectedHeap
, and they shouldn't be advertised as general purpose workers everyone can hook into.
Or just let ZGC's _runtime_threads
also count towards GC time. After all both of these VM operations are VM_GC_Operations
.
What threads:
What times:
When:
Log accumulated CPU time It should be noted from the above that calling elapsed_gc_vtime assumes that |
@tschatzl Thanks for pointing that out, that's an excellent point! Your comment caused me to reflect over that I have currently defined all operations as GC operations if they inherit from
Thanks for the suggestion, I agree that we should do that to fix the root issue. |
Thanks for that. Once the details have stabilized I suggest adding this summary of operation to the JBS issue as well. I will let GC folk do the formal reviews/approvals here, but this looks quite reasonable to me. |
#include "gc/shared/collectedHeap.inline.hpp" | ||
#include "logging/log.hpp" | ||
#include "memory/universe.hpp" | ||
#include "runtime/vmThread.hpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: if you put this in the .hpp file you can do away with the forward decl for VMThread that is in there.
…ll CPUTimeCounters without indirection
b20c974
to
553edc4
Compare
@JonasNorlinder Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information. |
Changes:
While CPU time may be slightly incorrect if heap dumping or heap inspection occurs, the risk is small that the number would be skewed in a significant way and we have a plan on how to mitigate with that going forward, therefore I suggest we can go ahead and integrate this now. |
@@ -240,12 +242,15 @@ class CollectedHeap : public CHeapObj<mtGC> { | |||
virtual void post_initialize(); | |||
|
|||
// Stop any onging concurrent work and prepare for exit. | |||
virtual void stop() {} | |||
virtual void stop(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it a bit odd to call log_gc_vtime
inside this method, given the comment and its name. I wonder if print_tracing_info
can be used instead, which is invoked before exit as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense. We want to log the total GC CPU time before exiting. The latest point we can do that is right before we terminate threads, which we do in when we call ConcurrentGCThread::stop
. I am open for any suggestions to rename log_gc_vtime
in case that would help.
Add support to log CPU cost for GC during VM exit with
-Xlog:gc
.Additionally, detailed information may be retrieved with
-Xlog:gc=trace
Progress
Issue
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/25779/head:pull/25779
$ git checkout pull/25779
Update a local copy of the PR:
$ git checkout pull/25779
$ git pull https://git.openjdk.org/jdk.git pull/25779/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 25779
View PR using the GUI difftool:
$ git pr show -t 25779
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/25779.diff
Using Webrev
Link to Webrev Comment