JDK24 JVMTI serviceability GetThreadState thrstat01 #21408

Closed
babsingh opened this issue Mar 19, 2025 · 6 comments

@babsingh
Contributor

babsingh commented Mar 19, 2025

https://openj9-jenkins.osuosl.org/job/Test_openjdk24_j9_extended.openjdk_aarch64_mac_Personal/1

TEST: serviceability/jvmti/thread/GetThreadState/thrstat01/thrstat01.java

-XX:+YieldPinnedVirtualThreads (JEP491) is needed to reproduce the below failure.
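
For readers of the log below: the test prints the raw JVMTI GetThreadState value in parentheses, for example `ALIVE RUNNABLE (5)`. That value is a bitmask. A minimal sketch of decoding it, using the bit values from the JVMTI specification (the class and method names here are only illustrative and are not part of the test):

```java
import java.util.ArrayList;
import java.util.List;

public class JvmtiThreadStateDecoder {
    // JVMTI_THREAD_STATE_* bit values as defined in the JVMTI specification.
    private static final int[] BITS = {
        0x0001, 0x0002, 0x0004, 0x0400, 0x0080, 0x0010, 0x0020,
        0x0040, 0x0100, 0x0200, 0x100000, 0x200000, 0x400000
    };
    private static final String[] NAMES = {
        "ALIVE", "TERMINATED", "RUNNABLE", "BLOCKED_ON_MONITOR_ENTER", "WAITING",
        "WAITING_INDEFINITELY", "WAITING_WITH_TIMEOUT",
        "SLEEPING", "IN_OBJECT_WAIT", "PARKED", "SUSPENDED", "INTERRUPTED", "IN_NATIVE"
    };

    // Returns the names of all state bits that are set in the given mask.
    public static List<String> decode(int state) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < BITS.length; i++) {
            if ((state & BITS[i]) != 0) {
                names.add(NAMES[i]);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(decode(5));     // the value seen in the log
        System.out.println(decode(0x401)); // a thread blocked on monitor enter
    }
}
```

So `5` is `ALIVE | RUNNABLE` (0x1 | 0x4), which matches the textual form the agent prints.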

17:37:41  STDOUT:
17:37:41  Agent_OnLoad started
17:37:41  Agent_OnLoad finished
17:37:41  >>> ThreadStart: "main"
17:37:41  >>> ThreadStart: "Finalizer thread"
17:37:41  >>> ThreadStart: "Attach API initializer"
17:37:41  >>> ThreadStart: "Attach API wait loop"
17:37:41  >>> ThreadStart: "MainThread"
17:37:41  >>> ThreadStart: "VirtualThread-unblocker"
17:37:41  >>> ThreadStart: "ForkJoinPool-1-worker-1"
17:37:41  >>> ThreadStart: "tested_thread_thr1"
17:37:41  >>> ThreadStart: "tested_thread_thr1", 0x0x13d042788
17:37:41  native method checkStatus started
17:37:41  Testing thread: "tested_thread_thr1"
17:37:41  >>> thread "tested_thread_thr1" (0x0x13d042788) state:  ALIVE RUNNABLE (5)
17:37:41  >>> thread "tested_thread_thr1" (0x0x13d042788) state:  ALIVE RUNNABLE (5)
17:37:41  native method checkStatus finished
17:37:41  native method checkStatus started
17:37:41  Testing thread: "tested_thread_thr1"
17:37:41  >>> thread "tested_thread_thr1" (0x0x13d042788) state:  ALIVE RUNNABLE (5)
17:37:41  STDERR:
17:37:41  Unhandled exception
17:37:41  Type=Segmentation error vmState=0x00000000
17:37:41  J9Generic_Signal_Number=00000018 Signal_Number=0000000b Error_Value=00000000 Signal_Code=00000002
17:37:41  Handler1=0000000100394F08 Handler2=00000001006C22DC InaccessibleAddress=0000000000000000
17:37:41  x0=000000013B04D100 x1=0000000293781FA8 x2=0000000000000000 x3=000000013C833741
17:37:41  x4=0000000293781ED0 x5=0000000000000000 x6=000000000000006D x7=0000000000000000
17:37:41  x8=0000000000000000 x9=0000000000000120 x10=0000000000000048 x11=0000000293781F10
17:37:41  x12=0000000000000000 x13=0000000000000000 x14=0000000293782D00 x15=0000000000000000
17:37:41  x16=000000019D046940 x17=000000020B64DAB0 x18=00000001004D25B4 x19=0000000293781FA8
17:37:41  x20=000000013B04D100 x21=0000000000000002 x22=000000011A8478A0 x23=0000000000000001
17:37:41  x24=0000000000000000 x25=000000011A822470 x26=0000000167D47A70 x27=000000011A8478B8
17:37:41  x28=0000000000000001 x29(FP)=0000000293781E90 x30(LR)=00000001003CDF10 x31(SP)=0000000293781E40
17:37:41  PC=00000001003F2AA8 SP=0000000293781E40
17:37:41  v0=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v1=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v2=00000000a337a02e (f: 2738331648.000000, d: 1.352916e-314)
17:37:41  v3=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v4=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v5=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v6=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v7=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v8=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v9=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v10=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v11=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v12=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v13=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v14=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v15=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v16=bfd0000000000000 (f: 0.000000, d: -2.500000e-01)
17:37:41  v17=3fd5406292977555 (f: 2459399424.000000, d: 3.320548e-01)
17:37:41  v18=3f74e518054b74bc (f: 88831168.000000, d: 5.101293e-03)
17:37:41  v19=3fe62e42fefa39ef (f: 4277811712.000000, d: 6.931472e-01)
17:37:41  v20=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v21=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v22=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v23=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v24=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v25=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v26=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v27=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v28=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v29=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v30=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  v31=0000000000000000 (f: 0.000000, d: 0.000000e+00)
17:37:41  Module=/Users/jenkins/workspace/Test_openjdk24_j9_extended.openjdk_aarch64_mac_Personal/jdkbinary/j2sdk-image/lib/default/libj9vm29.dylib
17:37:41  Module_base_address=0000000100370000 Symbol=walkFrameMonitorEnterRecords
17:37:41  Symbol_address=00000001003F29B4
17:37:41  Target=2_90_20250319_50 (Mac OS X 11.7.1)
17:37:41  CPU=aarch64 (8 logical CPUs) (0x400000000 RAM)
17:37:41  ----------- Stack Backtrace -----------
17:37:41  ---------------------------------------
17:37:41  JVMDUMP039I Processing dump event "gpf", detail "" at 2025/03/19 08:37:16 - please wait.
17:37:41  JVMDUMP032I JVM requested System dump using '/Users/jenkins/workspace/Test_openjdk24_j9_extended.openjdk_aarch64_mac_Personal/aqa-tests/TKG/output_17423328528053/serviceability_jvmti_j9_1/work/scratch/1/core.20250319.083716.31272.0001.dmp' in response to an event
17:37:41  JVMDUMP010I System dump written to /Users/jenkins/workspace/Test_openjdk24_j9_extended.openjdk_aarch64_mac_Personal/aqa-tests/TKG/output_17423328528053/serviceability_jvmti_j9_1/work/scratch/1/core.20250319.083716.31272.0001.dmp
17:37:41  JVMDUMP032I JVM requested Java dump using '/Users/jenkins/workspace/Test_openjdk24_j9_extended.openjdk_aarch64_mac_Personal/aqa-tests/TKG/output_17423328528053/serviceability_jvmti_j9_1/work/scratch/1/javacore.20250319.083716.31272.0002.txt' in response to an event
17:37:41  JVMDUMP010I Java dump written to /Users/jenkins/workspace/Test_openjdk24_j9_extended.openjdk_aarch64_mac_Personal/aqa-tests/TKG/output_17423328528053/serviceability_jvmti_j9_1/work/scratch/1/javacore.20250319.083716.31272.0002.txt
17:37:41  JVMDUMP032I JVM requested Snap dump using '/Users/jenkins/workspace/Test_openjdk24_j9_extended.openjdk_aarch64_mac_Personal/aqa-tests/TKG/output_17423328528053/serviceability_jvmti_j9_1/work/scratch/1/Snap.20250319.083716.31272.0003.trc' in response to an event
17:37:41  JVMDUMP010I Snap dump written to /Users/jenkins/workspace/Test_openjdk24_j9_extended.openjdk_aarch64_mac_Personal/aqa-tests/TKG/output_17423328528053/serviceability_jvmti_j9_1/work/scratch/1/Snap.20250319.083716.31272.0003.trc
17:37:41  JVMDUMP032I JVM requested JIT dump using '/Users/jenkins/workspace/Test_openjdk24_j9_extended.openjdk_aarch64_mac_Personal/aqa-tests/TKG/output_17423328528053/serviceability_jvmti_j9_1/work/scratch/1/jitdump.20250319.083716.31272.0004.dmp' in response to an event
17:37:41  JVMDUMP051I JIT dump occurred in 'ForkJoinPool-1-worker-1' thread 0x000000013B04D100
17:37:41  JVMDUMP013I Processed dump event "gpf", detail "".
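
Since the stack backtrace section is empty, the register dump is the main lead. A small sketch of the arithmetic for turning the PC and LR values above into offsets that can be looked up in a disassembly of libj9vm29.dylib (the class name is illustrative; the hex values are copied from the log above):

```java
public class CrashOffset {
    // Parses a hex string without a 0x prefix, as printed in the crash dump.
    static long parseHex(String hex) {
        return Long.parseUnsignedLong(hex, 16);
    }

    // Distance of an address from a base address (symbol start or module base).
    public static long offset(String address, String base) {
        return parseHex(address) - parseHex(base);
    }

    public static void main(String[] args) {
        String pc = "00000001003F2AA8";            // PC=
        String lr = "00000001003CDF10";            // x30(LR)=
        String symbolAddress = "00000001003F29B4"; // Symbol_address=
        String moduleBase = "0000000100370000";    // Module_base_address=

        System.out.printf("PC = walkFrameMonitorEnterRecords+0x%X%n", offset(pc, symbolAddress));
        System.out.printf("PC = libj9vm29.dylib+0x%X%n", offset(pc, moduleBase));
        System.out.printf("LR = libj9vm29.dylib+0x%X%n", offset(lr, moduleBase));
    }
}
```

This puts the faulting instruction at walkFrameMonitorEnterRecords+0xF4 (module offset 0x82AA8) and the return address at module offset 0x5DF10. Signal_Number=0000000b is SIGSEGV, and InaccessibleAddress=0000000000000000 is consistent with a null pointer dereference, although the dump alone does not show which pointer was null.
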
@babsingh babsingh added this to the Java 24 (0.50) milestone Mar 19, 2025
@babsingh babsingh changed the title JDK24 serviceability GetThreadState thrstat01 JDK24 JVMTI serviceability GetThreadState thrstat01 Mar 19, 2025
@tajila
Contributor

tajila commented Mar 20, 2025

@theresa-m Please take a look at this

@theresa-m
Contributor

theresa-m commented Mar 21, 2025

This is possibly a JIT crash. I haven't been able to reproduce this failure; each time I run the test it times out instead, even with an increased test timeout. I'm working on getting a dump of the timed-out run while it is still running, since it may be stuck somewhere.
edit: the hang is at the same place in the Java code as this crash. My core dump did not yield any further insights from what I could see:

at thrstat01.meth(thrstat01.java:119)

With the core file from the original jenkins job using lldb I was able to see a short backtrace from the crashing thread:

* thread #5, stop reason = ESR_EC_DABORT_EL0 (fault address: 0x16e2e7ff8)
  * frame #0: 0x000000019d046a1c libsystem_platform.dylib`_platform_memmove + 220
    frame #1: 0x000000019cf0857c libsystem_c.dylib`__sfvwrite + 324
    frame #2: 0x0000000104df6598 libj9jit29.dylib`OMR::ValuePropagation::propagateConstraint(TR::Node*, int, OMR::ValuePropagation::Relationship*, OMR::ValuePropagation::Relationship*, TR_HedgeTree<OMR::ValuePropagation::ValueConstraint>*) + 212

There doesn't seem to be a clear mapping between the lldb output and the javacore, but I can see that the crashing thread is at least not a VM thread. In the lldb backtrace information:

thread 17 is "main"
thread 29 is "MainThread"
thread 30 is "tested_thread_thr1"
thread 31 is where the system dump is processed.

Here is the full output from lldb: lldb_output.txt

@babsingh
Contributor Author

babsingh commented Mar 25, 2025

In jdmpview, you can inspect what the virtual threads are doing using the commands below. This will help us understand the context in which the virtual threads are crashing.

  • Use !vthreads to list all virtual threads.
  • For unmounted virtual threads, use !continuationstack 0x... to view the virtual thread’s stack.
  • For mounted virtual threads:
    • !continuationstack 0x... will show the stack of the J9VMThread on which the virtual thread is mounted.
    • !stackslots J9VMThread will show the virtual thread’s actual stack.
  • If J9VMThread->threadObject and J9VMThread->carrierThreadObject differ, the virtual thread is mounted on a carrier thread.
  • You may also want to inspect the VirtualThread.state field: !j9object 0x000000070037D688 | grep state
> !vthreads
!continuationstack 0x00007f5294064ac0 !j9vmcontinuation 0x00007f5294064ac0 !j9object 0x000000070037D750 (Continuation) !j9object 0x000000070037D688 (VThread) -
...

@theresa-m
Contributor

There are no vthreads listed at the point of the crash. The ForkJoinPool-1-worker-1 thread stack cannot be read properly.

> !vthreads
> !threads
	!stack 0x13d051900	!j9vmthread 0x13d051900	!j9thread 0x13b00c850	tid 0x3a7baa0d (981182989) // (main)
	!stack 0x13d068500	!j9vmthread 0x13d068500	!j9thread 0x13b00d260	tid 0x3a7bb342 (981185346) // (JIT Compilation Thread-000)
	!stack 0x13d073b00	!j9vmthread 0x13d073b00	!j9thread 0x119008250	tid 0x3a7bb344 (981185348) // (JIT Compilation Thread-001 Suspended)
	!stack 0x13a80cb00	!j9vmthread 0x13a80cb00	!j9thread 0x119008758	tid 0x3a7bb345 (981185349) // (JIT Compilation Thread-002 Suspended)
	!stack 0x13a814900	!j9vmthread 0x13a814900	!j9thread 0x119008c60	tid 0x3a7bb347 (981185351) // (JIT Compilation Thread-003 Suspended)
	!stack 0x11a80af00	!j9vmthread 0x11a80af00	!j9thread 0x119009450	tid 0x3a7bb348 (981185352) // (JIT Compilation Thread-004 Suspended)
	!stack 0x11a80f900	!j9vmthread 0x11a80f900	!j9thread 0x119009958	tid 0x3a7bb34a (981185354) // (JIT Compilation Thread-005 Suspended)
	!stack 0x11900c100	!j9vmthread 0x11900c100	!j9thread 0x119009e60	tid 0x3a7bb34c (981185356) // (JIT Compilation Thread-006 Suspended)
	!stack 0x13c025700	!j9vmthread 0x13c025700	!j9thread 0x119012850	tid 0x3a7bb34d (981185357) // (JIT Diagnostic Compilation Thread-007 Suspended)
	!stack 0x13a825300	!j9vmthread 0x13a825300	!j9thread 0x119012d58	tid 0x3a7bb8d2 (981186770) // (JIT-SamplerThread)
	!stack 0x13a82d500	!j9vmthread 0x13a82d500	!j9thread 0x119013260	tid 0x3a7bb8d4 (981186772) // (IProfiler)
	!stack 0x12a813f00	!j9vmthread 0x12a813f00	!j9thread 0x12a80f250	tid 0x3a7bbb1c (981187356) // (Common-Cleaner)
	!stack 0x11a824500	!j9vmthread 0x11a824500	!j9thread 0x12a80fc60	tid 0x3a7bbb87 (981187463) // (Concurrent Mark Helper)
	!stack 0x12a842500	!j9vmthread 0x12a842500	!j9thread 0x12a83f850	tid 0x3a7bbb8a (981187466) // (GC Worker)
	!stack 0x12a84bf00	!j9vmthread 0x12a84bf00	!j9thread 0x12a83fd58	tid 0x3a7bbb8b (981187467) // (GC Worker)
	!stack 0x13c809500	!j9vmthread 0x13c809500	!j9thread 0x12a840260	tid 0x3a7bbb8d (981187469) // (GC Worker)
	!stack 0x11a82e300	!j9vmthread 0x11a82e300	!j9thread 0x12a852c50	tid 0x3a7bbb8f (981187471) // (GC Worker)
	!stack 0x13c817900	!j9vmthread 0x13c817900	!j9thread 0x12a853158	tid 0x3a7bbb92 (981187474) // (GC Worker)
	!stack 0x13c03ed00	!j9vmthread 0x13c03ed00	!j9thread 0x12a853660	tid 0x3a7bbb94 (981187476) // (GC Worker)
	!stack 0x11a836700	!j9vmthread 0x11a836700	!j9thread 0x13c043e50	tid 0x3a7bbb96 (981187478) // (GC Worker)
	!stack 0x11a816100	!j9vmthread 0x11a816100	!j9thread 0x13c044358	tid 0x3a7bbbf6 (981187574) // (Finalizer thread)
	!stack 0x13c045100	!j9vmthread 0x13c045100	!j9thread 0x13c047450	tid 0x3a7bbc90 (981187728) // (Attach API wait loop)
	!stack 0x13d0d3300	!j9vmthread 0x13d0d3300	!j9thread 0x13c044860	tid 0x3a7bbd9d (981187997) // (MainThread)
	!stack 0x13b02fb00	!j9vmthread 0x13b02fb00	!j9thread 0x13c047958	tid 0x3a7bbeaf (981188271) // (VirtualThread-unblocker)
	!stack 0x13b04d100	!j9vmthread 0x13b04d100	!j9thread 0x13c047e60	tid 0x3a7bbeb3 (981188275) // (ForkJoinPool-1-worker-1)
> !stack 0x13b04d100
<13b04d100> 	                        Generic special frame
<13b04d100> 	!j9method 0x0000000130F42358   TestedThreadThr1.run()V
<13b04d100> Aborting walk due to unknown special frame type

@babsingh
Contributor Author

A lot of code changes went in last week. The original issue might have been resolved by #21459. Please try running the test with the latest OpenJ9 changes to see if the original failure still occurs.

The timeouts were happening because waiting threads were not being notified. Several updates also went in last week to address these timeout issues. If the timeouts persist, you can follow @fengxue-IS's guidance from #20705 (comment) to investigate this further.

@theresa-m
Contributor

theresa-m commented Mar 31, 2025

Thanks. I reran this test and am now seeing the same error as #21525. I will try again once the changes for it have been verified.
https://hyc-runtimes-jenkins.swg-devops.com/view/Test_grinder/job/Grinder/49165/
