Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JFRTest#jfrPerfTest fails with Quarkus main #236

Closed
zakkak opened this issue Dec 7, 2023 · 20 comments · Fixed by quarkusio/quarkus#37633
Closed

JFRTest#jfrPerfTest fails with Quarkus main #236

zakkak opened this issue Dec 7, 2023 · 20 comments · Fixed by quarkusio/quarkus#37633
Assignees

Comments

@zakkak
Copy link
Collaborator

zakkak commented Dec 7, 2023

How to reproduce

git clone https://github.com/quarkusio/quarkus.git
cd quarkus
./mvnw -Dquickly
cd ../
git clone https://github.com/Karm/mandrel-integration-tests
cd mandrel-integration-tests
mvn clean verify -Dquarkus.version=999-SNAPSHOT -Ptestsuite -Dtest=JFRTest#jfrPerfTest

Failure:

[ERROR] Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 78.177 s <<< FAILURE! - in org.graalvm.tests.integration.JFRTest
[ERROR] jfrPerfTest{TestInfo}  Time elapsed: 77.691 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: Timeout 10s was reached. Empty webpage does not contain string: `hello' ==> expected: <true> but was: <false>
	at org.graalvm.tests.integration.JFRTest.runBenchmarkForApp(JFRTest.java:355)
	at org.graalvm.tests.integration.JFRTest.startComparisonForBenchmark(JFRTest.java:254)
	at org.graalvm.tests.integration.JFRTest.jfrPerfTestRun(JFRTest.java:214)
	at org.graalvm.tests.integration.JFRTest.jfrPerfTest(JFRTest.java:176)

Additional info

The regression is not caused by a GraalVM change, building with Mandrel 23.1.1.0 also doesn't work.
Setting quarkus.version to 3.6.0 makes the test fail in a different stage (possibly a local issue?)

@zakkak
Copy link
Collaborator Author

zakkak commented Dec 7, 2023

First known bad commit seems to be 61edae46e0287ecaee7f5b3ef728ab4a6d465c02 in https://github.com/graalvm/mandrel/actions/runs/7026828114/job/19120880487

(Interestingly this started happening with the HELIDON_QUICKSTART failures #226)

Last known working commit seems to be 670b43cac8587d27d7810af41c22ca3cf161df23 in https://github.com/graalvm/mandrel/actions/runs/7012992753/job/19078995046

@zakkak zakkak self-assigned this Dec 7, 2023
@zakkak
Copy link
Collaborator Author

zakkak commented Dec 7, 2023

The regression was introduced with quarkusio/quarkus#37347 and specifically with the runtime re-initialization of io.netty.util.internal.PlatformDependent0.

@zakkak
Copy link
Collaborator Author

zakkak commented Dec 7, 2023

Inspecting the static fields of io.netty.util.internal.PlatformDependent0 when build-time-initialized and when run-time-initialized I don't see any difference that could explain the failure (at least not at first sight).

Java mode

maxDirectMemory= 15 GB
logger = JdkLogger(io.netty.util.internal.PlatformDependent0)
ADDRESS_FIELD_OFFSET = 16
BYTE_ARRAY_BASE_OFFSET = 16
INT_ARRAY_BASE_OFFSET = 16
INT_ARRAY_INDEX_SCALE = 4
LONG_ARRAY_BASE_OFFSET = 16
LONG_ARRAY_INDEX_SCALE = 8
DIRECT_BUFFER_CONSTRUCTOR = null
EXPLICIT_NO_UNSAFE_CAUSE = null
ALLOCATE_ARRAY_METHOD = null
ALIGN_SLICE = public final java.nio.ByteBuffer java.nio.ByteBuffer.alignedSlice(int)
JAVA_VERSION = 21
IS_ANDROID = false
STORE_FENCE_AVAILABLE = true
UNSAFE_UNAVAILABILITY_CAUSE = null
INTERNAL_UNSAFE = null
RUNNING_IN_NATIVE_IMAGE = false
IS_EXPLICIT_TRY_REFLECTION_SET_ACCESSIBLE = false
UNSAFE = sun.misc.Unsafe@31cefde0
HASH_CODE_ASCII_SEED = -1028477387
HASH_CODE_C1 = -862048943
HASH_CODE_C2 = 461845907
UNSAFE_COPY_THRESHOLD = 1048576
UNALIGNED = true
BITS_MAX_DIRECT_MEMORY = 16634609664
$assertionsDisabled = true

Native mode build-time-initialized

maxDirectMemory= 22 GB
logger = JdkLogger(io.netty.util.internal.PlatformDependent0)
ADDRESS_FIELD_OFFSET = 24
BYTE_ARRAY_BASE_OFFSET = 16
INT_ARRAY_BASE_OFFSET = 16
INT_ARRAY_INDEX_SCALE = 4
LONG_ARRAY_BASE_OFFSET = 16
LONG_ARRAY_INDEX_SCALE = 8
DIRECT_BUFFER_CONSTRUCTOR = null
EXPLICIT_NO_UNSAFE_CAUSE = null
ALLOCATE_ARRAY_METHOD = null
ALIGN_SLICE = public final java.nio.ByteBuffer java.nio.ByteBuffer.alignedSlice(int)
JAVA_VERSION = 21
IS_ANDROID = false
STORE_FENCE_AVAILABLE = true
UNSAFE_UNAVAILABILITY_CAUSE = null
INTERNAL_UNSAFE = null
RUNNING_IN_NATIVE_IMAGE = true
IS_EXPLICIT_TRY_REFLECTION_SET_ACCESSIBLE = true
UNSAFE = sun.misc.Unsafe@65e6a946
HASH_CODE_ASCII_SEED = -1028477387
HASH_CODE_C1 = -862048943
HASH_CODE_C2 = 461845907
UNSAFE_COPY_THRESHOLD = 1048576
UNALIGNED = true
BITS_MAX_DIRECT_MEMORY = -1
$assertionsDisabled = true

Native mode run-time-initialized

maxDirectMemory= 49 GB
logger = JdkLogger(io.netty.util.internal.PlatformDependent0)
ADDRESS_FIELD_OFFSET = 24
BYTE_ARRAY_BASE_OFFSET = 16
INT_ARRAY_BASE_OFFSET = 16
INT_ARRAY_INDEX_SCALE = 4
LONG_ARRAY_BASE_OFFSET = 16
LONG_ARRAY_INDEX_SCALE = 8
DIRECT_BUFFER_CONSTRUCTOR = null
EXPLICIT_NO_UNSAFE_CAUSE = null
ALLOCATE_ARRAY_METHOD = null
ALIGN_SLICE = public final java.nio.ByteBuffer java.nio.ByteBuffer.alignedSlice(int)
JAVA_VERSION = 21
IS_ANDROID = false
STORE_FENCE_AVAILABLE = true
UNSAFE_UNAVAILABILITY_CAUSE = null
INTERNAL_UNSAFE = null
RUNNING_IN_NATIVE_IMAGE = true
IS_EXPLICIT_TRY_REFLECTION_SET_ACCESSIBLE = true
UNSAFE = sun.misc.Unsafe@4417167b
HASH_CODE_ASCII_SEED = -1028477387
HASH_CODE_C1 = -862048943
HASH_CODE_C2 = 461845907
UNSAFE_COPY_THRESHOLD = 1048576
UNALIGNED = true
BITS_MAX_DIRECT_MEMORY = -1
$assertionsDisabled = true

Diff

diff --git a/native-build-time-init-mode.txt b/native-run-time-init-mode.txt
index 6444b24..d0fa1ef 100644
--- a/native-build-time-init-mode.txt
+++ b/native-run-time-init-mode.txt
@@ -1,4 +1,4 @@
-maxDirectMemory= 22 GB
+maxDirectMemory= 49 GB
 logger = JdkLogger(io.netty.util.internal.PlatformDependent0)
 ADDRESS_FIELD_OFFSET = 24
 BYTE_ARRAY_BASE_OFFSET = 16
@@ -17,7 +17,7 @@ UNSAFE_UNAVAILABILITY_CAUSE = null
 INTERNAL_UNSAFE = null
 RUNNING_IN_NATIVE_IMAGE = true
 IS_EXPLICIT_TRY_REFLECTION_SET_ACCESSIBLE = true
-UNSAFE = sun.misc.Unsafe@65e6a946
+UNSAFE = sun.misc.Unsafe@4417167b
 HASH_CODE_ASCII_SEED = -1028477387
 HASH_CODE_C1 = -862048943
 HASH_CODE_C2 = 461845907

@zakkak
Copy link
Collaborator Author

zakkak commented Dec 7, 2023

Although still not clear why, setting io.netty.tryReflectionSetAccessible=false, at build-time (i.e. "reverting" netty/netty#12638) seems to resolve (or work-around) the issue.

Note however, that setting io.netty.tryReflectionSetAccessible=false at run-time doesn't. Hinting that some build-time initialization effects remain.

@zakkak
Copy link
Collaborator Author

zakkak commented Dec 7, 2023

Faster reproducer:

git clone https://github.com/quarkusio/quarkus.git
cd quarkus
./mvnw -Dquickly
cd ../
git clone https://github.com/Karm/mandrel-integration-tests
cd mandrel-integration-tests/apps/jfr-native-image-performance
mvn package -Pnative -Dquarkus.version=999-SNAPSHOT -Dquarkus.native.monitoring=jfr
./target/jfr-runner

In another terminal run curl http://localhost:8080/hello. Instead of hello we get:

curl: (52) Empty reply from server

Note that without -Dquarkus.native.monitoring=jfr (which essentially passes --enable-monitoring=heapdump,jfr to native-image) the server acts as expected. cc @roberttoyonaga

Enabling DEBUG logging doesn't show anything interesting and setting it to ALL results in a build failure (see quarkusio/quarkus#37598).

@roberttoyonaga
Copy link
Collaborator

Hi @zakkak I can look into this

@zakkak
Copy link
Collaborator Author

zakkak commented Dec 7, 2023

Hi @zakkak I can look into this

@roberttoyonaga no need to jump in yet, just pinging you FYI. I will keep looking further and let you know if needed. Thanks!

@zakkak
Copy link
Collaborator Author

zakkak commented Dec 7, 2023

Inspecting the static fields of io.netty.util.internal.PlatformDependent0 and io.netty.util.internal.PlatformDependent with JFR monitoring disabled versus with it enabled I see the following differences:

Native mode build-time-initialized

diff --git a/native-build-time-init-mode.txt b/native-build-time-init-mode-jfr.txt
index f2e49be..3cea102 100644
--- a/native-build-time-init-mode.txt
+++ b/native-build-time-init-mode-jfr.txt
@@ -6,18 +6,18 @@ PlatformDependent0.INT_ARRAY_BASE_OFFSET = 16
 PlatformDependent0.INT_ARRAY_INDEX_SCALE = 4
 PlatformDependent0.LONG_ARRAY_BASE_OFFSET = 16
 PlatformDependent0.LONG_ARRAY_INDEX_SCALE = 8
-PlatformDependent0.DIRECT_BUFFER_CONSTRUCTOR = null
+PlatformDependent0.DIRECT_BUFFER_CONSTRUCTOR = private java.nio.DirectByteBuffer(long,long)
 PlatformDependent0.EXPLICIT_NO_UNSAFE_CAUSE = null
-PlatformDependent0.ALLOCATE_ARRAY_METHOD = null
+PlatformDependent0.ALLOCATE_ARRAY_METHOD = public java.lang.Object jdk.internal.misc.Unsafe.allocateUninitializedArray(java.lang.Class,int)
 PlatformDependent0.ALIGN_SLICE = public final java.nio.ByteBuffer java.nio.ByteBuffer.alignedSlice(int)
 PlatformDependent0.JAVA_VERSION = 21
 PlatformDependent0.IS_ANDROID = false
 PlatformDependent0.STORE_FENCE_AVAILABLE = true
 PlatformDependent0.UNSAFE_UNAVAILABILITY_CAUSE = null
-PlatformDependent0.INTERNAL_UNSAFE = null
+PlatformDependent0.INTERNAL_UNSAFE = jdk.internal.misc.Unsafe@438c5457
 PlatformDependent0.RUNNING_IN_NATIVE_IMAGE = true
 PlatformDependent0.IS_EXPLICIT_TRY_REFLECTION_SET_ACCESSIBLE = true
-PlatformDependent0.UNSAFE = sun.misc.Unsafe@970d61f
+PlatformDependent0.UNSAFE = sun.misc.Unsafe@62bf4323
 PlatformDependent0.HASH_CODE_ASCII_SEED = -1028477387
 PlatformDependent0.HASH_CODE_C1 = -862048943
 PlatformDependent0.HASH_CODE_C2 = 461845907
@@ -31,7 +31,7 @@ PlatformDependent.MAYBE_SUPER_USER = false
 PlatformDependent.CAN_ENABLE_TCP_NODELAY_BY_DEFAULT = true
 PlatformDependent.UNSAFE_UNAVAILABILITY_CAUSE = null
 PlatformDependent.DIRECT_BUFFER_PREFERRED = true
-PlatformDependent.MAX_DIRECT_MEMORY = 24660934656
+PlatformDependent.MAX_DIRECT_MEMORY = 24593825792
 PlatformDependent.MPSC_CHUNK_SIZE = 1024
 PlatformDependent.MIN_MAX_MPSC_CAPACITY = 2048
 PlatformDependent.MAX_ALLOWED_MPSC_CAPACITY = 1073741824
@@ -40,22 +40,22 @@ PlatformDependent.TMPDIR = /tmp
 PlatformDependent.BIT_MODE = 64
 PlatformDependent.NORMALIZED_ARCH = x86_64
 PlatformDependent.NORMALIZED_OS = linux
-PlatformDependent.ALLOWED_LINUX_OS_CLASSIFIERS = [Ljava.lang.String;@52d3654e
+PlatformDependent.ALLOWED_LINUX_OS_CLASSIFIERS = [Ljava.lang.String;@20d2ffcc
 PlatformDependent.LINUX_OS_CLASSIFIERS = [fedora]
 PlatformDependent.IS_WINDOWS = false
 PlatformDependent.IS_OSX = false
 PlatformDependent.IS_J9_JVM = false
 PlatformDependent.IS_IVKVM_DOT_NET = false
 PlatformDependent.ADDRESS_SIZE = 8
-PlatformDependent.USE_DIRECT_BUFFER_NO_CLEANER = false
-PlatformDependent.DIRECT_MEMORY_COUNTER = null
-PlatformDependent.DIRECT_MEMORY_LIMIT = 24660934656
-PlatformDependent.RANDOM_PROVIDER = io.netty.util.internal.PlatformDependent$2@71b6d77f
-PlatformDependent.CLEANER = io.netty.util.internal.CleanerJava9@164d4f40
-PlatformDependent.UNINITIALIZED_ARRAY_ALLOCATION_THRESHOLD = -1
-PlatformDependent.OS_RELEASE_FILES = [Ljava.lang.String;@47c35bd3
+PlatformDependent.USE_DIRECT_BUFFER_NO_CLEANER = true
+PlatformDependent.DIRECT_MEMORY_COUNTER = 0
+PlatformDependent.DIRECT_MEMORY_LIMIT = 24593825792
+PlatformDependent.RANDOM_PROVIDER = io.netty.util.internal.PlatformDependent$2@15f1832f
+PlatformDependent.CLEANER = io.netty.util.internal.CleanerJava9@3d0d87db
+PlatformDependent.UNINITIALIZED_ARRAY_ALLOCATION_THRESHOLD = 1024
+PlatformDependent.OS_RELEASE_FILES = [Ljava.lang.String;@59fc3392
 PlatformDependent.LINUX_ID_PREFIX = ID=
 PlatformDependent.LINUX_ID_LIKE_PREFIX = ID_LIKE=
 PlatformDependent.BIG_ENDIAN_NATIVE_ORDER = false
-PlatformDependent.NOOP = io.netty.util.internal.PlatformDependent$1@e515370
+PlatformDependent.NOOP = io.netty.util.internal.PlatformDependent$1@26842a93
 PlatformDependent.$assertionsDisabled = true

Native mode run-time-initialized

diff --git a/native-run-time-init-mode.txt b/native-run-time-init-mode-jfr.txt
index 5a1aa1f..c31a84c 100644
--- a/native-run-time-init-mode.txt
+++ b/native-run-time-init-mode-jfr.txt
@@ -6,7 +6,7 @@ PlatformDependent0.INT_ARRAY_BASE_OFFSET = 16
 PlatformDependent0.INT_ARRAY_INDEX_SCALE = 4
 PlatformDependent0.LONG_ARRAY_BASE_OFFSET = 16
 PlatformDependent0.LONG_ARRAY_INDEX_SCALE = 8
-PlatformDependent0.DIRECT_BUFFER_CONSTRUCTOR = null
+PlatformDependent0.DIRECT_BUFFER_CONSTRUCTOR = private java.nio.DirectByteBuffer(long,long)
 PlatformDependent0.EXPLICIT_NO_UNSAFE_CAUSE = null
 PlatformDependent0.ALLOCATE_ARRAY_METHOD = null
 PlatformDependent0.ALIGN_SLICE = public final java.nio.ByteBuffer java.nio.ByteBuffer.alignedSlice(int)
@@ -14,10 +14,10 @@ PlatformDependent0.JAVA_VERSION = 21
 PlatformDependent0.IS_ANDROID = false
 PlatformDependent0.STORE_FENCE_AVAILABLE = true
 PlatformDependent0.UNSAFE_UNAVAILABILITY_CAUSE = null
-PlatformDependent0.INTERNAL_UNSAFE = null
+PlatformDependent0.INTERNAL_UNSAFE = jdk.internal.misc.Unsafe@147a7def
 PlatformDependent0.RUNNING_IN_NATIVE_IMAGE = true
 PlatformDependent0.IS_EXPLICIT_TRY_REFLECTION_SET_ACCESSIBLE = true
-PlatformDependent0.UNSAFE = sun.misc.Unsafe@744c41dd
+PlatformDependent0.UNSAFE = sun.misc.Unsafe@37286f61
 PlatformDependent0.HASH_CODE_ASCII_SEED = -1028477387
 PlatformDependent0.HASH_CODE_C1 = -862048943
 PlatformDependent0.HASH_CODE_C2 = 461845907
@@ -40,22 +40,22 @@ PlatformDependent.TMPDIR = /tmp
 PlatformDependent.BIT_MODE = 64
 PlatformDependent.NORMALIZED_ARCH = x86_64
 PlatformDependent.NORMALIZED_OS = linux
-PlatformDependent.ALLOWED_LINUX_OS_CLASSIFIERS = [Ljava.lang.String;@656445c5
+PlatformDependent.ALLOWED_LINUX_OS_CLASSIFIERS = [Ljava.lang.String;@48221138
 PlatformDependent.LINUX_OS_CLASSIFIERS = [fedora]
 PlatformDependent.IS_WINDOWS = false
 PlatformDependent.IS_OSX = false
 PlatformDependent.IS_J9_JVM = false
 PlatformDependent.IS_IVKVM_DOT_NET = false
 PlatformDependent.ADDRESS_SIZE = 8
-PlatformDependent.USE_DIRECT_BUFFER_NO_CLEANER = false
-PlatformDependent.DIRECT_MEMORY_COUNTER = null
+PlatformDependent.USE_DIRECT_BUFFER_NO_CLEANER = true
+PlatformDependent.DIRECT_MEMORY_COUNTER = 0
 PlatformDependent.DIRECT_MEMORY_LIMIT = 53206843392
-PlatformDependent.RANDOM_PROVIDER = io.netty.util.internal.PlatformDependent$2@301cc3e2
-PlatformDependent.CLEANER = io.netty.util.internal.CleanerJava9@8905ee9
+PlatformDependent.RANDOM_PROVIDER = io.netty.util.internal.PlatformDependent$2@6f6af829
+PlatformDependent.CLEANER = io.netty.util.internal.CleanerJava9@140b865c
 PlatformDependent.UNINITIALIZED_ARRAY_ALLOCATION_THRESHOLD = -1
-PlatformDependent.OS_RELEASE_FILES = [Ljava.lang.String;@5b464f2a
+PlatformDependent.OS_RELEASE_FILES = [Ljava.lang.String;@70835f0e
 PlatformDependent.LINUX_ID_PREFIX = ID=
 PlatformDependent.LINUX_ID_LIKE_PREFIX = ID_LIKE=
 PlatformDependent.BIG_ENDIAN_NATIVE_ORDER = false
-PlatformDependent.NOOP = io.netty.util.internal.PlatformDependent$1@2cd8284d
+PlatformDependent.NOOP = io.netty.util.internal.PlatformDependent$1@5d61e8ac
 PlatformDependent.$assertionsDisabled = true

FTR I am using https://github.com/zakkak/issue-reproducers/tree/2023-12-07-mandrel-it-tests-236 to get these.

@zakkak
Copy link
Collaborator Author

zakkak commented Dec 7, 2023

Enabling DEBUG logging doesn't show anything interesting

I take that back...

When jfr monitoring is enabled I see

java.nio.Bits.unaligned: available, true

While when it is disabled I see:

java.nio.Bits.unaligned: unavailable, true, Unable to make static boolean java.nio.Bits.unaligned() accessible: module java.base does not "opens java.nio" to unnamed module @21bcffb5

which indicates that enabling jfr monitoring somehow results in java.base opening java.nio to the unnamed module used by Quarkus.

@zakkak
Copy link
Collaborator Author

zakkak commented Dec 7, 2023

@roberttoyonaga does this ^^ ring any bells?

@roberttoyonaga
Copy link
Collaborator

does this ^^ ring any bells?

hmm not really

@roberttoyonaga
Copy link
Collaborator

@zakkak So just to confirm, you're seeing that build-time init of io.netty.util.internal.PlatformDependent0 is fine (with and without JFR enabled). But runtime init/re-init causes issues?

The only part of the diff you shared here that immediately strikes me as suspicious is that build time init with JFR has:

-PlatformDependent0.ALLOCATE_ARRAY_METHOD = null
+PlatformDependent0.ALLOCATE_ARRAY_METHOD = public java.lang.Object jdk.internal.misc.Unsafe.allocateUninitializedArray(java.lang.Class,int)

While runtime init has no diff: PlatformDependent0.ALLOCATE_ARRAY_METHOD = null

Maybe this is not an issue, but I've fixed bugs in the past related to allocation events causing issues at startup (when current_thread hasn't yet been assigned or other things are not set up yet). Maybe something along those lines is happening since it seems like quarkus is failing to respond to the first request. It's just a hunch though. I can try to look into it more after the serviceability meetup.

@zakkak
Copy link
Collaborator Author

zakkak commented Dec 8, 2023

@zakkak So just to confirm, you're seeing that build-time init of io.netty.util.internal.PlatformDependent0 is fine (with and without JFR enabled). But runtime init/re-init causes issues?

Correct, but only when JFR is enabled.

The only part of the diff you shared here that immediately strikes me as suspicious is that build time init with JFR has:

Yes that's interesting, but it still used to work with that diff (we used to build-time-initialize these classes without observing CI failures, but now we run-time-initialize them and we get this CI failure).

The comparison of run-time-init (current approach of Quarkus for these classes) with and without JFR in #236 (comment) is also pretty interesting:

-PlatformDependent0.DIRECT_BUFFER_CONSTRUCTOR = null
+PlatformDependent0.DIRECT_BUFFER_CONSTRUCTOR = private java.nio.DirectByteBuffer(long,long)
-PlatformDependent0.INTERNAL_UNSAFE = null
+PlatformDependent0.INTERNAL_UNSAFE = jdk.internal.misc.Unsafe@147a7def
-PlatformDependent.USE_DIRECT_BUFFER_NO_CLEANER = false
+PlatformDependent.USE_DIRECT_BUFFER_NO_CLEANER = true
-PlatformDependent.DIRECT_MEMORY_COUNTER = null
+PlatformDependent.DIRECT_MEMORY_COUNTER = 0

My next step would be to check what is enabled in GraalVM / Mandrel when enabling jfr monitoring and why it affects netty's run-time initialization.

@zakkak
Copy link
Collaborator Author

zakkak commented Dec 8, 2023

which indicates that enabling jfr monitoring somehow results in java.base opening java.nio to the unnamed module used by Quarkus.

Confirmed, it happens here https://github.com/oracle/graal/blob/4dd8a9d611b78668bb84116cf7ff8cc7e4146907/substratevm/src/com.oracle.svm.core/src/com/oracle/svm/core/jfr/JfrFeature.java#L156

I will try to do the same in Quarkus to see how it will behave.

zakkak added a commit to zakkak/quarkus that referenced this issue Dec 8, 2023
Since we moved `io.netty.util.internal.PlatformDependent`'s and
`io.netty.util.internal.PlatformDependent0`' initialization at runtime
we need to register `java.nio.DirectByteBuffer`'s constructors for
reflective access to avoid miss-configurations due to not being able to
access the constructor in
https://github.com/netty/netty/blob/5db037beedca8aa5b6a486fa515cc6af013ced74/common/src/main/java/io/netty/util/internal/PlatformDependent0.java#L290-L292

More classes and methods need to be registered as discussed in
quarkusio#37626

Closes Karm/mandrel-integration-tests#236
zakkak added a commit to zakkak/quarkus that referenced this issue Dec 8, 2023
Since we moved `io.netty.util.internal.PlatformDependent`'s and
`io.netty.util.internal.PlatformDependent0`' initialization at runtime
we need to register `java.nio.DirectByteBuffer`'s constructors for
reflective access to avoid miss-configurations due to not being able to
access the constructor in
https://github.com/netty/netty/blob/5db037beedca8aa5b6a486fa515cc6af013ced74/common/src/main/java/io/netty/util/internal/PlatformDependent0.java#L290-L292

More classes and methods need to be registered as discussed in
quarkusio#37626

Closes Karm/mandrel-integration-tests#236
@zakkak
Copy link
Collaborator Author

zakkak commented Dec 8, 2023

I found what's the route cause, see quarkusio/quarkus#37626, but I still have not a clear understanding of why (which code-flow it triggers) we see failures only with JFR enabled.

@roberttoyonaga
Copy link
Collaborator

Nice! And I guess setting the workaround of io.netty.tryReflectionSetAccessible=false prevents it from trying to continue without the correct configuration. Yeah it doesn't yet make sense how adding JFR contributes to this problem.

zakkak added a commit to zakkak/quarkus that referenced this issue Dec 8, 2023
In quarkusio#37347 we moved
`PlatformDependent` and `PlatformDependent0` classes to
run-time-initialization as they are platform-dependent and thus need to
be initialized on the target platform. It looks like there are more
classes like these that need to be runtime initialized as they
transitively rely on platform dependent values.

Closes Karm/mandrel-integration-tests#236

Supersedes quarkusio#37628
@zakkak
Copy link
Collaborator Author

zakkak commented Dec 9, 2023

FTR The stack trace in Quarkus leading to the failure is:

2023-12-09 00:36:04,630 TRACE [io.ver.cor.htt.imp.HttpServerImpl] (vert.x-eventloop-thread-0) Connection failure: java.lang.Error: java.lang.NullPointerException
	at io.netty.util.internal.PlatformDependent0.newDirectBuffer(PlatformDependent0.java:620)
	at io.netty.util.internal.PlatformDependent0.allocateDirectNoCleaner(PlatformDependent0.java:579)
	at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:774)
	at io.netty.buffer.UnpooledUnsafeNoCleanerDirectByteBuf.allocateDirect(UnpooledUnsafeNoCleanerDirectByteBuf.java:30)
	at io.netty.buffer.UnpooledByteBufAllocator$InstrumentedUnpooledUnsafeNoCleanerDirectByteBuf.allocateDirect(UnpooledByteBufAllocator.java:186)
	at io.netty.buffer.UnpooledDirectByteBuf.<init>(UnpooledDirectByteBuf.java:64)
	at io.netty.buffer.UnpooledUnsafeDirectByteBuf.<init>(UnpooledUnsafeDirectByteBuf.java:41)
	at io.netty.buffer.UnpooledUnsafeNoCleanerDirectByteBuf.<init>(UnpooledUnsafeNoCleanerDirectByteBuf.java:25)
	at io.netty.buffer.UnpooledByteBufAllocator$InstrumentedUnpooledUnsafeNoCleanerDirectByteBuf.<init>(UnpooledByteBufAllocator.java:181)
	at io.netty.buffer.UnpooledByteBufAllocator.newDirectBuffer(UnpooledByteBufAllocator.java:91)
	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188)
	at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179)
	at io.netty.buffer.Unpooled.directBuffer(Unpooled.java:128)
	at io.netty.handler.codec.http.HttpObjectEncoder.<clinit>(HttpObjectEncoder.java:59)
	at [email protected]/java.lang.Class.ensureInitialized(DynamicHub.java:595)
	at [email protected]/java.lang.Class.ensureInitialized(DynamicHub.java:595)
	at io.vertx.core.http.impl.HttpServerWorker.configureHttp1OrH2C(HttpServerWorker.java:276)
	at io.vertx.core.http.impl.HttpServerWorker.handleHttp1(HttpServerWorker.java:209)
	at io.vertx.core.http.impl.HttpServerWorker.access$200(HttpServerWorker.java:50)
	at io.vertx.core.http.impl.HttpServerWorker$1.configure(HttpServerWorker.java:180)
	at io.vertx.core.http.impl.Http1xOrH2CHandler.end(Http1xOrH2CHandler.java:59)
	at io.vertx.core.http.impl.Http1xOrH2CHandler.channelRead(Http1xOrH2CHandler.java:38)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:442)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
	at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
	at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
	at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
	at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)
	at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:788)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:724)
	at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:650)
	at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:562)
	at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
	at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
	at [email protected]/java.lang.Thread.runWith(Thread.java:1596)
	at [email protected]/java.lang.Thread.run(Thread.java:1583)
	at org.graalvm.nativeimage.builder/com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:832)
	at org.graalvm.nativeimage.builder/com.oracle.svm.core.posix.thread.PosixPlatformThreads.pthreadStartRoutine(PosixPlatformThreads.java:211)
Caused by: java.lang.NullPointerException
	at io.netty.util.internal.PlatformDependent0.newDirectBuffer(PlatformDependent0.java:614)
	... 44 more

Which is the result of PlatformDependent.USE_DIRECT_BUFFER_NO_CLEANER being set (and observed by other classes) at build time while PlatformDependent0.DIRECT_BUFFER_CONSTRUCTOR is null at run time (in Quarkus, because in my tests above it appears to be set :/)

DEBUG [io.net.uti.int.PlatformDependent0] (Thread-3) java.nio.DirectByteBuffer.<init>(long, {int,long}): unavailable

@roberttoyonaga
Copy link
Collaborator

roberttoyonaga commented Dec 9, 2023

but I still have not a clear understanding of why (which code-flow it triggers) we see failures only with JFR enabled.

Is it possible that java.base opening java.nio to the unnamed module is the reason that it only fails with JFR included at build time? When JFR is enabled, PlatformDependent0.DIRECT_BUFFER_CONSTRUCTOR = private java.nio.DirectByteBuffer(long,long), which causes PlatformDependent.USE_DIRECT_BUFFER_NO_CLEANER to be set, resulting the problems at runtime due to missing reflection config. Maybe when JFR is not enabled and java.base doesn't open java.nio, PlatformDependent0.DIRECT_BUFFER_CONSTRUCTOR must remain null (as shown in your diff) because java.nio.DirectByteBuffer can't be reflectively accessed. That would result in avoiding the problem you explained above.

If I remove https://github.com/oracle/graal/blob/4dd8a9d611b78668bb84116cf7ff8cc7e4146907/substratevm/src/com.oracle.svm.core/src/com/oracle/svm/core/jfr/JfrFeature.java#L156 , then everything seems to work fine (I get the expected "hello" response).

@zakkak
Copy link
Collaborator Author

zakkak commented Dec 9, 2023

Is it possible that java.base opening java.nio to the unnamed module is the reason that it only fails with JFR included at build time?

Yes, although it's still not clear to me why. Thanks for confirming this with a modified GraalVM version.

zakkak added a commit to zakkak/quarkus that referenced this issue Dec 11, 2023
In quarkusio#37347 we moved
`PlatformDependent` and `PlatformDependent0` classes to
run-time-initialization as they are platform-dependent and thus need to
be initialized on the target platform. It looks like there are more
classes like these that need to be runtime initialized as they
transitively rely on platform dependent values.

Closes Karm/mandrel-integration-tests#236

Supersedes quarkusio#37628
@zakkak
Copy link
Collaborator Author

zakkak commented Dec 14, 2023

Closed by quarkusio/quarkus#37633

I will create a new issue for the discrepancies when using --enable-monitoring=jfr

@zakkak zakkak closed this as completed Dec 14, 2023
gsmet pushed a commit to gsmet/quarkus that referenced this issue Jan 15, 2024
In quarkusio#37347 we moved
`PlatformDependent` and `PlatformDependent0` classes to
run-time-initialization as they are platform-dependent and thus need to
be initialized on the target platform. It looks like there are more
classes like these that need to be runtime initialized as they
transitively rely on platform dependent values.

Closes Karm/mandrel-integration-tests#236

Supersedes quarkusio#37628

(cherry picked from commit 72d7b79)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants