Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enabling JFR monitoring has side effects in Netty's initialization/configuration (and possibly in other libraries as well) that possibly skew the observed metrics. #8020

Closed
zakkak opened this issue Dec 14, 2023 · 5 comments

Comments

@zakkak
Copy link
Collaborator

zakkak commented Dec 14, 2023

Describe the issue
When enabling JFR monitoring (--enable-monitoring=jfr) we observe that some constants in io.netty.util.internal.PlatformDependent and io.netty.util.internal.PlatformDependent0 are initialized to a different value.

Steps to reproduce the issue

cd /tmp
git clone --branch 2023-12-07-mandrel-it-tests-236 https://github.com/zakkak/issue-reproducers reproducer
cd reproducer
mvn package
java -agentlib:native-image-agent=config-output-dir=src/main/resources/META-INF/native-image -jar target/reproducer-1.0-SNAPSHOT.jar
mvn package
native-image -jar target/reproducer-1.0-SNAPSHOT.jar
./reproducer-1.0-SNAPSHOT | tee no-jfr.txt
native-image -jar target/reproducer-1.0-SNAPSHOT.jar --enable-monitoring=jfr
./reproducer-1.0-SNAPSHOT | tee jfr.txt
diff no-jfr.txt jfr.txt

This will result in a diff like the following:

9c9
< PlatformDependent0.DIRECT_BUFFER_CONSTRUCTOR = null
---
> PlatformDependent0.DIRECT_BUFFER_CONSTRUCTOR = private java.nio.DirectByteBuffer(long,long)
17c17
< PlatformDependent0.INTERNAL_UNSAFE = null
---
> PlatformDependent0.INTERNAL_UNSAFE = jdk.internal.misc.Unsafe@67141b8a
20c20
< PlatformDependent0.UNSAFE = sun.misc.Unsafe@41c5ea82
---
> PlatformDependent0.UNSAFE = sun.misc.Unsafe@4b7a3a11
43c43
< PlatformDependent.ALLOWED_LINUX_OS_CLASSIFIERS = [Ljava.lang.String;@1851c3b0
---
> PlatformDependent.ALLOWED_LINUX_OS_CLASSIFIERS = [Ljava.lang.String;@7e8b3bed
50,51c50,51
< PlatformDependent.USE_DIRECT_BUFFER_NO_CLEANER = false
< PlatformDependent.DIRECT_MEMORY_COUNTER = null
---
> PlatformDependent.USE_DIRECT_BUFFER_NO_CLEANER = true
> PlatformDependent.DIRECT_MEMORY_COUNTER = 0
53,54c53,54
< PlatformDependent.RANDOM_PROVIDER = io.netty.util.internal.PlatformDependent$2@39d85d2a
< PlatformDependent.CLEANER = io.netty.util.internal.CleanerJava9@74559a6f
---
> PlatformDependent.RANDOM_PROVIDER = io.netty.util.internal.PlatformDependent$2@1118c43
> PlatformDependent.CLEANER = io.netty.util.internal.CleanerJava9@227e5b5e
56c56
< PlatformDependent.OS_RELEASE_FILES = [Ljava.lang.String;@302f813f
---
> PlatformDependent.OS_RELEASE_FILES = [Ljava.lang.String;@4b74c330
60c60
< PlatformDependent.NOOP = io.netty.util.internal.PlatformDependent$1@63c6470
---
> PlatformDependent.NOOP = io.netty.util.internal.PlatformDependent$1@300279ad

Where the interesting parts are:

9c9
< PlatformDependent0.DIRECT_BUFFER_CONSTRUCTOR = null
---
> PlatformDependent0.DIRECT_BUFFER_CONSTRUCTOR = private java.nio.DirectByteBuffer(long,long)
17c17
< PlatformDependent0.INTERNAL_UNSAFE = null
---
> PlatformDependent0.INTERNAL_UNSAFE = jdk.internal.misc.Unsafe@67141b8a
50,51c50,51
< PlatformDependent.USE_DIRECT_BUFFER_NO_CLEANER = false
< PlatformDependent.DIRECT_MEMORY_COUNTER = null
---
> PlatformDependent.USE_DIRECT_BUFFER_NO_CLEANER = true
> PlatformDependent.DIRECT_MEMORY_COUNTER = 0

i.e. in one case Netty has access to the java.nio.DirectByteBuffer while in the other it doesn't, same for jdk.internal.misc.Unsafe, leading to different code paths' being taken when using JFR than when not. This might result in missleading observations when using JFR.

Describe GraalVM and your environment:

  • GraalVM version (latest snapshot builds can be found here), or commit id if built from source: dcbe8f4
  • JDK major version: 22
  • OS: Fedora 39
  • Architecture: AMD64

More details

Our so far understanding is that this is caused by the opening of java.base/java.nio to the unnamed module.

According to @roberttoyonaga's expirement if we remove that particular line we no longer observe the issue.

FTR: This issue was first detected and studied in Karm/mandrel-integration-tests#236

cc @christianhaeubl @roberttoyonaga

@roberttoyonaga
Copy link
Collaborator

Maybe we can scope it down and not open all of java.base, only what we really need. I can look into this.

@roberttoyonaga
Copy link
Collaborator

It seems like opening jdk.jfr and java.base has no real impact. When I comment out the relevant code, all the native unittests still pass. Manual inspection of snapshots in JMC also shows no issues.

When I look at the git history com.oracle.svm.core.jfr.JfrFeature it seems like we've been opening jdk.jfr and java.base to the unnamed module since inception. Perhaps this is outdated legacy code and can be removed now. @christianhaeubl do you know if this is the case?

The only other thing I can think of is that this is necessary for using JFR at build time. I'll look into this more.

@roberttoyonaga
Copy link
Collaborator

The only other thing I can think of is that this is necessary for using JFR at build time. I'll look into this more.

Just tested by forcing JFR to run during image build without opening jdk.jfr and java.base to the unnamed module. It seems to work fine.

@jerboaa
Copy link
Collaborator

jerboaa commented Dec 21, 2023

@roberttoyonaga @zakkak Can this be closed now? Should be fixed with #8037, no?

@roberttoyonaga
Copy link
Collaborator

@roberttoyonaga @zakkak Can this be closed now? Should be fixed with #8037, no?

Yes this can be closed now! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants