Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[apache-ci] nightly native build-and-test job for kogito-apps fails on data-audit-quarkus-jpa-service #1992

Open
domhanak opened this issue Feb 19, 2024 · 7 comments

Comments

@domhanak
Copy link

domhanak commented Feb 19, 2024

Describe the bug

Nightly native build job of kogito-apps [1] fails consistently for past week.

The failure i caused by data-audit-quarkus-jpa-service project.
Relevant output from logs:

[2024-02-19T02:57:14.709Z] ========================================================================================================================
[2024-02-19T02:57:14.709Z] GraalVM Native Image: Generating 'data-audit-quarkus-jpa-service-999-SNAPSHOT-runner' (executable)...
[2024-02-19T02:57:14.709Z] ========================================================================================================================
[2024-02-19T02:57:14.709Z] For detailed information and explanations on the build output, visit:
[2024-02-19T02:57:14.709Z] https://github.com/oracle/graal/blob/master/docs/reference-manual/native-image/BuildOutput.md
[2024-02-19T02:57:14.709Z] ------------------------------------------------------------------------------------------------------------------------
[2024-02-19T02:57:14.709Z] Warning: The host machine does not support all features of 'x86-64-v3'. Falling back to '-march=compatibility' for best compatibility.
[2024-02-19T02:57:19.239Z] [1/8] Initializing...                                                                                   (11.9s @ 0.60GB)
[2024-02-19T02:57:19.239Z]  Java version: 17.0.10+7, vendor version: Mandrel-23.0.3.0-Final
[2024-02-19T02:57:19.239Z]  Graal compiler: optimization level: 2, target machine: compatibility
[2024-02-19T02:57:19.239Z]  C compiler: gcc (redhat, x86_64, 8.5.0)
[2024-02-19T02:57:19.239Z]  Garbage collector: Serial GC (max heap size: 80% of RAM)
[2024-02-19T02:57:19.239Z]  9 user-specific feature(s)
[2024-02-19T02:57:19.239Z]  - com.oracle.svm.thirdparty.gson.GsonFeature
[2024-02-19T02:57:19.239Z]  - io.quarkus.caffeine.runtime.graal.CacheConstructorsFeature
[2024-02-19T02:57:19.239Z]  - io.quarkus.hibernate.orm.runtime.graal.DisableLoggingFeature: Disables INFO logging during the analysis phase
[2024-02-19T02:57:19.239Z]  - io.quarkus.jdbc.h2.runtime.H2Reflections: Support for H2 Database's extended data types
[2024-02-19T02:57:19.239Z]  - io.quarkus.jdbc.postgresql.runtime.graal.SQLXMLFeature
[2024-02-19T02:57:19.239Z]  - io.quarkus.runner.Feature: Auto-generated class by Quarkus from the existing extensions
[2024-02-19T02:57:19.239Z]  - io.quarkus.runtime.graal.DisableLoggingFeature: Disables INFO logging during the analysis phase
[2024-02-19T02:57:19.239Z]  - org.eclipse.angus.activation.nativeimage.AngusActivationFeature
[2024-02-19T02:57:19.239Z]  - org.hibernate.graalvm.internal.GraalVMStaticFeature: Hibernate ORM's static reflection registrations for GraalVM
[2024-02-19T02:57:20.643Z] 02:57:20,274 WARN  [org.hib.orm.deprecation] HHH90000025: H2Dialect does not need to be specified explicitly using 'hibernate.dialect' (remove the property setting and it will be selected by default)
[2024-02-19T02:58:39.732Z] [2/8] Performing analysis...  [*****]                                                                   (69.4s @ 2.96GB)
[2024-02-19T02:58:39.732Z]   27,503 (90.27%) of 30,469 types reachable
[2024-02-19T02:58:39.732Z]   42,872 (65.46%) of 65,494 fields reachable
[2024-02-19T02:58:39.732Z]  140,077 (58.21%) of 240,639 methods reachable
[2024-02-19T02:58:39.732Z]    7,716 types,   318 fields, and 4,693 methods registered for reflection
[2024-02-19T02:58:39.732Z]       64 types,    70 fields, and    55 methods registered for JNI access
[2024-02-19T02:58:39.732Z]        4 native libraries: dl, pthread, rt, z
[2024-02-19T02:58:39.732Z] 
[2024-02-19T02:58:39.732Z] Error: Detected an instance of Random/SplittableRandom class in the image heap. Instances created during image generation have cached seed values and don't behave as expected.  To see how this object got instantiated use --trace-object-instantiation=java.util.Random. The object was probably created by a class initializer and is reachable from a static field. You can request class initialization at image runtime by using the option --initialize-at-run-time=<class-name>. Or you can write your own initialization methods and call them explicitly from your main entry point.
[2024-02-19T02:58:39.732Z] Detailed message:
[2024-02-19T02:58:39.732Z] Trace: Object was reached by
[2024-02-19T02:58:39.732Z]   trying to constant fold static field org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.RNG
[2024-02-19T02:58:39.732Z]     at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.access$100(SaslClientAuthenticator.java:77)
[2024-02-19T02:58:39.732Z]   parsing method org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.access$100(SaslClientAuthenticator.java:77) reachable via the parsing context
[2024-02-19T02:58:39.732Z]     at static root method.(Unknown Source)
[2024-02-19T02:58:39.732Z] 
[2024-02-19T02:58:39.732Z] 
[2024-02-19T02:58:39.732Z] com.oracle.svm.core.util.UserError$UserException: Detected an instance of Random/SplittableRandom class in the image heap. Instances created during image generation have cached seed values and don't behave as expected.  To see how this object got instantiated use --trace-object-instantiation=java.util.Random. The object was probably created by a class initializer and is reachable from a static field. You can request class initialization at image runtime by using the option --initialize-at-run-time=<class-name>. Or you can write your own initialization methods and call them explicitly from your main entry point.
[2024-02-19T02:58:39.732Z] Detailed message:
[2024-02-19T02:58:39.732Z] Trace: Object was reached by
[2024-02-19T02:58:39.732Z]   trying to constant fold static field org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.RNG
[2024-02-19T02:58:39.732Z]     at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.access$100(SaslClientAuthenticator.java:77)
[2024-02-19T02:58:39.732Z]   parsing method org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.access$100(SaslClientAuthenticator.java:77) reachable via the parsing context
[2024-02-19T02:58:39.732Z]     at static root method.(Unknown Source)
[2024-02-19T02:58:39.732Z] 
[2024-02-19T02:58:39.732Z] 
[2024-02-19T02:58:39.732Z] 	at org.graalvm.nativeimage.builder/com.oracle.svm.core.util.UserError.abort(UserError.java:85)
[2024-02-19T02:58:39.732Z] 	at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.FallbackFeature.reportAsFallback(FallbackFeature.java:248)
[2024-02-19T02:58:39.732Z] 	at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGenerator.runPointsToAnalysis(NativeImageGenerator.java:798)
[2024-02-19T02:58:39.732Z] 	at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGenerator.doRun(NativeImageGenerator.java:582)
[2024-02-19T02:58:39.732Z] 	at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGenerator.run(NativeImageGenerator.java:539)
[2024-02-19T02:58:39.732Z] 	at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGeneratorRunner.buildImage(NativeImageGeneratorRunner.java:408)
[2024-02-19T02:58:39.732Z] 	at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGeneratorRunner.build(NativeImageGeneratorRunner.java:612)
[2024-02-19T02:58:39.732Z] 	at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGeneratorRunner.start(NativeImageGeneratorRunner.java:134)
[2024-02-19T02:58:39.732Z] 	at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGeneratorRunner.main(NativeImageGeneratorRunner.java:94)
[2024-02-19T02:58:39.732Z] Caused by: com.oracle.graal.pointsto.constraints.UnsupportedFeatureException: Detected an instance of Random/SplittableRandom class in the image heap. Instances created during image generation have cached seed values and don't behave as expected.  To see how this object got instantiated use --trace-object-instantiation=java.util.Random. The object was probably created by a class initializer and is reachable from a static field. You can request class initialization at image runtime by using the option --initialize-at-run-time=<class-name>. Or you can write your own initialization methods and call them explicitly from your main entry point.
[2024-02-19T02:58:39.732Z] Detailed message:
[2024-02-19T02:58:39.732Z] Trace: Object was reached by
[2024-02-19T02:58:39.732Z]   trying to constant fold static field org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.RNG
[2024-02-19T02:58:39.732Z]     at org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.access$100(SaslClientAuthenticator.java:77)
[2024-02-19T02:58:39.732Z]   parsing method org.apache.kafka.common.security.authenticator.SaslClientAuthenticator.access$100(SaslClientAuthenticator.java:77) reachable via the parsing context
[2024-02-19T02:58:39.732Z]     at static root method.(Unknown Source)
[2024-02-19T02:58:39.732Z] 
[2024-02-19T02:58:39.732Z] 
[2024-02-19T02:58:39.732Z] 	at org.graalvm.nativeimage.pointsto/com.oracle.graal.pointsto.constraints.UnsupportedFeatures.report(UnsupportedFeatures.java:126)
[2024-02-19T02:58:39.732Z] 	at org.graalvm.nativeimage.builder/com.oracle.svm.hosted.NativeImageGenerator.runPointsToAnalysis(NativeImageGenerator.java:793)
[2024-02-19T02:58:39.732Z] 	... 6 more
[2024-02-19T02:58:39.732Z] ------------------------------------------------------------------------------------------------------------------------
[2024-02-19T02:58:39.732Z]                         5.4s (6.4% of total time) in 36 GCs | Peak RSS: 5.45GB | CPU load: 13.71
[2024-02-19T02:58:39.732Z] ------------------------------------------------------------------------------------------------------------------------

[1] https://ci-builds.apache.org/job/KIE/job/kogito/job/main/job/nightly.native/job/kogito-apps.build-and-test/

Expected behavior

The job will be passing and no test failures:
https://ci-builds.apache.org/job/KIE/job/kogito/job/main/job/nightly.native/job/kogito-apps.build-and-test/lastCompletedBuild/testReport/

Actual behavior

Job si failing and there are test failures.
https://ci-builds.apache.org/job/KIE/job/kogito/job/main/job/nightly.native/job/kogito-apps.build-and-test/lastCompletedBuild/testReport/

How to Reproduce?

Execute https://ci-builds.apache.org/job/KIE/job/kogito/job/main/job/nightly.native/job/kogito-apps.build-and-test/

Output of uname -a or ver

No response

Output of java -version

No response

GraalVM version (if different from Java)

No response

Kogito version or git rev (or at least Quarkus version if you are using Kogito via Quarkus platform BOM)

No response

Build tool (ie. output of mvnw --version or gradlew --version)

No response

Additional information

No response

@domhanak domhanak changed the title [apache-ci] [apache-ci] nightly native build-and-test jobs for kogito-apps fails on data-audit-quarkus-jpa-service Feb 19, 2024
@domhanak domhanak changed the title [apache-ci] nightly native build-and-test jobs for kogito-apps fails on data-audit-quarkus-jpa-service [apache-ci] nightly native build-and-test job for kogito-apps fails on data-audit-quarkus-jpa-service Feb 19, 2024
@jstastny-cz
Copy link
Contributor

@domhanak did you try reproduce locally? From this report it seems it's CI-relevant issue, since you're providing Jenkins job as a reproducer, but I believe this is reproducible also on local - in which case it would be great to put mvn command there.

@domhanak
Copy link
Author

domhanak commented Feb 19, 2024

@jstastny-cz yes I did, other than it freezing my laptop couple of times it worked, but didn't investigate.
Not sure if the job was ever working after the migration, last success it looks to be Dec 6th.
Needs to be taken care off, started this issue to track it.

There are some intermittent test failures that could be caused by environment issues.

7 test failures that repeat for past 10 builds ( Feb 16 for example.) however could be related to some changes.

The problem already mentioned above, again, could be misconfiguration or real issue.

I did a review of some available builds logs and logging it here:
https://gist.github.com/domhanak/c99d8ec98e84dfee0f1d72053b208347

@jstastny-cz
Copy link
Contributor

2 different things IMHO, there are unstable tests. But the native compilation failure is fairly new after quarkus 3 upgrade.

Can you confirm that the native compilation error does not occur on your local? Also please note down mvn command used, it might be the difference to the CI environment.

@domhanak
Copy link
Author

domhanak commented Feb 20, 2024

I used the same command that is used on CI, only difference is settings.xml.

[command] /usr/local/bin/mvn dependency:tree clean install -Dquickly -s /home/jenkins/workspace/KIE/kogito/main/nightly.native/kogito-apps.build-and-test@tmp/config13028030101733848056tmp -Dmaven.wagon.http.ssl.insecure=true -Dmaven.test.failure.ignore=true -nsu -ntp -fae -e -Dhttp.keepAlive=false -Dmaven.wagon.http.pool=false -Dmaven.wagon.httpconnectionManager.ttlSeconds=120 -Dmaven.wagon.http.retryHandler.count=3 -Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn -B

Focusing the issue on the one failure of building the module on CI, Once that is fixed, tests are next.

Currently, latest build fails because of different reason. Most probably this PR - apache/incubator-kie-kogito-runtimes#3401 But just a guess.

Will check back later

@pefernan
Copy link
Contributor

@domhanak yea, that PR affected the nightlies. It should be fixed as soon as we have new artifacts published..
If we cannot have them soon I'll send a PR to just adjust the extensions to use the old id's..

@jstastny-cz
Copy link
Contributor

I used the same command that is used on CI, only difference is settings.xml.

[command] /usr/local/bin/mvn dependency:tree clean install -Dquickly -s /home/jenkins/workspace/KIE/kogito/main/nightly.native/kogito-apps.build-and-test@tmp/config13028030101733848056tmp -Dmaven.wagon.http.ssl.insecure=true -Dmaven.test.failure.ignore=true -nsu -ntp -fae -e -Dhttp.keepAlive=false -Dmaven.wagon.http.pool=false -Dmaven.wagon.httpconnectionManager.ttlSeconds=120 -Dmaven.wagon.http.retryHandler.count=3 -Dorg.slf4j.simpleLogger.log.org.apache.maven.cli.transfer.Slf4jMavenTransferListener=warn -B

Focusing the issue on the one failure of building the module on CI, Once that is fixed, tests are next.

Currently, latest build fails because of different reason. Most probably this PR - apache/incubator-kie-kogito-runtimes#3401 But just a guess.

Will check back later

That's a different command, there's no native in there. This is probably a preceding mvn build of drools, runtimes, or so, but the actual kogito-apps native build should be later on.

@domhanak
Copy link
Author

domhanak commented Feb 20, 2024

@domhanak yea, that PR affected the nightlies. It should be fixed as soon as we have new artifacts published.. If we cannot have them soon I'll send a PR to just adjust the extensions to use the old id's..

No worries, handle it as planned on your issue.

@jstastny-cz yea, so it passed because I forgot to check the command I run locally the second time. Issues between keyboard and chair solved ✔️ and the build failure is reproducible locally.

I am trying with following properties as per - quarkusio/quarkus#26835

-Dquarkus.native.additional-build-args="--initialize-at-run-time=org.apache.kafka.common.security.authenticator.SaslClientAuthenticator"

and get back with results

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 📋 Backlog
Development

No branches or pull requests

3 participants