Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NativeImageTest fails because the LauncherUtil is unable to determine the status of the running process #17767

Closed
abrenn opened this issue Jun 8, 2021 · 84 comments
Labels
area/testing kind/bug Something isn't working
Milestone

Comments

@abrenn
Copy link

abrenn commented Jun 8, 2021

Describe the bug

After upgrading from Quarkus 1.12 to 1.13 not even a native image smoke test works anymore.
When Using the @NativeImageTest the test failes before the first test step is executed.
The exceptions says:

java.lang.IllegalStateException: Unable to determine the status of the running process. See the above logs for details
        at io.quarkus.test.common.LauncherUtil.waitForCapturedListeningData(LauncherUtil.java:73)

This is happening (according to LauncherUtil) when the Quarkus Application is not reporting its address within 10 seconds.
According to the logs the application started and I can see the address and port like that:

0.0.1 native (powered by Quarkus 1.13.6.Final) started in 0.029s. Listening on: http://0.0.0.0:43529"

Expected behavior

The tests is starting.

Actual behavior

The test is not starting but failing with the described exception.

Output of uname -a or ver

Linux  5.12.2-1-MANJARO #1 SMP PREEMPT Fri May 7 17:53:15 UTC 2021 x86_64 GNU/Linux

Output of java -version

openjdk version "11.0.6" 2020-01-14
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.6+10)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.6+10, mixed mode)

Quarkus version or git rev

1.13.6

Build tool (ie. output of mvnw --version or gradlew --version)

gradle 7.0.2

@abrenn abrenn added the kind/bug Something isn't working label Jun 8, 2021
@gsmet gsmet changed the title NativeImageTest failes because the LauncherUtil is unable to determine the status of the running process NativeImageTest fails because the LauncherUtil is unable to determine the status of the running process Jun 9, 2021
@gsmet
Copy link
Member

gsmet commented Jun 9, 2021

Could you share a small reproducer? That would allow us to check if it's environment related.

@gsmet gsmet added triage/needs-reproducer We are waiting for a reproducer. area/testing and removed triage/needs-triage labels Jun 9, 2021
@quarkus-bot
Copy link

quarkus-bot bot commented Jun 9, 2021

/cc @geoand

@geoand
Copy link
Contributor

geoand commented Jun 9, 2021

Can you also try with 2.0.0.CR3?
We have fixed a few issues around this

@abrenn
Copy link
Author

abrenn commented Jun 9, 2021

@geoand I will try that with our current project but I think that an update to 2.0.0 will break a couple of other things.
If that is the case I will try to create a small reproducer.

@abrenn
Copy link
Author

abrenn commented Jun 22, 2021

@gsmet Sorry, that it took a while, but here we go:

https://github.com/GAlexMES/quarkus-native-testing

You should see the error when you run ./gradlew clean nativeTest in the root dir.

I figured out, that the problem has something to do with org.slf4j:slf4j-api:2.0.0-alpha1 which we forcefully use.
When you remove this resolution strategie or switch to a different provider in the application.yml it works.

However: It worked in this configuration before I updated to 1.13.7. I will try to investigate if we can work around that specific slf4j version in the next days. Nevertheless I think Quarkus should be compatible to newer slf4j versions.

@geoand I was not able to update Quarkus to 2.0.0. in this service because there are too many broken dependencies. That seems like a topic for another day ;)

@geoand
Copy link
Contributor

geoand commented Jun 22, 2021

Undertstood

@abrenn
Copy link
Author

abrenn commented Jun 23, 2021

@gsmet
To give you guys more context:
We need native building and we need to configure the application log style using logback XMLs due to company restrictions.

Therefor we added the set the logging provider to slf4j -Dorg.jboss.logging.provider=slf4j.
Sadly that was not working out of the box because quarkus was not picking up the logback.xml files for application logging.

Thats why we switched to the new slf4j version which also requires the new logback version. That construct then worked for quite some time until the 1.13 quarkus update.

I hope that helps to figure it out. If you need anything else let me now :)

@geoand
Copy link
Contributor

geoand commented Jun 23, 2021

Unfortunately unless have an actual test application, it will be very hard to figure out the problem

@abrenn
Copy link
Author

abrenn commented Jun 23, 2021

@geoand I added a link to the github where I added the application where the native test is failing yesterday.

What exactly do you need that is not in that repository?

@geoand
Copy link
Contributor

geoand commented Jun 23, 2021

Ah, I hadn't seen that.
I'll need to check it out

@geoand
Copy link
Contributor

geoand commented Jun 23, 2021

So it seems like your application doesn't write to the logs until it's closed.
You can see that when running the native image standalone.

Now given that what you are doing with logging is non standard, I don't really know how you could get around the problem

@abrenn
Copy link
Author

abrenn commented Jun 23, 2021

I don't see why using a different logging provided is "non standard".
JBoss Logging offers this option and it should therefore be supported by a fromework using JBoss, shouldn't it?
However: you are right, there is no logging, because there is no logging provider.

I added one to the Repo and added a LOG.info() to the Main. This is printed into the console. The quarkus application log is still missing and the native test is still failling.

@geoand
Copy link
Contributor

geoand commented Jun 23, 2021

It seems that the native application does not dump the logs to the file, whether you log manually or not (I tried this by removing the logging code in your main method).

Other than that, @NativeImageTest does not work with applications that don't listen on an http port.

What are you trying to test exactly?

@abrenn
Copy link
Author

abrenn commented Jun 24, 2021

@geoand In this example I am not testing anything, just reproducing the issue.

In the real application I am doing integration testing.
Read Data from Kafka, store to database, get it from an http endpoint. You can see the exception and that the application is listening on an http port in my first initial post.

All of that worked in 1.12.x and also works in 1.13.x. Just the tests are failing.

@abrenn
Copy link
Author

abrenn commented Jul 1, 2021

@geoand any news on this topic?

@geoand
Copy link
Contributor

geoand commented Jul 1, 2021

Not really no. I don't know what's going an the application doesn't dump the logs to the file

@abrenn
Copy link
Author

abrenn commented Jul 1, 2021

Do you need any more support or are you just dropping it?

@geoand
Copy link
Contributor

geoand commented Jul 1, 2021

I personally I am not looking into it (as the application does not use standard logging - other applications I tried worked just fine), others are obiously free to check

@simon-hlx
Copy link

Any update about it, i have the same issue :/

@geoand
Copy link
Contributor

geoand commented Jul 16, 2021

I would encourage folks to test 2.1.0.CR1 which has some related fixes

@simon-hlx
Copy link

simon-hlx commented Jul 16, 2021

I take simpe starter on quarkus site
i put the version to 2.1.0.CR1
<?xml version="1.0"?> <project xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd" xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> <modelVersion>4.0.0</modelVersion> <groupId>org.acme</groupId> <artifactId>code-with-quarkus</artifactId> <version>1.0.0-SNAPSHOT</version> <properties> <compiler-plugin.version>3.8.1</compiler-plugin.version> <maven.compiler.parameters>true</maven.compiler.parameters> <maven.compiler.source>11</maven.compiler.source> <maven.compiler.target>11</maven.compiler.target> <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding> <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding> <quarkus.platform.artifact-id>quarkus-universe-bom</quarkus.platform.artifact-id> <quarkus.platform.group-id>io.quarkus</quarkus.platform.group-id> <quarkus.platform.version>2.1.0.CR1</quarkus.platform.version> <surefire-plugin.version>3.0.0-M5</surefire-plugin.version> </properties> <dependencyManagement> <dependencies> <dependency> <groupId>${quarkus.platform.group-id}</groupId> <artifactId>${quarkus.platform.artifact-id}</artifactId> <version>${quarkus.platform.version}</version> <type>pom</type> <scope>import</scope> </dependency> </dependencies> </dependencyManagement> <dependencies> <dependency> <groupId>io.quarkus</groupId> <artifactId>quarkus-resteasy</artifactId> </dependency> <dependency> <groupId>io.quarkus</groupId> <artifactId>quarkus-arc</artifactId> </dependency> <dependency> <groupId>io.quarkus</groupId> <artifactId>quarkus-junit5</artifactId> <scope>test</scope> </dependency> <dependency> <groupId>io.rest-assured</groupId> <artifactId>rest-assured</artifactId> <scope>test</scope> </dependency> </dependencies> <build> <plugins> <plugin> <groupId>${quarkus.platform.group-id}</groupId> <artifactId>quarkus-maven-plugin</artifactId> <version>${quarkus.platform.version}</version> <extensions>true</extensions> <executions> <execution> <goals> <goal>build</goal> <goal>generate-code</goal> <goal>generate-code-tests</goal> </goals> </execution> </executions> </plugin> <plugin> <artifactId>maven-compiler-plugin</artifactId> <version>${compiler-plugin.version}</version> <configuration> <parameters>${maven.compiler.parameters}</parameters> </configuration> </plugin> <plugin> <artifactId>maven-surefire-plugin</artifactId> <version>${surefire-plugin.version}</version> <configuration> <systemPropertyVariables> <java.util.logging.manager>org.jboss.logmanager.LogManager</java.util.logging.manager> <maven.home>${maven.home}</maven.home> </systemPropertyVariables> </configuration> </plugin> </plugins> </build> <profiles> <profile> <id>native</id> <activation> <property> <name>native</name> </property> </activation> <build> <plugins> <plugin> <artifactId>maven-failsafe-plugin</artifactId> <version>${surefire-plugin.version}</version> <executions> <execution> <goals> <goal>integration-test</goal> <goal>verify</goal> </goals> <configuration> <systemPropertyVariables> <native.image.path>${project.build.directory}/${project.build.finalName}-runner</native.image.path> <java.util.logging.manager>org.jboss.logmanager.LogManager</java.util.logging.manager> <maven.home>${maven.home}</maven.home> </systemPropertyVariables> </configuration> </execution> </executions> </plugin> </plugins> </build> <properties> <quarkus.package.type>native</quarkus.package.type> </properties> </profile> </profiles> </project>
my graalvm_home and java_home is set to this version
/Library/Java/JavaVirtualMachines/graalvm-ce-java11-21.0.0.2/Contents/Home/
gu --version
GraalVM Updater 21.0.0.2
java --version
openjdk 11.0.9.1 2020-11-04
OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.9.1+1)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.9.1+1, mixed mode)

MacOs BigSur version 11.4

Logs after /mvnw verify -Pnative

`
.......
[INFO] [io.quarkus.deployment.pkg.steps.NativeImageBuildRunner] objcopy --strip-debug code-with-quarkus-1.0.0-SNAPSHOT-runner
[INFO] [io.quarkus.deployment.QuarkusAugmentor] Quarkus augmentation completed in 55269ms
[INFO]
[INFO] --- maven-failsafe-plugin:3.0.0-M5:integration-test (default) @ code-with-quarkus ---
[INFO]
[INFO] -------------------------------------------------------
[INFO] T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.acme.NativeGreetingResourceIT
Executing [/Users/simon.huleux/Documents/projects/code-with-quarkus/target/code-with-quarkus-1.0.0-SNAPSHOT-runner, -Dquarkus.http.port=8081, -Dquarkus.http.ssl-port=8444, -Dtest.url=http://localhost:8081, -Dquarkus.log.file.path=/Users/simon.huleux/Documents/projects/code-with-quarkus/target/quarkus.log, -Dquarkus.log.file.enable=true]
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 63.415 s <<< FAILURE! - in org.acme.NativeGreetingResourceIT
[ERROR] org.acme.NativeGreetingResourceIT.testHelloEndpoint Time elapsed: 0.009 s <<< ERROR!
java.lang.IllegalStateException: Unable to determine the status of the running process. See the above logs for details

[INFO]
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] NativeGreetingResourceIT.testHelloEndpoint » IllegalState Unable to determine ...
[INFO]
[ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0
[INFO]
[INFO]
[INFO] --- maven-failsafe-plugin:3.0.0-M5:verify (default) @ code-with-quarkus ---
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 02:07 min
[INFO] Finished at: 2021-07-16T08:58:43-04:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-failsafe-plugin:3.0.0-M5:verify (default) on project code-with-quarkus: There are test failures.
[ERROR]
[ERROR] Please refer to /Users/simon.huleux/Documents/projects/code-with-quarkus/target/failsafe-reports for the individual test results.
[ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream.
[ERROR] -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
`

@geoand
Copy link
Contributor

geoand commented Jul 16, 2021

What does /Users/simon.huleux/Documents/projects/code-with-quarkus/target/quarkus.log look like when the test fails?

@simon-hlx
Copy link

There no here , what i see in
NativeImageLauncher.java line 110
it delete this file and create the parent directory if needed
Files.deleteIfExists(logFile);
Files.createDirectories(logFile.getParent());
But at the end (after the test fail) there no log file exist.

@geoand
Copy link
Contributor

geoand commented Jul 16, 2021

Are you using any peculiar logging settings?

@geoand
Copy link
Contributor

geoand commented Sep 13, 2021

This is my failsafe configuration. The tests fail around 20seconds into the plugin executing. Regardless of my settings.

With the latest quarkus version?
Can you please also post any logs you have?

@bvarner
Copy link
Contributor

bvarner commented Sep 13, 2021

Quarkus 2.2.1.Final

The only logs I get are what's listed in: #17767 (comment)
Which I am reposting here:

Executing [java, -javaagent:/app/.m2/repository/org/jacoco/org.jacoco.agent/0.8.7/org.jacoco.agent-0.8.7-runtime.jar=destfile=/app/target/jacoco-quarkus.exec,append=true,exclclassloader=*QuarkusClassLoader, -Dquarkus.http.port=8081, -Dquarkus.http.ssl-port=8444, -Dtest.url=http://localhost:8081, -Dquarkus.log.file.path=/app/target/quarkus.log, -Dquarkus.log.file.enable=true, -Dquarkus.profile=it, -jar, /app/target/quarkus-app/quarkus-run.jar]
Picked up JAVA_TOOL_OPTIONS: -XX:+UseContainerSupport -Xmx671m -XX:CICompilerCount=2 -Dfile.encoding=UTF-8 
Log file '/app/target/quarkus.log' was not created.
[ERROR] Tests run: 8, Failures: 0, Errors: 1, Skipped: 7, Time elapsed: 14.484 s <<< FAILURE! - in com.foo.bar.ThingResourceIT
[ERROR] com.foo.bar.ThingResourceIT.getEndpointNotFound  Time elapsed: 0.025 s  <<< ERROR!
java.lang.RuntimeException: java.lang.IllegalStateException: Unable to determine the status of the running process. See the above logs for details
Caused by: java.lang.IllegalStateException: Unable to determine the status of the running process. See the above logs for details

@geoand
Copy link
Contributor

geoand commented Sep 13, 2021

Thanks.

And you said it always fails after 20 seconds?

@bvarner
Copy link
Contributor

bvarner commented Sep 13, 2021

It seems to be consistently around that timeframe. The log reports elapsed times that vary from what I see on a stop-watch when I try to time it.

In real life it seems like it's closer to 25-30s after failsafe first starts logging that it's running, but I've not been super scientific about it. Really it's pretty bothersome to reproduce since I have to throttle the tar out of my system, or use an oversubscribed cloud-based build.

@bvarner
Copy link
Contributor

bvarner commented Sep 13, 2021

Also confirmed on 2.2.2.Final.

@geoand
Copy link
Contributor

geoand commented Sep 13, 2021

Understood, thanks

@geoand
Copy link
Contributor

geoand commented Sep 14, 2021

One last think (hopefully), when you run:

java -javaagent:/app/.m2/repository/org/jacoco/org.jacoco.agent/0.8.7/org.jacoco.agent-0.8.7-runtime.jar=destfile=/app/target/jacoco-quarkus.exec,append=true,exclclassloader=*QuarkusClassLoader -Dquarkus.http.port=8081 -Dquarkus.http.ssl-port=8444 -Dtest.url=http://localhost:8081 -Dquarkus.log.file.path=/app/target/quarkus.log -Dquarkus.log.file.enable=true -Dquarkus.profile=it -jar, /app/target/quarkus-app/quarkus-run.jar

manually, how does the application behave?

@bvarner
Copy link
Contributor

bvarner commented Sep 14, 2021

When my system is throttled, it takes a while to start, but it starts and runs just fine.

@geoand
Copy link
Contributor

geoand commented Sep 14, 2021

#20129 should likely take care of it

@bvarner
Copy link
Contributor

bvarner commented Sep 14, 2021

I'm building / testing now from main.

Any chance this can end up in a maintenance release on 2.2 soon?

@geoand
Copy link
Contributor

geoand commented Sep 14, 2021

This fix will likely be part of 2.2.4, hopefully to come out next week.

@bvarner
Copy link
Contributor

bvarner commented Sep 14, 2021

I've locally branched off the upstream 2.2, cherrypicked in e1bb4f2, and I'm still seeing the same behavior.
Is there some debug log setting I can use to make things a bit more verbose without having to pour through ./mvnw -X output?

@bvarner
Copy link
Contributor

bvarner commented Sep 14, 2021

One last think (hopefully), when you run:

java -javaagent:/app/.m2/repository/org/jacoco/org.jacoco.agent/0.8.7/org.jacoco.agent-0.8.7-runtime.jar=destfile=/app/target/jacoco-quarkus.exec,append=true,exclclassloader=*QuarkusClassLoader -Dquarkus.http.port=8081 -Dquarkus.http.ssl-port=8444 -Dtest.url=http://localhost:8081 -Dquarkus.log.file.path=/app/target/quarkus.log -Dquarkus.log.file.enable=true -Dquarkus.profile=it -jar, /app/target/quarkus-app/quarkus-run.jar

manually, how does the application behave?

I get exceptions that the JVM failed to be created -- because the paths (absolute with /app as the prefix) don't match my local system. If I adjust those to be absolute and actual resolve to my system, the jar will launch as expected.

Again, if the system isn't constrained, things work as expected, too. But under load (sysload > ~3) things go sideways.

@geoand
Copy link
Contributor

geoand commented Sep 15, 2021

I've locally branched off the upstream 2.2, cherrypicked in e1bb4f2, and I'm still seeing the same behavior.
Is there some debug log setting I can use to make things a bit more verbose without having to pour through ./mvnw -X output?

You can set the Quarkus logging to DEBUG in your application.properties and run the test with that

@bvarner
Copy link
Contributor

bvarner commented Sep 15, 2021

Setting Quarkus logging to DEBUG didn't change the outcome.
I'm still getting...

Executing [java, -javaagent:/.../.m2/repository/org/jacoco/org.jacoco.agent/0.8.7/org.jacoco.agent-0.8.7-runtime.jar=destfile=/.../target/jacoco-quarkus.exec,append=true,exclclassloader=*QuarkusClassLoader, -Dquarkus.http.port=8081, -Dquarkus.http.ssl-port=8444, -Dtest.url=http://localhost:8081, -Dquarkus.log.file.path=/.../target/quarkus.log, -Dquarkus.log.file.enable=true, -Dquarkus.profile=it, -jar, /.../target/quarkus-app/quarkus-run.jar]
Log file '/home/bvarner/Documents/work/AVT/spcd-rest-api/target/quarkus.log' was not created.

Shouldn't -Dquarkus.test.wait-time=120 be included in the execution argument if my pom.xml is specifying:

 <plugin>
                <artifactId>maven-failsafe-plugin</artifactId>
                <version>${surefire-plugin.version}</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>integration-test</goal>
                            <goal>verify</goal>
                        </goals>
                        <configuration>
                            <forkedProcessTimeoutInSeconds>180</forkedProcessTimeoutInSeconds>
                            <forkedProcessExitTimeoutInSeconds>60</forkedProcessExitTimeoutInSeconds>
                            <skip>${skipTests}</skip>
                            <systemPropertyVariables>
                                <java.util.logging.manager>org.jboss.logmanager.LogManager</java.util.logging.manager>
                                <maven.home>${maven.home}</maven.home>
                                <quarkus.test.wait-time>120</quarkus.test.wait-time>
                                <quarkus.test.native-image-profile>it</quarkus.test.native-image-profile>
                                <quarkus.test.arg-line>${argLine}</quarkus.test.arg-line>
                            </systemPropertyVariables>
                        </configuration>
                    </execution>
                </executions>
            </plugin>

@geoand
Copy link
Contributor

geoand commented Sep 15, 2021

#20101 fixed the issue of quarkus.test.wait-time not being taken into account.
Can you try using the legacy quarkus.test.jar-wait-time in the meantime?

@bvarner
Copy link
Contributor

bvarner commented Sep 15, 2021

I do not see wait-time arguments when running the integration tests using either or both of those properties.
I'm using a local build of quarkus 999-SNAPSHOT off the main branches of quarkus-platform and quarkus.

@geoand
Copy link
Contributor

geoand commented Sep 15, 2021

I don't think you are running from main, because the Executing... format is a little different in main

@bvarner
Copy link
Contributor

bvarner commented Sep 15, 2021

My local quarkus platform build was using 2.2.1.Final...

@bvarner
Copy link
Contributor

bvarner commented Sep 15, 2021

Ok, finally getting everything to use 999-SNAPSHOT, and I can confirm this solves the issue.

@geoand
Copy link
Contributor

geoand commented Sep 15, 2021

Awesome 👍.

Thanks for checking!

@geoand geoand closed this as completed Sep 15, 2021
@geoand geoand added this to the 2.3 - main milestone Sep 15, 2021
@famod famod removed the triage/needs-reproducer We are waiting for a reproducer. label Oct 4, 2021
gsmet pushed a commit to gsmet/quarkus that referenced this issue Dec 1, 2021
This was already done for native launcher, but it was
mistakenly left out for these launch modes.

Relates to quarkusio#17767

(cherry picked from commit bdfb7bf)
gsmet pushed a commit to gsmet/quarkus that referenced this issue Dec 1, 2021
gsmet pushed a commit to gsmet/quarkus that referenced this issue Dec 1, 2021
This was already done for native launcher, but it was
mistakenly left out for these launch modes.

Relates to quarkusio#17767

(cherry picked from commit bdfb7bf)
gsmet pushed a commit to gsmet/quarkus that referenced this issue Dec 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/testing kind/bug Something isn't working
Projects
None yet
Development

No branches or pull requests

6 participants