Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SwaggerUiProcessor leaves behind non-empty temporary directories in dev mode #12577

Closed
famod opened this issue Oct 7, 2020 · 22 comments · Fixed by #12652
Closed

SwaggerUiProcessor leaves behind non-empty temporary directories in dev mode #12577

famod opened this issue Oct 7, 2020 · 22 comments · Fixed by #12652

Comments

@famod
Copy link
Member

famod commented Oct 7, 2020

Describe the bug
If swagger-ui extension is active, each app start creates a quarkus-swagger-ui_* folder in /tmp, e.g. quarkus-swagger-ui_31789176944326801422565934992530930, containing all the webjar files:

-rw-r--r--. 1 jenkins jenkins     665 Oct  6 19:00 favicon-16x16.png
-rw-r--r--. 1 jenkins jenkins     628 Oct  6 19:00 favicon-32x32.png
-rw-r--r--. 1 jenkins jenkins    1421 Oct  6 19:00 index.html
-rw-r--r--. 1 jenkins jenkins     639 Oct  6 19:00 index.html.gz
-rw-r--r--. 1 jenkins jenkins    2431 Oct  6 19:00 oauth2-redirect.html
-rw-r--r--. 1 jenkins jenkins     864 Oct  6 19:00 oauth2-redirect.html.gz
-rw-r--r--. 1 jenkins jenkins  986342 Oct  6 19:00 swagger-ui-bundle.js
-rw-r--r--. 1 jenkins jenkins  300057 Oct  6 19:00 swagger-ui-bundle.js.gz
-rw-r--r--. 1 jenkins jenkins 4309533 Oct  6 19:00 swagger-ui-bundle.js.map
-rw-r--r--. 1 jenkins jenkins  142933 Oct  6 19:00 swagger-ui.css
-rw-r--r--. 1 jenkins jenkins   22648 Oct  6 19:00 swagger-ui.css.gz
-rw-r--r--. 1 jenkins jenkins  281392 Oct  6 19:00 swagger-ui.css.map
-rw-r--r--. 1 jenkins jenkins  424281 Oct  6 19:00 swagger-ui-es-bundle-core.js
-rw-r--r--. 1 jenkins jenkins  118554 Oct  6 19:00 swagger-ui-es-bundle-core.js.gz
-rw-r--r--. 1 jenkins jenkins 1533198 Oct  6 19:00 swagger-ui-es-bundle-core.js.map
-rw-r--r--. 1 jenkins jenkins  985915 Oct  6 19:00 swagger-ui-es-bundle.js
-rw-r--r--. 1 jenkins jenkins  299885 Oct  6 19:00 swagger-ui-es-bundle.js.gz
-rw-r--r--. 1 jenkins jenkins 4308096 Oct  6 19:00 swagger-ui-es-bundle.js.map
-rw-r--r--. 1 jenkins jenkins  424485 Oct  6 19:00 swagger-ui.js
-rw-r--r--. 1 jenkins jenkins  118616 Oct  6 19:00 swagger-ui.js.gz
-rw-r--r--. 1 jenkins jenkins 1532710 Oct  6 19:00 swagger-ui.js.map
-rw-r--r--. 1 jenkins jenkins  311804 Oct  6 19:00 swagger-ui-standalone-preset.js
-rw-r--r--. 1 jenkins jenkins   97541 Oct  6 19:00 swagger-ui-standalone-preset.js.gz
-rw-r--r--. 1 jenkins jenkins 1384709 Oct  6 19:00 swagger-ui-standalone-preset.js.map

Such folders are not removed when the app is terminated (even not when terminated gracefully), filling up /tmp.

Expected behavior
Nothing should be left behind after the Quarkus app terminated.

Actual behavior
/tmp is "spammed" until all space is depleted.

To Reproduce

Steps to reproduce the behavior:

  1. git clone https://github.com/quarkusio/quarkus-quickstarts.git
  2. cd openapi-swaggerui-quickstart
  3. mvn clean package -DskipTests
  4. mvn quarkus:dev
  5. CTRL + C
  6. check /tmp

Configuration
Not relevant.

Environment (please complete the following information):

  • Output of uname -a or ver: MINGW64_NT-10.0-18363 XXX 3.0.7-338.x86_64 2019-11-21 23:07 UTC x86_64 Msys
  • Output of java -version: OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.6+10, mixed mode)
  • GraalVM version (if different from Java): n/a
  • Quarkus version or git rev: 1.8.0.Final
  • Build tool (ie. output of mvnw --version or gradlew --version): Apache Maven 3.6.3

Additional context

See also: https://stackoverflow.com/questions/15022219/does-files-createtempdirectory-remove-the-directory-after-jvm-exits-normally

These processors are also affected:

  • SmallRyeHealthProcessor
  • SmallRyeGraphQLProcessor

Other potentially affected, non-processor classes are mentioned here: https://quarkusio.zulipchat.com/#narrow/stream/187030-users/topic/Various.20temp.20dirs.20are.20not.20cleaned.20up.3F/near/212540339

I'm also wondering wether this:

private static final String TEMP_DIR_PREFIX = "quarkus-swagger-ui_" + System.nanoTime();

could be more something like "quarkus-swagger-ui_" + artifact.getVersion()?
This would require something like java.nio.channels.FileLock though.

@famod
Copy link
Member Author

famod commented Oct 7, 2020

/cc @phillip-kruger

@phillip-kruger phillip-kruger self-assigned this Oct 7, 2020
@phillip-kruger
Copy link
Member

Thanks @famod .

@gsmet
Copy link
Member

gsmet commented Oct 7, 2020

We probably need a shutdown hook to remove the temp directory.

Not sure it will fix the issue for ungraceful shutdown though.

@gsmet gsmet added this to the 1.9.0.Final milestone Oct 7, 2020
@gsmet
Copy link
Member

gsmet commented Oct 7, 2020

Let's try to fix this one for 1.9.0.Final. It's not pretty.

@phillip-kruger
Copy link
Member

@famod - still trying to recreate this, seems to work perfect on fedora:

tmp_folder

@famod
Copy link
Member Author

famod commented Oct 7, 2020

@phillip-kruger
Hm, maybe there is some auto-cleanup on Fedora? Which JDK version?
How did you record you console? Might do the same.

@phillip-kruger
Copy link
Member

phillip-kruger commented Oct 7, 2020

I used peek (https://github.com/phw/peek)
Java:

openjdk version "11" 2018-09-25
OpenJDK Runtime Environment 18.9 (build 11+28)
OpenJDK 64-Bit Server VM 18.9 (build 11+28, mixed mode)

I found a old Windows 7 DVD, busy installing on a vm, I want to see if I can recreate it there. That should rule out/in fedora auto cleanup

@famod
Copy link
Member Author

famod commented Oct 7, 2020

I'll try to run quarkus:dev manually on our CI server. But I guess I will be able to reproduce it there because that CentOS7 environment was the one where we spotted the problem in the first place. Only later I discovered that it also happens locally on Windows. Btw, today I deleted 4 Gigs of swagger-ui folders from my local machine. 😉

@phillip-kruger
Copy link
Member

Manual run on your CI server is a good idea. I am also download CentOS 7 to try it out there.

@famod
Copy link
Member Author

famod commented Oct 7, 2020

Manual run on your CI server is a good idea.

I just realized that this is more elaborate than I first thought because the SSH shell is not set up for mvn.

So instead I ran watch ls -ad /tmp/quarkus-* during a CI build and I could see folders come and go as long as tests were executed or the runner-jar was started, but when all the dev mode test stages were done, the following folders remained:

/tmp/quarkus-health-ui_32755998713381951105019794315279829
/tmp/quarkus-health-ui_3275600108889982581242371916449573
/tmp/quarkus-health-ui_32756251960598132840986251099457148
/tmp/quarkus-health-ui_32756254325723758270053270839474250
/tmp/quarkus-health-ui_32756520317287904532743181333763619
/tmp/quarkus-health-ui_327565296386480410729010182874422483
/tmp/quarkus-swagger-ui_327559982333431817622912455978250870
/tmp/quarkus-swagger-ui_327560004759834118333916830824179568
/tmp/quarkus-swagger-ui_32756250657866775751752355494437807
/tmp/quarkus-swagger-ui_327562540457445815062360320700734406
/tmp/quarkus-swagger-ui_327565192528549611009109708529106813
/tmp/quarkus-swagger-ui_32756529377101598288325524993058115

@famod
Copy link
Member Author

famod commented Oct 7, 2020

Btw, this is how our CI is terminating the dev mode instances which should be equivalent to CTRL+C:

[2020-10-07T19:52:14.972Z] + pkill -SIGINT -e -f some-generated-tag
[2020-10-07T19:52:14.972Z] mvn killed (pid 77199)
[2020-10-07T19:52:14.972Z] java killed (pid 77201)
[2020-10-07T19:52:14.972Z] java killed (pid 77353)

PS: some-generated-tag is a unique tag we pass to every process via -DprocessTag=some-generated-tag to kill the right process at the right time.

@famod
Copy link
Member Author

famod commented Oct 7, 2020

Ha! The plot thickens: I was able to start a dev mode instance manually on CentOS7 and it does not leave behind a folder after CTRL+C!
It even does not do so when I use kill -SIGINT (or pkill -SIGINT).

The difference I can see now is that with the above the usual shutdown message is printed:

INFO  [io.quarkus] (Quarkus Main Thread) foo stopped in 0.012s

I cannot find this message in our regular CI logs. This leads me to the assumption that we mistakenly kill those instances forcefully. I think this has something to do with mvn being killed first 🤔 (see log output in my previous comment).

So false alarm for Linux I would say (unless you see a chance to clean up for ungraceful shutdown)!

And guess what? The message is also missing on Windows, but for CTRL+C.
Hm, I hope it is not a shell specific problem... I'll have a colleague try it out tomorrow and I'll report back.

Thanks a lot for your support so far, @phillip-kruger!

@phillip-kruger
Copy link
Member

No worries @famod - happy to help, we need to get to the bottom of this. I am also still looking on my side to try and recreate this. So during your build you do a mvn quarkus:dev ? From another mvn process ?? Can you give me more details on how you run the dev mode test ? That might help in recreating it. Thanks :)

@phillip-kruger
Copy link
Member

@famod - I recreated it on Windows ! Let me see if I can fix that, and then we test that fix against your CI server.

@famod
Copy link
Member Author

famod commented Oct 8, 2020

@phillip-kruger

So during your build you do a mvn quarkus:dev ? From another mvn process ?? Can you give me more details on how you run the dev mode test ?

The unfiltered truth 😉 :

mvn -Dquarkus.http.port=0 -Dquarkus.http.ssl-port=0 -DprocessTag=some-generated-tag -Dstyle.color=always -Dchangelist=-build-00436 -f catalog/api/pro-fe -Pqdev -Ddebug=false -Dmaven.main.skip -Dmaven.resources.skip

I don't think any of those parameters are relevant for this specific problem.

The qdev profile looks as follows:

        <profile>
            <id>qdev</id>
            <properties>
                <enforcer.skip>true</enforcer.skip>
                <asciidoctor.skip>true</asciidoctor.skip>
                <skipTests>true</skipTests>
                <spotbugs.skip>true</spotbugs.skip>
                <quarkus.prepare.skip>true</quarkus.prepare.skip>
            </properties>
            <dependencies>
                <dependency>
                    <groupId>io.quarkus</groupId>
                    <artifactId>quarkus-jdbc-h2</artifactId>
                </dependency>
            </dependencies>
            <build>
                <defaultGoal>quarkus:dev</defaultGoal>
                <plugins>
                    <plugin>
                        <groupId>io.quarkus</groupId>
                        <artifactId>quarkus-maven-plugin</artifactId>
                        <configuration>
                            <!-- note: processTag is used on Jenkins but it does no harm locally (if set to some value) -->
                            <jvmArgs>-DprocessTag=${processTag}</jvmArgs>
                            <!-- we want to use devmode in modules that are not "built" as quarkus apps -->
                            <enforceBuildGoal>false</enforceBuildGoal>
                        </configuration>
                    </plugin>
                </plugins>
            </build>
        </profile>

so in the end this is just quarkus:dev with some optimizations and project specific adjustments.

@famod
Copy link
Member Author

famod commented Oct 8, 2020

Ok, so two colleagues double-checked this on their Windows 10 machines and for no setup dev mode was terminated gracefully:

  • IntelliJ console (basically cmd.exe)
  • ConEmu + Git-Bash
  • ConEmu + plain cmd.exe

@phillip-kruger
Copy link
Member

So the code use a shutdown hook to delete the folder:
https://docs.oracle.com/javase/8/docs/api/java/lang/Runtime.html#addShutdownHook-java.lang.Thread-

From the Javadoc:

In rare circumstances the virtual machine may abort, that is, stop running without shutting down cleanly. This occurs when the virtual machine is terminated externally, for example with the SIGKILL signal on Unix or the TerminateProcess call on Microsoft Windows. The virtual machine may also abort if a native method goes awry by, for example, corrupting internal data structures or attempting to access nonexistent memory. If the virtual machine aborts then no guarantee can be made about whether or not any shutdown hooks will be run.

I am still investigating other options

@famod
Copy link
Member Author

famod commented Oct 8, 2020

So the code use a shutdown hook

Ah, you're right. It's just not directly obvious since the hook is operating on cached.cachedDirectory, not directly on tempDir.

I am still investigating other options

Well, first and foremost it would be a good improvement if Quarkus dev mode would terminate gracefully upon CTRL+C on Windows.

@phillip-kruger
Copy link
Member

Well, first and foremost it would be a good improvement if Quarkus dev mode would terminate gracefully upon CTRL+C on Windows.

Agree. For me it's not consistent. Sometimes it actually do remove the directory. Weird.

@famod
Copy link
Member Author

famod commented Oct 8, 2020

Sometimes it actually do remove the directory. Weird.

Do you see the shutdown message in these cases?

@phillip-kruger
Copy link
Member

phillip-kruger commented Oct 8, 2020

Yes I do. - *Edit - No I don't. At least not anymore. Maybe I remember incorrectly. It does however works fine in Netbeans. And seems like When you do TASKKILL /PID %PID% it works. But not Ctrl-C

@phillip-kruger
Copy link
Member

@gsmet @famod I have a fix for this. Basically, as discussed, there is no need for a unique folder, as the groupId/arifactId/version is unique enough. I have tested this with multiple apps running and all works fine. see #12652
Let me know what you think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants