
Memory leak with Resteasy byte response? #20822

Closed
suarkf opened this issue Oct 18, 2021 · 11 comments · Fixed by #20907
Labels: area/resteasy-classic, kind/bug
Milestone: 2.5 - main

Comments

@suarkf

suarkf commented Oct 18, 2021

Describe the bug

I implemented a REST API that returns binary content.
After calling the service a number of times (how many depends on the response size), it runs out of memory and fails with an OOM exception: "org.jboss.resteasy.spi.UnhandledException: java.lang.OutOfMemoryError: Direct buffer memory".

Expected behavior

I expect the allocated direct buffer memory to be released at some point.

Actual behavior

The direct buffer memory is not released, so usage grows with each request.

Sending requests with curl (without consuming the response) is enough to make memory usage climb.

I investigated some heap dumps and it looks like Netty PoolChunk instances keep the DirectByteBuffer objects from being collected.
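
For illustration, here is a rough sketch of how one could watch direct-buffer usage from inside the same JVM via the standard java.lang.management.BufferPoolMXBean. The helper class is hypothetical and not part of the reproducer; it only reads the JDK's built-in "direct" pool statistics.

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: start() launches a daemon thread that logs the JDK's
// "direct" buffer pool statistics once per second.
public final class DirectBufferWatcher {

    public static void start() {
        Thread t = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
                    if ("direct".equals(pool.getName())) {
                        System.out.printf("direct buffers: count=%d used=%d bytes capacity=%d bytes%n",
                                pool.getCount(), pool.getMemoryUsed(), pool.getTotalCapacity());
                    }
                }
                try {
                    Thread.sleep(1000L);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "direct-buffer-watcher");
        t.setDaemon(true);
        t.start();
    }
}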

How to Reproduce?

I reproduced the behavior with some small changes to the getting-started project's GreetingResource. It now produces "image/png" instead of "application/json": it loads a file and returns it as a byte array. GreetingResource.java now looks like this:

package org.acme.getting.started;

import javax.ws.rs.GET;
import javax.ws.rs.Path;
import javax.ws.rs.Produces;
import javax.ws.rs.core.Response;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

@Path("/hello")
public class GreetingResource {

    @GET()
    @Produces({"image/png", "application/json"})
    public Response hello() {
        try {
            // The big.png file is a 30MB binary file.
            byte[] fileContent = Files.readAllBytes(Paths.get(".", "target", "classes", "files", "big.png"));
            return Response.ok(fileContent).build();
        } catch (IOException e) {
            e.printStackTrace();
        }
        return Response.noContent().build();
    }
}

After starting Quarkus dev mode on my local machine, I can send approximately 300 curl requests (fire-and-forget, without reading the response) before the service runs out of memory:

curl --request GET "http://localhost:8080/hello" -w 'http_status: %{http_code}\n' -s

Output of uname -a or ver

No response

Output of java -version

11.0.7

GraalVM version (if different from Java)

No response

Quarkus version or git rev

2.3.0

Build tool (ie. output of mvnw --version or gradlew --version)

Apache Maven 3.8.1

Additional information

Quarkus-getting-started.zip

suarkf added the kind/bug label on Oct 18, 2021
@geoand
Contributor

geoand commented Oct 18, 2021

Any chance you can attach a ready-made project that reproduces this behavior?

@suarkf
Author

suarkf commented Oct 18, 2021

Sure, I added the zipped Maven project to the original post.

@geoand
Contributor

geoand commented Oct 18, 2021

I assume that the OOM happens when the service is under load, correct?

@geoand
Contributor

geoand commented Oct 19, 2021

From what I can see, this is not due to a memory leak, but due to the file being large and too much memory being allocated when the concurrency is high enough.

I also tried the same example using quarkus-resteasy-reactive and things worked properly, so my suggestion is to move to it.

geoand closed this as completed on Oct 19, 2021
geoand added the triage/wontfix label on Oct 19, 2021
@suarkf
Author

suarkf commented Oct 19, 2021

I can reproduce the same thing with smaller responses and no load on the system.

So do you suggest that I try using quarkus-resteasy-reactive instead?

@geoand
Contributor

geoand commented Oct 19, 2021

> I can reproduce the same thing with smaller responses and no load on the system.

I was not able to reproduce this without load.

> So do you suggest that I try using quarkus-resteasy-reactive instead?

Yes, please do and let us know how it goes.

@hacst
Contributor

hacst commented Oct 19, 2021

I think the "-w" part is important because it makes curl fail to read the response. With that I can reproduce the leak.

For the exact query posted (adding -v), I see the following curl output:

< Content-Type: image/png
<
* Failed writing body (0 != 8191)
* Failed writing data
* Closing connection 0

instead of the output of a normal curl invocation that succeeds:

< HTTP/1.1 200 OK
< transfer-encoding: chunked
< Content-Type: image/png
<
{ [8199 bytes data]

And then, on the Quarkus side, the log for the first case shows:

2021-10-19 16:32:20,920 DEBUG [io.qua.resteasy] (vert.x-eventloop-thread-6) IO Exception : io.vertx.core.VertxException: Connection was closed

2021-10-19 16:32:20,967 DEBUG [org.jbo.res.res.i18n] (executor-thread-0) RESTEASY003770: Response is committed, can't handle exception
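
For what it's worth, the same early abort can probably be reproduced without curl. Here is a rough, untested sketch of a Java client (hypothetical, assuming the /hello endpoint above is running on localhost:8080) that reads only one chunk of the response and then drops the connection, so the server sees the connection close mid-write:

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical client: reads a single chunk of the response body and then
// disconnects, leaving the rest of the response unconsumed.
public class AbortingClient {

    public static void main(String[] args) throws Exception {
        for (int i = 1; i <= 300; i++) {
            HttpURLConnection conn =
                    (HttpURLConnection) new URL("http://localhost:8080/hello").openConnection();
            try (InputStream in = conn.getInputStream()) {
                in.read(new byte[8192]); // read one chunk, abandon the rest
            }
            conn.disconnect(); // force the underlying socket to close
            System.out.println("aborted request " + i);
        }
    }
}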

@geoand
Contributor

geoand commented Oct 20, 2021

I still could not reproduce the problem.

@suarkf
Author

suarkf commented Oct 20, 2021

Did you also use curl to send the requests? And how many did you send?

@hacst
Contributor

hacst commented Oct 20, 2021

Strange. I tried the reproducer on Windows with Corretto 11 as well as Microsoft OpenJDK 17 runtimes. Curl was curl 7.55.1 (Windows) libcurl/7.55.1 WinSSL. I also tried it on Ubuntu 20.04 (WSL) with OpenJDK 14.0.1 and curl 7.68.0 (x86_64-pc-linux-gnu) libcurl/7.68.0. All had the same behavior.

In my current attempt on Windows with Corretto 11, I started it using

corretto-11.0.7\bin\java -jar target/quarkus-app/quarkus-run.jar

and after 274 consecutive requests with curl from PowerShell

for($num =1;$num -le 1000;$num++) { echo $num; curl -v --request GET http://localhost:8080/hello -w 'http_status: %{http_code}\n' -s }

I got the messages shown above, and memory use climbed steadily to about 10 GB before I started to get the OOM exception:

2021-10-20 09:59:03,529 ERROR [io.qua.ver.htt.run.QuarkusErrorHandler] (executor-thread-0) HTTP Request to /hello failed, error id: 524739ca-2724-4347-8d8d-c0e5627dea53-1: java.lang.OutOfMemoryError: Direct buffer memory
        at java.base/java.nio.Bits.reserveMemory(Bits.java:175)
        at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:118)
        at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:317)
        at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:632)
        at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:607)
        at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:202)
        at io.netty.buffer.PoolArena.tcacheAllocateSmall(PoolArena.java:172)
        at io.netty.buffer.PoolArena.allocate(PoolArena.java:134)
        at io.netty.buffer.PoolArena.allocate(PoolArena.java:126)
        at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:395)
        at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:188)
        at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:179)
        at io.quarkus.resteasy.runtime.standalone.ResteasyStandaloneRecorder$ResteasyVertxAllocator.allocateBuffer(ResteasyStandaloneRecorder.java:90)
        at io.quarkus.resteasy.runtime.standalone.ResteasyStandaloneRecorder$ResteasyVertxAllocator.allocateBuffer(ResteasyStandaloneRecorder.java:79)
        at io.quarkus.resteasy.runtime.standalone.ResteasyStandaloneRecorder$ResteasyVertxAllocator.allocateBuffer(ResteasyStandaloneRecorder.java:74)
        at io.quarkus.resteasy.runtime.standalone.VertxOutputStream.asyncWrite(VertxOutputStream.java:175)
        at org.jboss.resteasy.util.CommitHeaderAsyncOutputStream.asyncWrite(CommitHeaderAsyncOutputStream.java:94)
        at org.jboss.resteasy.spi.AsyncOutputStream.asyncWrite(AsyncOutputStream.java:23)
        at org.jboss.resteasy.plugins.providers.ByteArrayProvider.asyncWriteTo(ByteArrayProvider.java:66)
        at org.jboss.resteasy.plugins.providers.ByteArrayProvider.asyncWriteTo(ByteArrayProvider.java:27)
        at org.jboss.resteasy.core.interception.jaxrs.ServerWriterInterceptorContext.writeTo(ServerWriterInterceptorContext.java:87)
        at org.jboss.resteasy.core.interception.jaxrs.AbstractWriterInterceptorContext.asyncProceed(AbstractWriterInterceptorContext.java:203)
        at org.jboss.resteasy.core.interception.jaxrs.AbstractWriterInterceptorContext.getStarted(AbstractWriterInterceptorContext.java:166)
        at org.jboss.resteasy.core.interception.jaxrs.ServerWriterInterceptorContext.lambda$getStarted$0(ServerWriterInterceptorContext.java:73)
        at org.jboss.resteasy.core.interception.jaxrs.ServerWriterInterceptorContext.aroundWriteTo(ServerWriterInterceptorContext.java:93)
        at org.jboss.resteasy.core.interception.jaxrs.ServerWriterInterceptorContext.getStarted(ServerWriterInterceptorContext.java:73)
        at org.jboss.resteasy.core.ServerResponseWriter.lambda$writeNomapResponse$3(ServerResponseWriter.java:163)
        at org.jboss.resteasy.core.interception.jaxrs.ContainerResponseContextImpl.filter(ContainerResponseContextImpl.java:404)
        at org.jboss.resteasy.core.ServerResponseWriter.executeFilters(ServerResponseWriter.java:252)
        at org.jboss.resteasy.core.ServerResponseWriter.writeNomapResponse(ServerResponseWriter.java:101)
        at org.jboss.resteasy.core.ServerResponseWriter.writeNomapResponse(ServerResponseWriter.java:74)
        at org.jboss.resteasy.core.SynchronousDispatcher.writeResponse(SynchronousDispatcher.java:594)
        at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:524)
        at org.jboss.resteasy.core.SynchronousDispatcher.lambda$invoke$4(SynchronousDispatcher.java:261)
        at org.jboss.resteasy.core.SynchronousDispatcher.lambda$preprocess$0(SynchronousDispatcher.java:161)
        at org.jboss.resteasy.core.interception.jaxrs.PreMatchContainerRequestContext.filter(PreMatchContainerRequestContext.java:364)
        at org.jboss.resteasy.core.SynchronousDispatcher.preprocess(SynchronousDispatcher.java:164)
        at org.jboss.resteasy.core.SynchronousDispatcher.invoke(SynchronousDispatcher.java:247)
        at io.quarkus.resteasy.runtime.standalone.RequestDispatcher.service(RequestDispatcher.java:73)
        at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler.dispatch(VertxRequestHandler.java:135)
        at io.quarkus.resteasy.runtime.standalone.VertxRequestHandler$1.run(VertxRequestHandler.java:90)
        at io.quarkus.vertx.core.runtime.VertxCoreRecorder$13.runWith(VertxCoreRecorder.java:543)
        at org.jboss.threads.EnhancedQueueExecutor$Task.run(EnhancedQueueExecutor.java:2449)
        at org.jboss.threads.EnhancedQueueExecutor$ThreadBody.run(EnhancedQueueExecutor.java:1478)
        at org.jboss.threads.DelegatingRunnable.run(DelegatingRunnable.java:29)
        at org.jboss.threads.ThreadLocalResettingRunnable.run(ThreadLocalResettingRunnable.java:29)
        at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
        at java.base/java.lang.Thread.run(Thread.java:834)

I left it running for a few minutes after that without sending any new requests, but memory use did not return to normal levels.

geoand reopened this on Oct 20, 2021
geoand removed the triage/wontfix label on Oct 20, 2021
geoand added a commit to geoand/quarkus that referenced this issue Oct 20, 2021
The idea of this change is to make the allocation of buffers happen only when the previous write has completed, as opposed to the previous behavior where all the buffers were allocated upfront and could never be deallocated if one of the writes caused an error.

Fixes: quarkusio#20822
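
Roughly, the pattern described above (allocate the next buffer only once the previous asynchronous write has completed) looks like the following sketch. The class name and the sink abstraction are hypothetical, not the actual code from the PR:

import java.nio.ByteBuffer;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

// Hypothetical sketch: each chunk buffer is allocated only after the previous
// write has completed, so a failed or stalled write stops further allocation.
class ChunkedWriter {

    private static final int CHUNK_SIZE = 8192;

    // Stands in for the asynchronous transport write (e.g. a Netty/Vert.x write).
    private final Function<ByteBuffer, CompletableFuture<Void>> sink;

    ChunkedWriter(Function<ByteBuffer, CompletableFuture<Void>> sink) {
        this.sink = sink;
    }

    CompletableFuture<Void> write(byte[] data) {
        return writeFrom(data, 0);
    }

    private CompletableFuture<Void> writeFrom(byte[] data, int offset) {
        if (offset >= data.length) {
            return CompletableFuture.completedFuture(null);
        }
        int len = Math.min(CHUNK_SIZE, data.length - offset);
        ByteBuffer buf = ByteBuffer.allocateDirect(len); // one chunk at a time, not all upfront
        buf.put(data, offset, len);
        buf.flip();
        // Only chain the next allocation after this write completes successfully.
        return sink.apply(buf).thenCompose(v -> writeFrom(data, offset + len));
    }
}
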
@geoand
Contributor

geoand commented Oct 20, 2021

#20907 takes care of the issue.

quarkus-bot added this to the 2.5 - main milestone on Oct 25, 2021
geoand added a commit that referenced this issue Oct 25, 2021
Ensure that a large response does not cause OOM in RESTEasy Classic