Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ProviderBase class shows contention on synchronized block using LRUMap _writers instance #166

Closed
jllanes3 opened this issue Jan 17, 2023 · 11 comments

Comments

@jllanes3
Copy link

Libraries I am using:

io.dropwizard:dropwizard-core:2.14.0
jackson-jaxrs-base-2.13.1.jar

I am seeing thread contention on the following line of ProviderBase.java.
image

Thread that locked the _writers instance:

"pool-abc-3846" #3846 prio=5 os_prio=0 tid=0x00007f75f01ee800 nid=0xfd4 runnable [0x00007f74de6e9000]
   java.lang.Thread.State: RUNNABLE
	at java.lang.reflect.WeakCache.containsValue(WeakCache.java:175)
	at java.lang.reflect.Proxy.isProxyClass(Proxy.java:791)
	at sun.reflect.annotation.AnnotationInvocationHandler.asOneOfUs(AnnotationInvocationHandler.java:225)
	at sun.reflect.annotation.AnnotationInvocationHandler.equalsImpl(AnnotationInvocationHandler.java:201)
	at sun.reflect.annotation.AnnotationInvocationHandler.invoke(AnnotationInvocationHandler.java:64)
	at com.sun.proxy.$Proxy56.equals(Unknown Source)
	at com.fasterxml.jackson.jaxrs.cfg.AnnotationBundleKey._equals(AnnotationBundleKey.java:135)
	at com.fasterxml.jackson.jaxrs.cfg.AnnotationBundleKey.equals(AnnotationBundleKey.java:116)
	at java.util.HashMap.getNode(HashMap.java:573)
	at java.util.LinkedHashMap.get(LinkedHashMap.java:440)
	at com.fasterxml.jackson.jaxrs.base.ProviderBase._endpointForWriting(ProviderBase.java:681)
	- locked <0x00000003c2a0f580> (a com.fasterxml.jackson.jaxrs.util.LRUMap)
	at com.fasterxml.jackson.jaxrs.base.ProviderBase.writeTo(ProviderBase.java:557)
	at org.glassfish.jersey.message.internal.WriterInterceptorExecutor$TerminalWriterInterceptor.invokeWriteTo(WriterInterceptorExecutor.java:242)
	at org.glassfish.jersey.message.internal.WriterInterceptorExecutor$TerminalWriterInterceptor.aroundWriteTo(WriterInterceptorExecutor.java:227)
	at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:139)
	at io.dropwizard.jersey.errors.EofExceptionWriterInterceptor.aroundWriteTo(EofExceptionWriterInterceptor.java:39)
	at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:139)
	at org.glassfish.jersey.server.internal.JsonWithPaddingInterceptor.aroundWriteTo(JsonWithPaddingInterceptor.java:85)
	at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:139)
	at org.glassfish.jersey.server.internal.MappableExceptionWrapperInterceptor.aroundWriteTo(MappableExceptionWrapperInterceptor.java:61)
	at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:139)
	at org.glassfish.jersey.message.internal.MessageBodyFactory.writeTo(MessageBodyFactory.java:1116)
	at org.glassfish.jersey.server.ServerRuntime$Responder.writeResponse(ServerRuntime.java:642)
	at org.glassfish.jersey.server.ServerRuntime$Responder.processResponse(ServerRuntime.java:373)
	at org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:363)
	at org.glassfish.jersey.server.ServerRuntime$AsyncResponder$3.run(ServerRuntime.java:864)
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
	at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
	at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
	at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
	at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
	at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)
	at org.glassfish.jersey.server.ServerRuntime$AsyncResponder.resume(ServerRuntime.java:896)
	at org.glassfish.jersey.server.ServerRuntime$AsyncResponder.resume(ServerRuntime.java:852)
        ...

Thread (1 of many!) that is waiting for such lock:

	"pool-abc-3478" #3478 prio=5 os_prio=0 tid=0x00007f768c903000 nid=0xe3b waiting for monitor entry [0x00007f74f4343000]
	java.lang.Thread.State: BLOCKED (on object monitor)
	 at com.fasterxml.jackson.jaxrs.base.ProviderBase._endpointForWriting(ProviderBase.java:680)
	 - waiting to lock <0x00000003c2a0f580> (a com.fasterxml.jackson.jaxrs.util.LRUMap)
	 at com.fasterxml.jackson.jaxrs.base.ProviderBase.writeTo(ProviderBase.java:557)
	 at org.glassfish.jersey.message.internal.WriterInterceptorExecutor$TerminalWriterInterceptor.invokeWriteTo(WriterInterceptorExecutor.java:242)
	 at org.glassfish.jersey.message.internal.WriterInterceptorExecutor$TerminalWriterInterceptor.aroundWriteTo(WriterInterceptorExecutor.java:227)
	 at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:139)
	 at io.dropwizard.jersey.errors.EofExceptionWriterInterceptor.aroundWriteTo(EofExceptionWriterInterceptor.java:39)
	 at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:139)
	 at org.glassfish.jersey.server.internal.JsonWithPaddingInterceptor.aroundWriteTo(JsonWithPaddingInterceptor.java:85)
	 at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:139)
	 at org.glassfish.jersey.server.internal.MappableExceptionWrapperInterceptor.aroundWriteTo(MappableExceptionWrapperInterceptor.java:61)
	 at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:139)
	 at org.glassfish.jersey.message.internal.MessageBodyFactory.writeTo(MessageBodyFactory.java:1116)
	 at org.glassfish.jersey.server.ServerRuntime$Responder.writeResponse(ServerRuntime.java:642)
	 at org.glassfish.jersey.server.ServerRuntime$Responder.processResponse(ServerRuntime.java:373)
	 at org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:363)
	 at org.glassfish.jersey.server.ServerRuntime$AsyncResponder$3.run(ServerRuntime.java:864)
	 at org.glassfish.jersey.internal.Errors$1.call(Errors.java:248)
	 at org.glassfish.jersey.internal.Errors$1.call(Errors.java:244)
	 at org.glassfish.jersey.internal.Errors.process(Errors.java:292)
	 at org.glassfish.jersey.internal.Errors.process(Errors.java:274)
	 at org.glassfish.jersey.internal.Errors.process(Errors.java:244)
	 at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:265)
	 at org.glassfish.jersey.server.ServerRuntime$AsyncResponder.resume(ServerRuntime.java:896)
	 at org.glassfish.jersey.server.ServerRuntime$AsyncResponder.resume(ServerRuntime.java:852)
         ....

JMC for a 1 min JFR:
image

Full code:

    protected EP_CONFIG _endpointForWriting(Object value, Class<?> type, Type genericType,
            Annotation[] annotations, MediaType mediaType, MultivaluedMap<String,Object> httpHeaders)
    {
        // 29-Jun-2016, tatu: As per [jaxrs-providers#86] allow skipping caching
        if (!isEnabled(JaxRSFeature.CACHE_ENDPOINT_WRITERS)) {
            return _configForWriting(locateMapper(type, mediaType), annotations, _defaultWriteView);
        }

        EP_CONFIG endpoint;
        AnnotationBundleKey key = new AnnotationBundleKey(annotations, type);
        synchronized (_writers) {
            endpoint = _writers.get(key);
        }
        // not yet resolved (or not cached any more)? Resolve!
        if (endpoint == null) {
            MAPPER mapper = locateMapper(type, mediaType);
            endpoint = _configForWriting(mapper, annotations, _defaultWriteView);
            // and cache for future reuse
            synchronized (_writers) {
                _writers.put(key.immutableKey(), endpoint);
            }
        }
        return endpoint;
    }

Are you aware if this is expected say when moving to Dropwizard async request handling (i.e. using AsyncResponder.resume(...))?

If this is by design, are you aware of any workarounds when using Dropwizard + Jersey + Jackson (in async mode) that could help with this contention?

@pjfanning
Copy link
Member

@cowtowncoder since jackson v2.14 and the changes in LRUMap to use a ConcurrentLinkedHashMap implementation, the synchronization around _writers and _readers in ProviderBase may not be needed.

@cowtowncoder
Copy link
Member

@pjfanning You are absolutely right -- since implementation is synchronized, sync here should not be needed (and ditto for jackson-jakarta-rs-providers).

@cowtowncoder
Copy link
Member

Ok, fixed via #167, to be included in 2.14.2 / 2.15.0.

@cowtowncoder cowtowncoder changed the title ProviderBase class shows contention on synchronized block using LRUMap _writers instance ProviderBase class shows contention on synchronized block using LRUMap _writers instance Jan 18, 2023
@jllanes3
Copy link
Author

Thanks for fixing this super fast. Do you know when we could expect to see 2.14.2 out? I really need the fix, and I want to plan around the release date if possible.

@cowtowncoder
Copy link
Member

cowtowncoder commented Jan 20, 2023

I am hoping to get 2.14.2 released in couple of weeks; might be very soon but cannot promise due to fluctuating time I have: each full patch release takes 2.5-3 hours of my time.

@e-hubert-opti
Copy link

@cowtowncoder I think the assumption the used LRUMap to be thread-safe because being backed by a concurrent implementation was not correct here and causes unbound growth of this map. Could it be that you confused com.fasterxml.jackson.jaxrs.util (used by ProviderBase) with com.fasterxml.jackson.databind.util.LRUMap? Am going to file a bug against Jersey 2.41, but it looks like they only ported this change into their code base. So likely someone should also file an issue with Jackson.

@cowtowncoder
Copy link
Member

@e-hubert-opti Whoa! You are right: I had forgotten about copy-pasted version of LRUMap. Yes, I think we need to change JAX-RS/Jakarta-RS providers to use jackson-databind version, which was rewritten in 2.14.

@e-hubert-opti
Copy link

Thank you very much for this ultra fast response @cowtowncoder. I adjusted my unit test to validate the behavior of the LRUMap implementation and changed the import to com.fasterxml.jackson.databind.util.LRUMap (version 2.16.0) and it also fails:

@Test
void test_concurrent_usage_of_lru_map() {
    LookupCache<Integer, Integer> lruMap = new LRUMap<>(16, 120);
    ExecutorService executorService = Executors.newFixedThreadPool(10);

    for (int i = 0; i < 100000; i++) {
        int tmp = i;
        executorService.submit(() -> lruMap.put(tmp, tmp));
    }

    assertThat(lruMap.size()).isLessThanOrEqualTo(120);
}

java.lang.AssertionError:
Expecting actual:
5242
to be less than or equal to:
120

Synchronizing the map with Collections.synchronizedMap() turns it green. I will try to have a closer look at the LRUMap implementation of jackson-databind, but wanted to quickly provide feedback.

@e-hubert-opti
Copy link

e-hubert-opti commented Dec 16, 2023

Ah, all good. I had a look in the implementation of PrivateMaxEntriesMap and noticed the async draining, so it is just the test which needs to be adjusted for this.

@cowtowncoder
Copy link
Member

Yeah older versions of LRUMap suffered from various problems; original (one copied here) was not thread-safe wrt mutations; next variation was safe, had flush but no actual LRU logic. Latest version is bit of an overkill, but should work correctly and efficiently, even if not very memory efficiently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants