Skip to content
This repository has been archived by the owner on May 30, 2024. It is now read-only.

LDClient doesn't shutdown properly in the streaming mode #93

Closed
dnetrebenko-smartling opened this issue Apr 27, 2017 · 16 comments
Closed

Comments

@dnetrebenko-smartling
Copy link

dnetrebenko-smartling commented Apr 27, 2017

Hi!

We have a web application under Tomcat server which uses Java LDClient (v. 2.1.0, streaming mode) to access feature flags.
We have a problem with shutdown of our application. It doesn't stop within 60 seconds.
Our shutdown routine looks like:

  1. Stop Tomcat
  2. Wait 60 seconds for application termination
  3. Make a threaddump and kill the process if application hasn't stopped.

Almost all threaddumps contain following threads:

"OkHttp ConnectionPool" daemon prio=10 tid=0x00007fd7c0027800 nid=0x13e in Object.wait() [0x00007fd76b4f3000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x0000000795ab28a0> (a com.launchdarkly.shaded.okhttp3.ConnectionPool)
        at java.lang.Object.wait(Object.java:461)
        at com.launchdarkly.shaded.okhttp3.ConnectionPool$1.run(ConnectionPool.java:66)
        - locked <0x0000000795ab28a0> (a com.launchdarkly.shaded.okhttp3.ConnectionPool)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

"okhttp-eventsource-0" prio=10 tid=0x00007fd7c4df6000 nid=0x13b runnable [0x00007fd76b5f3000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:152)
        at java.net.SocketInputStream.read(SocketInputStream.java:122)
        at sun.security.ssl.InputRecord.readFully(InputRecord.java:442)
        at sun.security.ssl.InputRecord.read(InputRecord.java:480)
        at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:934)
        - locked <0x0000000795ae8d10> (a java.lang.Object)
        at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:891)
        at sun.security.ssl.AppInputStream.read(AppInputStream.java:102)
        - locked <0x0000000795ae8d28> (a sun.security.ssl.AppInputStream)
        at com.launchdarkly.shaded.okio.Okio$2.read(Okio.java:138)
        at com.launchdarkly.shaded.okio.AsyncTimeout$2.read(AsyncTimeout.java:236)
        at com.launchdarkly.shaded.okio.RealBufferedSource.request(RealBufferedSource.java:66)
        at com.launchdarkly.shaded.okio.RealBufferedSource.require(RealBufferedSource.java:59)
        at com.launchdarkly.shaded.okio.RealBufferedSource.readHexadecimalUnsignedLong(RealBufferedSource.java:284)
        at com.launchdarkly.shaded.okhttp3.internal.http1.Http1Codec$ChunkedSource.readChunkSize(Http1Codec.java:444)
        at com.launchdarkly.shaded.okhttp3.internal.http1.Http1Codec$ChunkedSource.read(Http1Codec.java:425)
        at com.launchdarkly.shaded.okio.RealBufferedSource.read(RealBufferedSource.java:45)
        at com.launchdarkly.shaded.okio.RealBufferedSource.indexOf(RealBufferedSource.java:325)
        at com.launchdarkly.shaded.okio.RealBufferedSource.indexOf(RealBufferedSource.java:314)
        at com.launchdarkly.shaded.okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:210)
        at com.launchdarkly.shaded.com.launchdarkly.eventsource.EventSource.connect(EventSource.java:148)
        at com.launchdarkly.shaded.com.launchdarkly.eventsource.EventSource.access$600(EventSource.java:29)
        at com.launchdarkly.shaded.com.launchdarkly.eventsource.EventSource$1.run(EventSource.java:74)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)

If we switch the LDClient to events mode the application stops within a few seconds.

EventSource.java:

            for (String line; !Thread.currentThread().isInterrupted() && (line = bs.readUtf8LineStrict()) != null; ) 
              parser.line(line);

It looks like we never leave bs.readUtf8LineStrict().

You can reproduce the issue using very simple console application:

import com.launchdarkly.client.LDClient;
import com.launchdarkly.client.LDConfig;

public class LDShutdownProto
{
    public static void main(String[] args) throws Exception
    {
        new LDShutdownProto().doIt();
    }

    private void doIt() throws Exception
    {
        final LDConfig config = new LDConfig.Builder()
                .offline(false)
                .socketTimeoutMillis(2000)
                .connectTimeoutMillis(2000)
                .build();
        final LDClient client = new LDClient("sdk key here", config);

        Thread.sleep(10000);

        client.close();

        Thread.sleep(60000);

        // Make a threaddump or add breakpoint here.
        // Mentioned threads are still alive.
    }
}

The threads will die after ~175 seconds delay.

@arun251
Copy link
Contributor

arun251 commented May 9, 2017

Hi @dnetrebenko-smartling , we fixed an issue with a background connection recently. Do you mind checking version 2.2.2 to see if that resolves the issue you're seeing?

@dnetrebenko-smartling
Copy link
Author

Hi!
It seems that LD client stopped to work at all for me after upgrade to v. 2.2.2

[INFO] 13:17:09,540 (LDClient.java:<init>:75)Enabling streaming API 
[INFO] 13:17:09,587 (EventSource.java:start:90)Starting EventSource client using URI: https://stream.launchdarkly.com/flags 
[INFO] 13:17:09,588 (LDClient.java:<init>:85)Waiting up to 5000 milliseconds for LaunchDarkly client to start... 
[INFO] 13:17:10,546 (EventSource.java:connect:174)Connected to Event Source stream. 
[ERROR] 13:17:14,590 (LDClient.java:<init>:89)Timeout encountered waiting for LaunchDarkly client initialization 
[WARN] 13:17:14,596 (LDClient.java:evaluate:337)Evaluation called before Client has been initialized for feature flag .......; returning default value 

LD client hasn't been initialized.

I don't understand what's wrong...

@pkaeding
Copy link
Contributor

@dnetrebenko-smartling I just released 2.2.3 which should resolve the initialization issue. Please give it a shot, and let me know if you have any issues.

@dnetrebenko-smartling
Copy link
Author

dnetrebenko-smartling commented May 15, 2017

Hi pkaeding !
Thank you for your response. The initialization issue has gone after upgrade to v 2.2.3.
But I still have issue with proper shutdown of LDClient. In my test application after ldClient.close() and waiting for 60 seconds I still see Okio threads:

image

image

image

image

@pkaeding
Copy link
Contributor

Thanks @dnetrebenko-smartling, I will look into this today.

@pkaeding
Copy link
Contributor

@dnetrebenko-smartling It looks like these threads are managed by OkHttp, and they should be released automatically if they are idle for long enough: http://square.github.io/okhttp/3.x/okhttp/index.html?okhttp3/OkHttpClient.html (grep for "Shutdown isn't necessary").

That said, I can add the calls mentioned in those docs to explicitly release those resources when the LDClient is closed, to make the shutdown more clean (it would be impossible for these resources to be used again after the LDClient is shutdown, so there is no point in waiting for OkHttp to clean them up).

@dnetrebenko-smartling
Copy link
Author

@pkaeding Yes, you're right. Those threads will die after ~175 seconds. It's acceptable for a console application. When the main thread dies the application will shutdown independently of state of those threads. But for Tomcat + Spring web application it's not acceptable. Because those threads prevent correct shutdown (restart) of it.
You can add configuration parameter that means whether to release resources or not when LDClient is closing.

@dnetrebenko-smartling
Copy link
Author

Hi! any news about the issue?

@pkaeding
Copy link
Contributor

pkaeding commented Jun 8, 2017

Hi @dnetrebenko-smartling Yes, I started digging into this, but I was still seeing some resources stick around after adding the calls mentioned in the link above. I will continue to work on this to see if we can make some headway.

@drichelson
Copy link
Contributor

Hi @dnetrebenko-smartling I've improved resource management and narrowed down the lingering threads to this issue: square/okio#256 which unfortunately means we still have the Okio watchdog thread sticking around for 60 seconds after we call close(). The improved changes are in a snapshot release: 2.2.6-SNAPSHOT. If your package manager can't find it please add this repo: https://oss.sonatype.org/content/repositories/snapshots.

Please let us know if this is any better, and if it's still not working well enough let's move this to a support ticket by emailing [email protected]

@drichelson
Copy link
Contributor

@dnetrebenko-smartling Have you had a chance to check out the 2.2.6-SNAPSHOT client?

@dnetrebenko-smartling
Copy link
Author

Hi!
the 2.2.6-SNAPSHOT didn't help. I still see OkHttp thread after 60 second timeout after LD client close.
image

This was referenced Jun 16, 2017
@drichelson
Copy link
Contributor

@dnetrebenko-smartling I reached out to you via email to help you resolve this based on your particular config. If you didn't get it, please contact [email protected]. We'll be able to pick up where we left off here.

eli-darkly added a commit that referenced this issue Aug 25, 2018
avoid some inappropriate uses of Guava's ImmutableMap
@Vishalb20
Copy link

Vishalb20 commented Mar 12, 2019

is this issue fixed, I am using LDClient 4.1.0 and facing the issue: ERROR [okhttp-eventsource-events-[]-0] StreamProcessor - Encountered EventSource error: connect timed out", Can anyone help.

@eli-darkly
Copy link
Contributor

@Vishalb20 That seems unrelated to the issue you've posted on. This issue is about worker threads not being terminated at shutdown. What you're describing is some kind of network connection failure, and it's not possible to diagnose it from that log message without knowing more information. It may or may not have anything to do with the Java SDK in particular.

Could you please raise a support ticket by going to https://support.launchdarkly.com/hc/en-us or emailing [email protected] so that the support team can help in investigating this problem? Thanks.

@eli-darkly
Copy link
Contributor

Also, I'm closing this issue since there has been no activity on it in over a year.

(To clarify the status of the original issue, it had to do specifically with daemon threads preventing the restart of a Tomcat application; we did make some Java SDK changes related to this, but to some degree it was an OkHttp issue that could not be completely worked around. However, we've had no further reports of problems in Tomcat since then to my knowledge.)

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants