Hystrix ThreadPools are not shutdown, system hangs. #102

Closed
ghost opened this issue Feb 9, 2013 · 8 comments
ghost commented Feb 9, 2013

Hi,
We are having an issue with Hystrix where the thread pools are not shut down, so when our process tries to terminate it cannot, because threads are still in a waiting state. According to the logs, two HystrixCommands ran; an exception was thrown in run(), so the fallback was executed. As a workaround, we keep track of the Hystrix thread pools and, when the main process terminates (we hook in via a listener), we shut them all down via com.netflix.hystrix.HystrixThreadPool.Factory.getInstance(). Unfortunately the Factory class is package-private, which forces us to create our class in the com.netflix package. Can you suggest a better way of dealing with this issue, or perhaps provide a way to gracefully shut down all thread pools? We would settle for making com.netflix.hystrix.HystrixThreadPool.Factory public.
Here is the full thread dump:

2013-02-07 10:01:11
Full thread dump Java HotSpot(TM) 64-Bit Server VM (23.6-b04 mixed mode):

"Attach Listener" daemon prio=10 tid=0x0000000002dd1000 nid=0x31c2 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"DestroyJavaVM" prio=10 tid=0x00007f905db76000 nid=0x4ef0 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"hystrix-TaxCalculationGroupKey-1" prio=10 tid=0x0000000003a55800 nid=0x3486 waiting on condition [0x00007f9059952000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000079678a558> (a java.util.concurrent.SynchronousQueue$TransferStack)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:458)
        at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
        at java.util.concurrent.SynchronousQueue.take(SynchronousQueue.java:925)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1043)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1103)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

"hystrix-WarehouseSourcingGroupKey-1" prio=10 tid=0x0000000003245000 nid=0x3485 waiting on condition [0x00007f9059bc2000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000079678d948> (a java.util.concurrent.SynchronousQueue$TransferStack)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:458)
        at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:359)
        at java.util.concurrent.SynchronousQueue.take(SynchronousQueue.java:925)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1043)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1103)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:722)

"Java2D Disposer" daemon prio=10 tid=0x00007f90556a1800 nid=0x5c43 in Object.wait() [0x00007f905adee000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x000000078e2bf1b0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        - locked <0x000000078e2bf1b0> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
        at sun.java2d.Disposer.run(Disposer.java:145)
        at java.lang.Thread.run(Thread.java:722)

"MultiThreadedHttpConnectionManager cleanup" daemon prio=10 tid=0x00007f905d601000 nid=0x535d in Object.wait() [0x00007f90605dd000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000007867b8750> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        - locked <0x00000007867b8750> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
        at org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ReferenceQueueThread.run(MultiThreadedHttpConnectionManager.java:1122)

"ActiveMQ Scheduler" daemon prio=10 tid=0x00007f905d927800 nid=0x5359 in Object.wait() [0x00007f90609e1000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at java.util.TimerThread.mainLoop(Timer.java:526)
        - locked <0x000000078aef0380> (a java.util.TaskQueue)
        at java.util.TimerThread.run(Timer.java:505)

"Service Thread" daemon prio=10 tid=0x0000000001e0e800 nid=0x4eff runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread1" daemon prio=10 tid=0x0000000001e0c800 nid=0x4efe waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C2 CompilerThread0" daemon prio=10 tid=0x0000000001e09800 nid=0x4efd waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Signal Dispatcher" daemon prio=10 tid=0x0000000001e07000 nid=0x4efc runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"Finalizer" daemon prio=10 tid=0x0000000001dbb800 nid=0x4efb in Object.wait() [0x00007f9063932000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:135)
        - locked <0x0000000780116298> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:151)
        at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:177)

"Reference Handler" daemon prio=10 tid=0x0000000001db4000 nid=0x4efa in Object.wait() [0x00007f9063a33000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:503)
        at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:133)
        - locked <0x0000000780115ca8> (a java.lang.ref.Reference$Lock)

"VM Thread" prio=10 tid=0x0000000001dac000 nid=0x4ef9 runnable

"GC task thread#0 (ParallelGC)" prio=10 tid=0x0000000001d28800 nid=0x4ef1 runnable

"GC task thread#1 (ParallelGC)" prio=10 tid=0x0000000001d2a800 nid=0x4ef2 runnable

"GC task thread#2 (ParallelGC)" prio=10 tid=0x0000000001d2c800 nid=0x4ef3 runnable

"GC task thread#3 (ParallelGC)" prio=10 tid=0x0000000001d2e000 nid=0x4ef4 runnable

"GC task thread#4 (ParallelGC)" prio=10 tid=0x0000000001d30000 nid=0x4ef5 runnable

"GC task thread#5 (ParallelGC)" prio=10 tid=0x0000000001d32000 nid=0x4ef6 runnable

"GC task thread#6 (ParallelGC)" prio=10 tid=0x0000000001d33800 nid=0x4ef7 runnable

"GC task thread#7 (ParallelGC)" prio=10 tid=0x0000000001d35800 nid=0x4ef8 runnable

"VM Periodic Task Thread" prio=10 tid=0x00007f905c013800 nid=0x4f00 waiting on condition

JNI global references: 234

Regards
Denis
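
The two hystrix-* threads in the dump above are not marked as daemon and are parked in ThreadPoolExecutor.getTask, which is what an idle core worker looks like. A minimal sketch using only java.util.concurrent reproduces that state and shows an explicit shutdown releasing it. It does not use Hystrix: the class name, the method names, and the pool settings are assumptions made for illustration, not Hystrix's actual configuration.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; not part of Hystrix.
public class IdleWorkerDemo {

    /** Builds a one-thread pool backed by a SynchronousQueue (assumed settings). */
    static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(
                1, 1,                  // core and maximum pool size
                1, TimeUnit.MINUTES,   // keep-alive; not applied to core threads by default
                new SynchronousQueue<Runnable>());
    }

    /** Runs one trivial task and returns how many worker threads remain afterwards. */
    static int workersLeftAfterTask(ThreadPoolExecutor pool) {
        try {
            pool.submit(() -> { }).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        // The core worker stays alive waiting for more work; the default
        // thread factory creates non-daemon threads, so it keeps the JVM up.
        return pool.getPoolSize();
    }

    /** Shuts the pool down and reports whether its workers exited within five seconds. */
    static boolean shutdownAndWait(ThreadPoolExecutor pool) {
        pool.shutdown();
        try {
            return pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool();
        System.out.println("workers left after task: " + workersLeftAfterTask(pool));
        // Without this call, returning from main() would not end the process.
        System.out.println("terminated after shutdown: " + shutdownAndWait(pool));
    }
}
```

Removing the shutdownAndWait call from main() should leave the process running with one parked worker, much like the threads in the dump.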

@benjchristensen
Contributor

At a minimum I'll add a shutdown method if there isn't something cleaner.

Interesting that it doesn't allow shutdown. I would have thought it would also prevent unit tests from shutting down, and I've never had that issue. JUnit must force a System.exit.

@ghost ghost assigned benjchristensen Feb 11, 2013
@ghost
Author

ghost commented Feb 12, 2013

Thanks. A shutdown of com.netflix.hystrix.HystrixThreadPool.Factory#threadPools would be great. I'm looking forward to the next release.

@benjchristensen
Contributor

Please take a look at the pull request and let me know if that would solve this. If so I'll merge and release.

@benjchristensen
Contributor

I am changing my approach to a more generic one that supports resetting anything that holds state. It also leaves room for future work involving resources other than thread pools that need to be cleaned up when an app wants a graceful shutdown rather than just being terminated.

The new command will be: Hystrix.reset()

Here is my test case:

package com.netflix.hystrix;

public class TestHystrixApp {

    public static void main(String[] args) {
        System.out.println(new TestCommand().execute());
        System.out.println("reset");

        Hystrix.reset();

        // the reset allows us to start using Hystrix again
        System.out.println(new TestCommand().execute());
        System.out.println("ending");

        // or it allows a clean shutdown
        Hystrix.reset();
    }

    private static class TestCommand extends HystrixCommand<String> {

        public TestCommand() {
            super(HystrixCommandGroupKey.Factory.asKey("test"));
        }

        @Override
        protected String run() throws Exception {
            return "hello";
        }
    }
}

It results in this with the reset() function being called:

hello
reset
hello
ending

Process finished with exit code 0

Without the reset() call, the process does not exit unless System.exit() is called (or the process is killed).
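
The more generic design described in this comment, a single reset that cleans up anything holding state, could be sketched roughly as a registry of cleanup actions. This is only an illustration of the idea under stated assumptions, not how Hystrix.reset() is implemented: ResettableRegistry, register, and reset are hypothetical names, and only the JDK is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical registry; the names are made up and are not Hystrix API.
public class ResettableRegistry {

    private static final List<Runnable> CLEANUPS = new ArrayList<>();

    /** Records a cleanup action for a stateful resource (thread pool, cache, timer, ...). */
    static synchronized void register(Runnable cleanup) {
        CLEANUPS.add(cleanup);
    }

    /** Runs and forgets every registered cleanup; returns how many were run. */
    static synchronized int reset() {
        List<Runnable> pending = new ArrayList<>(CLEANUPS);
        CLEANUPS.clear();
        for (Runnable cleanup : pending) {
            cleanup.run();
        }
        return pending.size();
    }

    public static void main(String[] args) {
        // A thread pool is just one kind of resource that can register itself.
        ExecutorService pool = Executors.newSingleThreadExecutor();
        register(pool::shutdown);

        pool.execute(() -> System.out.println("hello"));

        // One call releases everything, so the JVM can exit without System.exit().
        System.out.println("cleanups run: " + reset());
    }
}
```

Because callers only register a Runnable, other kinds of resources can be added later without changing the reset entry point, which is the flexibility this comment is after.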

benjchristensen referenced this issue Feb 15, 2013
I'm changing the design from the previous commits so it's more abstract and can handle any type of resources needing cleanup, not just threadpools.

ReactiveX/RxJava#45
@benjchristensen
Contributor

I merged the code which adds Hystrix.reset() as a means to reset the state of all Hystrix resources including thread-pools to allow shutdown, clean testing, etc.

I'll release the code shortly.

@benjchristensen
Contributor

Released in 1.2.8 which is making its way to Maven Central now.

https://github.com/Netflix/Hystrix/blob/master/CHANGES.md#version-128-maven-central

abersnaze pushed a commit to abersnaze/Hystrix that referenced this issue Nov 7, 2013
@robertyates

I also ran into this. Is there a reason you are not using daemon threads for the thread pool? They would not have this issue. It seems odd to have to explicitly shut down the thread pool when exiting.

@phax

phax commented Aug 21, 2014

Good question - is there any specific reason for the threads not being daemon threads???
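
For reference, the daemon-thread alternative raised in the last two comments can be sketched with a custom ThreadFactory from the JDK. This is a minimal sketch, not Hystrix code: DaemonPoolDemo, daemonFactory, and the "demo-pool" prefix are hypothetical names chosen for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not part of Hystrix.
public class DaemonPoolDemo {

    /** Returns a factory whose threads are marked as daemon before they start. */
    static ThreadFactory daemonFactory(String prefix) {
        AtomicInteger counter = new AtomicInteger();
        return runnable -> {
            Thread thread = new Thread(runnable, prefix + "-" + counter.incrementAndGet());
            thread.setDaemon(true); // daemon threads do not keep the JVM alive
            return thread;
        };
    }

    /** Runs a probe task on a daemon-backed pool and reports whether its thread is a daemon. */
    static boolean ranOnDaemonThread() {
        ExecutorService pool = Executors.newFixedThreadPool(1, daemonFactory("demo-pool"));
        Callable<Boolean> probe = () -> Thread.currentThread().isDaemon();
        try {
            return pool.submit(probe).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
        // Deliberately no shutdown(): the idle daemon worker does not block JVM exit.
    }

    public static void main(String[] args) {
        System.out.println("worker is daemon: " + ranOnDaemonThread());
    }
}
```

The trade-off is that the JVM can exit while daemon threads are still running, so in-flight work may be cut off without any cleanup. Whether that is acceptable depends on what the pooled tasks do.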
