Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v4.2 does not reconnect to logstash in some cases #85

Closed
kretz opened this issue Mar 4, 2015 · 23 comments
Closed

v4.2 does not reconnect to logstash in some cases #85

kretz opened this issue Mar 4, 2015 · 23 comments

Comments

@kretz
Copy link

kretz commented Mar 4, 2015

I have a problem with the LogstashTcpSocketAppender - it does not detect that the logstash service has been shut down. (This could be a duplicate of #76, but since I'm not sure, I'm creating a new issue.)

If I start my service using LogstashTcpSocketAppender without a logstash service running and then start the logstash service the logs (correctly) say:

18:14:00,786 |-WARN in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Log destination 10.0.0.48:4560: connection failed. Waiting 29999ms before attempting reconnection. java.net.ConnectException: Connection refused
...
18:14:30,786 |-WARN in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Log destination 10.0.0.48:4560: connection failed. Waiting 29999ms before attempting reconnection. java.net.ConnectException: Connection refused
...
18:19:00,788 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Log destination 10.0.0.48:4560: connection established.

However, when I then shut down the logstash service there is nothing in the logs. And I guess it is because LogstashTcpSocketAppender thinks the logstash service is till there.

One thing to add here is that I close the logstash service very abruptly - it runs in a docker container which is killed - so the socket might look like it is connected from the LogstashTcpSocketAppender side.

Another thing to add is that I run on AWS with my logstash service on EC2, my services deployed via Elastic Beanstalk, communicating directly with logstash over private IPs without ELB.

How can I help to debug this?

@philsttr
Copy link
Collaborator

philsttr commented Mar 4, 2015

Are you actively logging messages? or using keep alive (via keepAliveDuration)?

The LogstashTcpSocketAppender won't detect that the socket has been disconnected until it attempts to send new events, or sends the keep alive message.

I've also noticed it might take sending a couple messages before Java realizes the socket is disconnected. Unfortunately, there's nothing that the TCP appender can do about that.

@kretz
Copy link
Author

kretz commented Mar 5, 2015

I generate logging messages with a rate at about 5 per second. (I also added keepAliveDuration set to a 1 minute limit, but with no apparent effect on my problem or logs.)

If i shut down logstash, and then start it up immediately again, the LogstashTcpSocketAppender still does not detect the disconnected socket:

# On the client connecting to logstash
$ sudo netstat -npt | grep 4560
tcp        1 376785 ::ffff:10.0.1.184:33470     ::ffff:10.0.0.48:4560       CLOSE_WAIT  8476/java 

I waited about 13 minutes, continuously generating log events at a rate of about 5 per second, and then the logs say:

01:24:58,016 |-WARN in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Log destination 10.0.0.48:4560: unable to send event: No route to host java.net.SocketException: No route to host
    at java.net.SocketException: No route to host
    at  at java.net.SocketOutputStream.socketWrite0(Native Method)
    at  at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
    at  at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
    at  at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at  at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    at  at net.logstash.logback.encoder.CompositeJsonEncoder.doEncode(CompositeJsonEncoder.java:83)
    at  at net.logstash.logback.encoder.CompositeJsonEncoder.doEncode(CompositeJsonEncoder.java:33)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.onEvent(AbstractLogstashTcpSocketAppender.java:294)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.onEvent(AbstractLogstashTcpSocketAppender.java:188)
    at  at net.logstash.logback.encoder.com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
    at  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at  at java.lang.Thread.run(Thread.java:744)
01:24:58,017 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Log destination 10.0.0.48:4560: connection established.

So, it seems something got the client to detect the socket was closed. Buffer full generating a write? What could it be? Any ideas?

@philsttr
Copy link
Collaborator

philsttr commented Mar 5, 2015

The buffer should be flushed fairly often. (Way more often than 13 minutes) By default, it is flushed after every event.

The appender will only detect that the socket is closed and attempt reconnection if an exception occurs while writing to the socket. If Java's socket doesn't detect that the socket is closed, there's not much the appender can do.

I'm not sure exactly what is going on in your case. I haven't been able to reproduce this. From what I have seen, the appending thread detects the socket disconnect after an event or two (when Java's socket throws an exception on write), and then reconnects and continues appending.

My suggestion for a next step would be to take a few thread dumps after you have cycled logstash to see what the appending thread is doing. By default, the appending thread is be named logback-async-disruptor-appender-host:port-xxx. If you are using keepAliveDuration, there will be two threads: the first one will be the appending thread, the second one will be the keep alive thread.

Use jstack <pid> or kill -SIGQUIT <pid> to create several thread dumps and compare them.

Alternatively, if you are comfortable with a java debugger, you could just hook up a debugger to see what's up.

@kretz
Copy link
Author

kretz commented Mar 5, 2015

When i do kill -SIGQUIT <pid> i get the same thing every time regardless of connection to logstash server being established, close_wait or nonexistent.

"logback-async-disruptor-appender-10.0.0.48:4560-1" daemon prio=10 tid=0x00007fb8e89e4800 nid=0x212e waiting on condition [0x00007fb8f616e000]
   java.lang.Thread.State: WAITING (parking)
  at sun.misc.Unsafe.park(Native Method)
  - parking to wait for  <0x00000000f2a21038> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
  at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)
  at net.logstash.logback.encoder.com.lmax.disruptor.BlockingWaitStrategy.waitFor(BlockingWaitStrategy.java:45)
  at net.logstash.logback.encoder.com.lmax.disruptor.ProcessingSequenceBarrier.waitFor(ProcessingSequenceBarrier.java:55)
  at net.logstash.logback.encoder.com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:123)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:744)

"logback-async-disruptor-appender-10.0.0.48:4560-2" daemon prio=10 tid=0x00007fb8f007f000 nid=0x21ae waiting on condition [0x00007fb8f5056000]
   java.lang.Thread.State: TIMED_WAITING (parking)
  at sun.misc.Unsafe.park(Native Method)
  - parking to wait for  <0x00000000f29ee4d0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
  at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
  at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1090)
  at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:807)
  at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1068)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:744)

Additionally, I ran netstat continuously on the sending side, from the time the logstash server shut down until the connection resets. I saw the Recv-Q and Send-Q go from (0, 0) to (1, 376785) respectively.

# just after logstash shut down
tcp        0 0 ::ffff:10.0.1.184:42431     ::ffff:10.0.0.48:4560       CLOSE_WAIT  8476/java
# a few minutes later
tcp        1 376785 ::ffff:10.0.1.184:42431     ::ffff:10.0.0.48:4560       CLOSE_WAIT  8476/java

So, is the socket accepting writes to a send queue even though the state is CLOSE_WAIT?

@kretz
Copy link
Author

kretz commented Mar 6, 2015

I also tried to remove logback from the equation, and only connect to the logstash service with nc - and it detects the disconnected socket and closes it.

$ nc -v 10.0.0.48 4560
Connection to 10.0.0.48 4560 port [tcp/*] succeeded!
# released immediately after logstash service is shut down

Are there any places where an error or exception on a write is not handled? Could not find any obvious problem reading the code.

@sihil
Copy link

sihil commented Mar 6, 2015

I can also vouch that we see this behaviour - after following #76, all our clients are on 4.2. Clients still get stuck in CLOSE_WAIT. We are running with a 3 node ELK cluster behind an ELB.

Happy to do some debugging with @kenoir if you need more assistance trying to track this down.

@philsttr
Copy link
Collaborator

The stack showing...

"logback-async-disruptor-appender-10.0.0.48:4560-1" daemon prio=10 tid=0x00007fb8e89e4800 nid=0x212e waiting on condition [0x00007fb8f616e000]
   java.lang.Thread.State: WAITING (parking)
  at sun.misc.Unsafe.park(Native Method)
  - parking to wait for  <0x00000000f2a21038> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
  at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
  at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2043)

...is actually perfectly normal under working conditions. With a low frequency of log messages, the thread will spend most of its time waiting on events to be added to the ringbuffer.

However, it does not rule out any problems, such as...
A) The thread is never woken up when an event is logged (and therefore never even tries to send it)
OR
B) The thread does not receive an error when it is writing the event to the socket output stream.

I've been trying this for a while using netcat as the server locally and haven't been able to reproduce the problem. Every time I shutdown netcat, the tcp appender will detect it after an event or two, and attempt to reconnect. The socket will be in CLOSE_WAIT until writing an event causes the socket to throw an exception, which causes the appender to detect the problem and go into a reconnect loop.

Can you try adding a status listener to your config to see any status messages from the appender?

<statusListener class="ch.qos.logback.core.status.OnConsoleStatusListener" />

The following message (from that listener) indicates that the appender has detected a broken socket...

21:13:06,589 |-WARN in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Log destination localhost:4560: unable to send event: Broken pipe java.net.SocketException: Broken pipe
    at java.net.SocketException: Broken pipe
    at  at java.net.SocketOutputStream.socketWrite0(Native Method)

After the broken socket is detected, if the server is still down, you should see the following message...

21:13:06,591 |-WARN in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Log destination localhost:4560: connection failed. Waiting 4999ms before attempting reconnection. java.net.ConnectException: Connection refused
    at java.net.ConnectException: Connection refused
    at  at java.net.PlainSocketImpl.socketConnect(Native Method)

If you still don't see any messages after adding the OnConsoleStatusListener, would it be possible for you to hook up a debugger on your end to see what's going on?

Specifically, I'd like to know what happens in AbstractLogstashTcpSocketAppender on Line 298 and Line 307. If neither of those lines are throwing an exception, then the appender will not detect the socket is broken.

@philsttr
Copy link
Collaborator

I've created a debug branch that has extra status messages for:

  • when an event is enqueued
  • when an event is dequeued
  • when an event is successfully sent
  • when a keepAlive is sent.

If you want to build and use that for debugging, it will show what the code is doing better.

Note that you must have the OnConsoleStatusListener configured to see the extra debug messages on stdout.

@kretz
Copy link
Author

kretz commented Mar 11, 2015

I'll give that a try. I do, however, think that we will se all events being handled and successfully written to the socket since Send-Q fills up. I think the problem is that the writer should have detected that the connection is down and stop writing.

So, are there any places to add debugging to know if any exceptions or return values from writing to the socket is handled properly?

And yep, my OnConsoleStatusListener is configured.

Thanks for you help.

@philsttr
Copy link
Collaborator

I've added debugging in the places that I'm aware of.

Writing to the socket is done inside encoder.doEncode here

If that throws an exception, then you should see the "unable to send event" status line from here

If it doesn't throw an exception, then you should see the "Sent event" status line from here

The only way that I know of to detect if the "other end" has closed the socket is to write to it, which is exactly what is happening. If it's not throwing an exception when writing, then I'm stumped. I've been trying to reproduce this, and have not been successful. If I could reproduce it, I could debug it.

@kretz
Copy link
Author

kretz commented Mar 18, 2015

Btw, I found when compiling the library, that it will not run on Java 6, since the System.lineSeparator() did not exist on that version, http://docs.oracle.com/javase/6/docs/api/java/lang/System.html.

@kretz
Copy link
Author

kretz commented Mar 19, 2015

I just ran the debug branch of the code, with printouts, while shutting down the logstash service and bringing it up again. Also note that keepAlive is not enabled here.

These printouts are around the time that the Logstash service is shut down. Events until CCC seems to be sent, but from DDD the events are never dequeued.

16:33:05,350 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Enqueued event AAA
16:33:05,350 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Dequeued event AAA
16:33:05,350 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Sent event AAA

16:33:05,350 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Enqueued event BBB
16:33:05,350 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Dequeued event BBB
16:33:05,350 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Sent event BBB

16:33:05,360 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Enqueued event CCC
16:33:05,360 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Dequeued event CCC
16:33:05,360 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Sent event CCC

16:33:08,136 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Enqueued event DDD
16:33:08,136 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Enqueued event EEE
16:33:08,151 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Enqueued event FFF
16:33:08,151 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Enqueued event GGG

These printouts are about 12 minutes later, when the appender from some reason realises it cannot send anything.

16:44:50,821 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Enqueued event QQQ
16:44:50,821 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Enqueued event RRR
16:44:50,830 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Enqueued event SSS
16:44:50,831 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Enqueued event TTT

16:44:51,739 |-WARN in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Log destination 10.0.4.135:4560: unable to send event: No route to host java.net.SocketException: No route to host
    at java.net.SocketException: No route to host
    at  at java.net.SocketOutputStream.socketWrite0(Native Method)
    at  at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
    at  at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
    at  at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at  at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    at  at net.logstash.logback.encoder.CompositeJsonEncoder.doEncode(CompositeJsonEncoder.java:83)
    at  at net.logstash.logback.encoder.CompositeJsonEncoder.doEncode(CompositeJsonEncoder.java:33)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.onEvent(AbstractLogstashTcpSocketAppender.java:300)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.onEvent(AbstractLogstashTcpSocketAppender.java:195)
    at  at net.logstash.logback.encoder.com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
    at  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
    at  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
    at  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at  at java.lang.Thread.run(Thread.java:679)
16:44:51,751 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Log destination 10.0.4.135:4560: connection established.

16:44:51,752 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Sent event XXX

16:44:51,752 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Dequeued event YYY
16:44:51,752 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Sent event YYY

16:44:51,752 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Dequeued event ZZZ
16:44:51,752 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Sent event ZZZ

Was the writing thread stuck in sending the event?

Does this help in any way? Is there something else I should run through?

@kretz
Copy link
Author

kretz commented Mar 19, 2015

Did another run using only keepAlive events, i.e. no other events then keepAlive events were sent from the client to the Logstash service.

# The following events were received by the Logstash service
18:13:04,621 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Dequeued event null
18:13:04,621 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Sent keepAlive
18:14:04,622 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Dequeued event null
18:14:04,622 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Sent keepAlive

# The following events were NOT received by Logstash service
18:15:04,623 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Dequeued event null
18:15:04,623 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Sent keepAlive
18:16:04,624 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Dequeued event null
18:16:04,624 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Sent keepAlive
...
18:28:04,636 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Dequeued event null
18:28:04,636 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Sent keepAlive
18:29:04,637 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Dequeued event null
18:29:04,637 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Sent keepAlive
18:29:04,637 |-WARN in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Log destination 10.0.4.135:4560: unable to send event: No route to host java.net.SocketException: No route to host
    at java.net.SocketException: No route to host
    at  at java.net.SocketOutputStream.socketWrite0(Native Method)
    at  at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
    at  at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
    at  at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at  at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.onEvent(AbstractLogstashTcpSocketAppender.java:311)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.onEvent(AbstractLogstashTcpSocketAppender.java:195)
    at  at net.logstash.logback.encoder.com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
    at  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at  at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at  at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:165)
    at  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:266)
    at  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at  at java.lang.Thread.run(Thread.java:679)
18:29:04,649 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Log destination 10.0.4.135:4560: connection established.

# The following events were received by the Logstash service
18:29:04,649 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Sent keepAlive
18:30:04,650 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Dequeued event null
18:30:04,650 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[LOGSTASH] - Sent keepAlive

@philsttr
Copy link
Collaborator

philsttr commented Apr 6, 2015

Hi @kretz,

Sorry for delay, I've been away for a while.

I've removed the call to System.lineSeparator, so java 6 should work now.

I've still been trying to reproduce this, without success. I know this isn't much help, but every time I try, the appender is able to reconnect and sends the enqueued messages. For example:

## Startup with logstash listening

17:13:39,224 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Log destination localhost:4561: connection established.
17:13:39,225 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 0
17:13:39,529 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 1
17:13:39,577 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 0
17:13:39,578 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 1
17:13:39,578 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 1
17:13:40,030 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 2
17:13:40,030 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 2
17:13:40,031 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 2
17:13:40,530 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 3
17:13:40,531 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 3
17:13:40,536 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 3
17:13:41,031 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 4
17:13:41,031 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 4
17:13:41,032 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 4
17:13:41,532 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 5
17:13:41,532 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 5
17:13:41,533 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 5
17:13:42,033 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 6
17:13:42,034 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 6
17:13:42,035 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 6
17:13:42,534 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 7
17:13:42,535 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 7
17:13:42,536 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 7
17:13:43,035 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 8
17:13:43,036 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 8


## Killed logstash


17:13:43,038 |-WARN in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Log destination localhost:4561: unable to send event: Broken pipe java.net.SocketException: Broken pipe
    at java.net.SocketException: Broken pipe
    at  at java.net.SocketOutputStream.socketWrite0(Native Method)
    at  at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
    at  at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
    at  at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:377)
    at  at sun.security.ssl.OutputRecord.write(OutputRecord.java:363)
    at  at sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:837)
    at  at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:808)
    at  at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:122)
    at  at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at  at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
    at  at net.logstash.logback.encoder.CompositeJsonEncoder.doEncode(CompositeJsonEncoder.java:83)
    at  at net.logstash.logback.encoder.CompositeJsonEncoder.doEncode(CompositeJsonEncoder.java:33)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.onEvent(AbstractLogstashTcpSocketAppender.java:299)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.onEvent(AbstractLogstashTcpSocketAppender.java:194)
    at  at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
    at  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at  at java.lang.Thread.run(Thread.java:745)
17:13:43,072 |-WARN in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Log destination localhost:4561: connection failed. Waiting 4969ms before attempting reconnection. java.net.ConnectException: Connection refused
    at java.net.ConnectException: Connection refused
    at  at java.net.PlainSocketImpl.socketConnect(Native Method)
    at  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at  at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at  at java.net.Socket.connect(Socket.java:579)
    at  at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:625)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.openSocket(AbstractLogstashTcpSocketAppender.java:372)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.reopenSocket(AbstractLogstashTcpSocketAppender.java:348)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.onEvent(AbstractLogstashTcpSocketAppender.java:323)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.onEvent(AbstractLogstashTcpSocketAppender.java:194)
    at  at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
    at  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at  at java.lang.Thread.run(Thread.java:745)
17:13:43,536 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 9
17:13:44,037 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 10
17:13:44,539 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 11
17:13:45,039 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 12
17:13:45,540 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 13
17:13:46,041 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 14
17:13:46,542 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 15
17:13:47,043 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 16
17:13:47,544 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 17
17:13:48,045 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 18
17:13:48,077 |-WARN in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Log destination localhost:4561: connection failed. Waiting 4970ms before attempting reconnection. java.net.ConnectException: Connection refused
    at java.net.ConnectException: Connection refused
    at  at java.net.PlainSocketImpl.socketConnect(Native Method)
    at  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at  at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at  at java.net.Socket.connect(Socket.java:579)
    at  at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:625)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.openSocket(AbstractLogstashTcpSocketAppender.java:372)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.reopenSocket(AbstractLogstashTcpSocketAppender.java:348)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.onEvent(AbstractLogstashTcpSocketAppender.java:323)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.onEvent(AbstractLogstashTcpSocketAppender.java:194)
    at  at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
    at  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at  at java.lang.Thread.run(Thread.java:745)
17:13:48,546 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 19
17:13:49,047 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 20
17:13:49,548 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 21
17:13:50,049 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 22
17:13:50,550 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 23
17:13:51,051 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 24
17:13:51,552 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 25
17:13:52,053 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 26
17:13:52,554 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 27
17:13:53,058 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 28
17:13:53,067 |-WARN in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Log destination localhost:4561: connection failed. Waiting 4981ms before attempting reconnection. java.net.ConnectException: Connection refused
    at java.net.ConnectException: Connection refused
    at  at java.net.PlainSocketImpl.socketConnect(Native Method)
    at  at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
    at  at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
    at  at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
    at  at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at  at java.net.Socket.connect(Socket.java:579)
    at  at sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:625)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.openSocket(AbstractLogstashTcpSocketAppender.java:372)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.reopenSocket(AbstractLogstashTcpSocketAppender.java:348)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.onEvent(AbstractLogstashTcpSocketAppender.java:323)
    at  at net.logstash.logback.appender.AbstractLogstashTcpSocketAppender$TcpSendingEventHandler.onEvent(AbstractLogstashTcpSocketAppender.java:194)
    at  at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:128)
    at  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
    at  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
    at  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at  at java.lang.Thread.run(Thread.java:745)
17:13:53,559 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 29
17:13:54,060 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 30
17:13:54,561 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 31
17:13:55,062 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 32
17:13:55,563 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 33
17:13:56,064 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 34
17:13:56,565 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 35
17:13:57,066 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 36
17:13:57,567 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 37
17:13:58,077 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 38


## Restarted logstash


17:13:58,094 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Log destination localhost:4561: connection established.
17:13:58,363 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 8
17:13:58,364 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 9
17:13:58,364 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 9
17:13:58,364 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 10
17:13:58,365 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 10
17:13:58,365 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 11
17:13:58,366 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 11
17:13:58,366 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 12
17:13:58,367 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 12
17:13:58,367 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 13
17:13:58,367 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 13
17:13:58,367 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 14
17:13:58,368 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 14
17:13:58,368 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 15
17:13:58,369 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 15
17:13:58,369 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 16
17:13:58,370 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 16
17:13:58,370 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 17
17:13:58,370 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 17
17:13:58,370 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 18
17:13:58,371 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 18
17:13:58,371 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 19
17:13:58,372 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 19
17:13:58,372 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 20
17:13:58,373 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 20
17:13:58,373 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 21
17:13:58,373 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 21
17:13:58,373 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 22
17:13:58,374 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 22
17:13:58,374 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 23
17:13:58,375 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 23
17:13:58,375 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 24
17:13:58,376 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 24
17:13:58,376 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 25
17:13:58,376 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 25
17:13:58,376 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 26
17:13:58,377 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 26
17:13:58,377 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 27
17:13:58,385 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 27
17:13:58,385 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 28
17:13:58,385 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 28
17:13:58,386 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 29
17:13:58,386 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 29
17:13:58,386 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 30
17:13:58,387 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 30
17:13:58,387 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 31
17:13:58,388 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 31
17:13:58,388 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 32
17:13:58,389 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 32
17:13:58,389 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 33
17:13:58,389 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 33
17:13:58,389 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 34
17:13:58,390 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 34
17:13:58,390 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 35
17:13:58,391 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 35
17:13:58,391 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 36
17:13:58,392 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 36
17:13:58,392 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 37
17:13:58,392 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 37
17:13:58,392 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 38
17:13:58,393 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 38
17:13:58,577 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 39
17:13:58,577 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 39
17:13:58,578 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 39
17:13:59,078 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 40
17:13:59,079 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 40
17:13:59,080 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 40
17:13:59,579 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 41
17:13:59,580 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 41
17:13:59,581 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 41
17:14:00,080 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 42
17:14:00,080 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 42
17:14:00,082 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 42
17:14:00,581 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 43
17:14:00,581 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 43
17:14:00,583 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 43
17:14:01,082 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Enqueued event [WARN] 44
17:14:01,082 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Dequeued event [WARN] 44
17:14:01,084 |-INFO in net.logstash.logback.appender.LogstashTcpSocketAppender[tcp] - Sent event [WARN] 44

The main difference I see is that I receive java.net.SocketException: Broken pipe and you receive java.net.SocketException: No route to host. I'm not sure if that makes a difference or not.

I'm a little puzzled by your log when you are not using keepAlive. A few questions:

  • For the events that are enqueued while logstash is down (e.g. Enqueued event DDD), do you ever see corresponding dequeued messages?
  • There are some dequeue messages (e.g. Dequeued event YYY) where I do not see any corresponding enqueued messages. This is puzzling. If the messages were logged, but you did not post them, can you reproduce this test case and post the entire log?

Regarding this question...

Was the writing thread stuck in sending the event?

The writing thread will attempt reconnection by polling. The writing thread is the thread that logs the SocketException and the connection established messages. The writing thread doesn't do anything else while the connection is not established. It just continually tries to reconnect.

@kretz
Copy link
Author

kretz commented Apr 6, 2015

In this specific test I can't remember if I ever saw the DDD message being dequeued. But on other tests I've done, the buffering seems to work and all queued messages are sent on reconnection.

And the reason you do not see any YYY is because I replaced the payload of entries (it is sensitive log data). All log messages had the same payload, so I could not map what was enqueued with what was dequeued. So while message QQQ may have been the same as YYY, I was not sure. Sorry for the confusion.

kenoir referenced this issue in guardian/path-manager Apr 8, 2015
Hopefully this will stop it getting killed
@philsttr
Copy link
Collaborator

Hi @kretz, are you still having this issue? Do you have any additional details that might help me reproduce this?

@philsttr philsttr changed the title v4.2 does not reconnect to logstash v4.2 does not reconnect to logstash in some cases May 18, 2015
@kretz
Copy link
Author

kretz commented May 18, 2015

Hi,

I tried zeroing in on what the problem might be, and get additional data,
but no luck.

And recently I have not had the time to look at it, and accepted the
timeout behaviour since I had to move on to new projects.

On Mon, May 18, 2015 at 3:40 AM philsttr [email protected] wrote:

Hi @kretz https://github.com/kretz, are you still having this issue? Do
you have any additional details that might help me reproduce this?


Reply to this email directly or view it on GitHub
#85 (comment)
.

@phillro
Copy link

phillro commented May 20, 2015

I'm seeing this with 4.3 as well.

(1) If logstash restart it looses the connection immediately and does not reconnect.
(2) After a few minutes it drops the connection and never reconnects.

@philsttr
Copy link
Collaborator

Hi @phillro

Since I haven't been able to reproduce this myself, I'm going to need to gather some information from you...

  1. Add a status listener to your logback config to see all status messages from the appender
<statusListener class="ch.qos.logback.core.status.OnConsoleStatusListener" />
  1. Ensure that you are using keepAliveDuration or are actively logging events. The appender will not attempt to reconnect unless keepAliveDuration is set, or events are actively being logged.

  2. Reproduce the behaviour, and send me any status messages that you encounter.

  3. If events are actively being logged from your application, but not being received on logstash's side, send me a thread dump of your application.

In addition, let me know anything special that is on your network between your application and logstash (load balancer, etc).

Thanks

@AveryRegier
Copy link

I am experiencing similar behavior with 4.4 at AWS with logstash behind the ELB with a TCP forwarder on 4560. When I redeploy logstash, the logs stop getting to logstash until I restart the application and it reconnects. I've waited half an hour and nothing.

What about a workaround? Every few minutes, create a new connection, then replace the existing one with the new one, and terminate the old one. Then even if a connection does get stuck, it won't require human intervention to recover.

@philsttr
Copy link
Collaborator

Hi Avery, I'll consider implementing the workaround.

However, I'd really like to know the root cause. Can you help me to debug since I haven't been able to reproduce this? I've added the 4 steps that will help me in the comment above.

In addition, if you have java debugging experience, I'd really appreciate it if you could step through the application and let me know what you see. Particularly in AbstractLogstashTcpSocketAppender.TcpSendingEventHandler.onEvent.

Thanks

philsttr added a commit that referenced this issue Aug 24, 2015
…ed the end of the stream.

This fixes a condition where the server-side terminates the connection,
but the client-side wasn't detecting it, leaving the client socket in CLOSE_WAIT.
@philsttr
Copy link
Collaborator

To detect that the server-side has closed the socket, I have added a reader thread that just reads from the socket input stream (the server-side's output stream).

When the input stream reaches the end of the stream, the code now makes the assumption that the socket has been closed on the server side (rather than just being in a half-open state), and reconnects to the server. This will work as long as logstash's tcp input doesn't shutdown its output stream while keeping the input stream open.

I have tested this against Amazon's ELB, and it seems to solve the problem. i.e. the appender will reconnect when an instance behind the ELB closes. I tested this with "Connection Draining" disabled on the ELB. Not sure if that makes a difference.

The fix will be in version 4.5, which I'll release in the next week or so.

@philsttr
Copy link
Collaborator

Released in 4.5

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants