heartbeat timeout of a session will cause the links of other sessions to be disconnected #461
Comments
If you have a heartbeat configured, you may want to check its configuration. The properties defined in
heartbeat code
Sorry for the long message, but when I looked at this it took me surprisingly long to find my way through the code. I must say I find the implementation of heartbeats very confusing.

TL;DR: An immediate work-around for the reporter (@d9e7381f) might be to set

Long version:

First, Apache MINA SSHD resets the session idle timeout whenever a message is received or is written. So even sending a heartbeat with So

Second, heartbeats are implemented twice: once in Heartbeats in Plus there's an additional quirk:

This is a mess. There should be only one set of properties governing heartbeats. The two separate mechanisms seem to have existed for a long time, and changing that now would be an API break.

The 5 minutes default value comes from commit 14bbd54, which was about SSHD-1020. Reading that discussion, the issue seems to have been a low-level network I/O read timeout (NIO2_READ_TIMEOUT) of ~10 min, and HEARTBEAT_REPLY_WAIT = 0. So even though the client sent keep-alives every 90 sec, and the SSH idle timeout got reset, the session got killed because nothing was received for 10 min: the server never sent any reply to these heartbeats because none was asked for. The solution was to set NIO2_READ_TIMEOUT = 0 (no timeout for low-level I/O reads) and that huge HEARTBEAT_REPLY_WAIT.

An immediate work-around for the reporter might be to set But we should re-think this mechanism anyway. I propose to change this to follow the path taken by OpenSSH:

If If > 0, increment a counter whenever a heartbeat is to be sent. If the counter is then > HEARTBEAT_NO_REPLY_MAX, throw an exception (killing the session). Otherwise send the heartbeat request (as an asynchronous global request) with

@gnodet: does that sound reasonable to you?
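The counting scheme proposed above can be sketched roughly as follows. This is only an illustration, not the code from the eventual change: the class `HeartbeatReplyCounter` and its methods are hypothetical names, and two behaviours are assumptions on my part, namely that a limit of 0 or less disables the check and that any received reply resets the counter to zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not part of Apache MINA SSHD.
class HeartbeatReplyCounter {
    // Plays the role of HEARTBEAT_NO_REPLY_MAX from the proposal.
    private final int noReplyMax;
    // Number of heartbeats sent since the last reply was seen.
    private final AtomicInteger unanswered = new AtomicInteger();

    HeartbeatReplyCounter(int noReplyMax) {
        this.noReplyMax = noReplyMax;
    }

    // Call whenever a heartbeat is about to be sent.
    // Throws if too many earlier heartbeats are still unanswered.
    void beforeSend() {
        if (noReplyMax <= 0) {
            return; // assumption: a non-positive limit disables the check
        }
        if (unanswered.incrementAndGet() > noReplyMax) {
            throw new IllegalStateException(
                    "More than " + noReplyMax + " heartbeats without a reply");
        }
    }

    // Call when a reply to a heartbeat request arrives.
    // Assumption: one reply clears all outstanding heartbeats.
    void onReply() {
        unanswered.set(0);
    }

    int unansweredCount() {
        return unanswered.get();
    }
}
```

A caller would invoke `beforeSend()` from the heartbeat timer and `onReply()` from the reply handler of the asynchronous global request; the exception would then be what terminates the session. Nothing in this scheme waits on a reply, so a slow peer cannot hold up the thread that sends heartbeats.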
Switch from a timeout model to the OpenSSH model: fail if there are more than a certain number of heartbeats for which no reply was received yet. Bug: apache#461
Any comments on the PR #507? If none, I'll assume it's fine and will merge tomorrow.
Version
2.6.0
Bug description
In my case, 20 connections were established using the same MINA SSH client: 10 to A-server (a MINA SSH server) and the other 10 to B-server (also a MINA SSH server). For a period of time, the network between the client and B-server was unstable. During that time, we found that the sessions from the client to A-server were also disconnected.
I got this log from A-server:
INFO - sshd-SshServer[150d80c4](port=8822)-timer-thread-1 [] c.s.p.t.s.s.CustomServerSessionImpl []: Disconnecting(CustomServerSessionImpl[classic-tx-beijing-01:hybrid01.classic-tx-beijing-01.8330@/10.53.4.116:12033]): SSH2_DISCONNECT_PROTOCOL_ERROR - Detected IdleTimeout after PT40.068S/PT40S ms.
I checked the MINA SSH code and found that all sessions of the same SSH client share one thread for sending heartbeats. As a result, a session whose heartbeat send blocks prevents the other sessions from sending their heartbeats, and those sessions are then disconnected.
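The general effect described here, one blocked task on a shared thread delaying every other task queued behind it, can be reproduced in isolation. The sketch below does not use or model MINA SSHD's actual scheduling code; `SharedTimerThreadDemo` and its method are hypothetical, and the "heartbeats" are plain tasks on a single-threaded executor.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo; it only illustrates head-of-line blocking on one thread.
class SharedTimerThreadDemo {
    // Returns true if the task for "session A" could not run
    // while the task for "session B" was blocked.
    static boolean secondTaskDelayedByFirst() {
        ExecutorService shared = Executors.newSingleThreadExecutor();
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch unblock = new CountDownLatch(1);
        try {
            // Stand-in for a heartbeat to B-server that hangs on a bad network.
            shared.submit(() -> {
                started.countDown();
                unblock.await();
                return null;
            });
            // Stand-in for a heartbeat to A-server, queued on the same thread.
            Future<String> heartbeatA = shared.submit(() -> "heartbeat A sent");
            started.await();

            boolean delayed;
            try {
                heartbeatA.get(200, TimeUnit.MILLISECONDS);
                delayed = false;
            } catch (TimeoutException e) {
                delayed = true; // A's task never got the thread
            }

            unblock.countDown();                  // B's task finishes
            heartbeatA.get(5, TimeUnit.SECONDS);  // now A's task can run
            return delayed;
        } catch (InterruptedException | ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        } finally {
            shared.shutdownNow();
        }
    }
}
```

If a heartbeat for A-server is delayed this way for longer than the server's idle timeout, the server would see the session as idle, which matches the `IdleTimeout` disconnect in the log above.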
Actual behavior
The sessions from the client to A-server were disconnected while only the network between the client and B-server was unstable. A-server logged the IdleTimeout disconnect quoted in the bug description.
Expected behavior
When the network between the client and B-server is unstable, the connections from the client to A-server should not be disconnected.
Relevant log output
Other information
No response