-
-
Notifications
You must be signed in to change notification settings - Fork 931
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SshOperationTimeout issue while using renci library #840
Comments
Adding @drieseng |
@vivek4434 I wasn't involved when the initial version of SSH.NET was implemented. I suppose the concurrent authentication threshold was put in place because some servers reject all too many concurrent login attempts, but I can't be sure of that. Do you establish multiple connections to the same server (with the same credentials)? Can you access (SSH) logs on these routers? |
I was having the exact same problem awhile ago with Cisco gear. The problem would seem to be somewhat random; however I could keep re-trying and it would eventually work. I could SSH into the gear without issue using Putty. I downgraded to 2016.1.0 and all of the problems went away. I thought it might have to do with executing a command via the SSH library, not writing the commands to an output stream. That may, or may not, be true. Perhaps try downgrading and see if the problem goes away? What kind of gear are you using? Can you share the make, model, and version of code they are running? |
@drieseng , yes, I am trying to connect multiple routers and that too with same and different credentials i.e depends on device. @giantjunkbox , I am trying different devices ... like Arista, Juniper, Cisco etc. When you downgraded to 2016.1.0 do you see any eye catching difference in code? Let me try to share more info on version currently in use. |
@giantjunkbox could you please help in checking whether the version you are using contains below code changes:
in class : KeyExchangeDiffieHellman.cs |
@vivek4434 I'm using the 2020.0.1 build from nuget. The method in that version appears to be different then the above. I included it below. Is there newer code checked in that has not been released yet? Thank you
|
@vivek4434 2016.1.0 was a drop in replacement for what I was using it for. Presumably there are some new features or what not that caused the version number to bump. Easy way to tell, just downgrade the package and see if your code still builds ;). |
Hi @giantjunkbox , I tried making few changes in the code, now I am seeing issue : Wondering if you ever faced this issue. |
@vivek4434 I haven't. The only issue I have encountered is the timeout issue. Can you confirm that the older version works for you? If you aren't doing it already, perhaps working backwards from what changed would be good. It is also interesting, at least to me, that it is somewhat random. In my mind that means either a race condition or a deliberately random part of the connection setup is making poor choices. Thanks |
It's done in #1274 |
The 2023.0.1 version has been released to Nuget: https://www.nuget.org/packages/SSH.NET/2023.0.1 |
Hi everyone,
We are using renci library for SSHing into routers. In past, ssh timeout issue was around but the count was low. But now we are seeing the timeout errors more frequently. Below are error logs in failure calls:
FYI : Text in bold are custom logs.
ReceiveMessage from server: 'KeyExchangeInitMessage': 'SSH_MSG_KEXINIT'.
Inside : OnKeyExchangeInitReceived
ReceiveMessage from server: 'KeyExchangeDhGroupExchangeGroup': 'SSH_MSG_KEX_DH_GEX_GROUP'.
ReceiveMessage from server: 'KeyExchangeDhGroupExchangeReply': 'SSH_MSG_KEX_DH_GEX_REPLY'.
ReceiveMessage from server: 'NewKeysMessage': 'SSH_MSG_NEWKEYS'.
ReceiveMessage from server: 'ServiceAcceptMessage': 'SSH_MSG_SERVICE_ACCEPT'.
ReceiveMessage from server: 'FailureMessage': 'SSH_MSG_USERAUTH_FAILURE'.
ReceiveMessage from server: 'InformationRequestMessage': 'SSH_MSG_USERAUTH_INFO_REQUEST'.
ReceiveMessage from server: 'FailureMessage': 'SSH_MSG_USERAUTH_FAILURE'.
ReceiveMessage from server: 'FailureMessage': 'SSH_MSG_USERAUTH_FAILURE'.
Renci.SshNet.Common.SshOperationTimeoutException: Session operation has timed out
at Renci.SshNet.Session.WaitOnHandle(WaitHandle waitHandle)
at Renci.SshNet.Session.Connect()
….
Sometime we do see below logs:
ReceiveMessage from server: 'KeyExchangeInitMessage': 'SSH_MSG_KEXINIT'.
Inside : OnKeyExchangeInitReceived
System.Net.Sockets.SocketException (0x80004005): An established connection was aborted by the software in your host machine
at Renci.SshNet.Session.WaitOnHandle(WaitHandle waitHandle)
at Renci.SshNet.Session.Connect()
….
Sometime we see below logs:
ReceiveMessage from server: 'KeyExchangeInitMessage': 'SSH_MSG_KEXINIT'.
Inside : OnKeyExchangeInitReceived
ReceiveMessage from server: 'KeyExchangeDhGroupExchangeGroup': 'SSH_MSG_KEX_DH_GEX_GROUP'.
ReceiveMessage from server: 'KeyExchangeDhGroupExchangeReply': 'SSH_MSG_KEX_DH_GEX_REPLY'.
Renci.SshNet.Common.SshOperationTimeoutException: Session operation has timed out
at Renci.SshNet.Session.WaitOnHandle(WaitHandle waitHandle)
at Renci.SshNet.Session.Connect()
….
We tried adding more logs on the client side to analyze what causing the sshtimeout, we observed that, failure is coming at different stages before Channel open success stage as :
Connection timeout at each wait handle (KeyExchange, ServiceAccept, UserAuth) were increased from default 30 to 180 with little improvements.
What we experimented in renci code?
While trying to make parallel calls, we observed that calls were waiting for semaphore to acquire. After looking into renci code, it was noticed that the semaphore value was hardcoded to 30 (old renci value), which was stopping 31st call to go-in.
And this causes thread to accumulate and end result is – our application was end-up running with high thread counts and hence it requires timely restart otherwise application becomes unresponsive. Below code reference:
SSH.NET/Session.cs at develop · sshnet/SSH.NET (github.com)
Private static readonly SemaphoreLight AuthenticationConnection = new semaphoreLight(3) – Semaphore is static and is shared across sessions
To overcomes this problem we thought of increasing semaphore value to 3000 and observe the application performance but since then we are seeing increase in SshOperationTimeout issue.
Logically speaking, increasing semaphore value is just making more calls to go-in instead of waiting for semaphore. So, as per our understanding it should not create any problem but we would like to double check on it.
Is there any limit in Renci wrt number of concurrent ssh sessions ?
Can anyone from community help to resolve the problem or have insights around the same?
The text was updated successfully, but these errors were encountered: