-
Notifications
You must be signed in to change notification settings - Fork 17.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
x/crypto/ssh: ssh.Channel should provide idle timeouts #21420
Comments
As an example of deadlock prone code, the channel receive https://github.com/golang/crypto/blob/master/ssh/mux.go#L322
update: Here unfortunately there is no recourse to Close()-ing the OpenChannel call, since there is no channel yet. Even the top level ssh.Client is no help, for even if one Close()-es the ssh.Client on which the call was made, the deadlock at line 322 remains. |
Upon prototyping the net.Conn approach, I discovered that the deadline is too low level to be useful with io.Copy(), in paritcular for sending big files that need some time to complete. Instead I've implemented a high level idle-timeout facility in my branch |
What you point out as a deadlock isn't really a deadlock if the remote side dies. That channel will be closed if the remote site dies, see https://github.com/golang/crypto/blob/81e90905daefcd6fd217b62423c0908922eadb30/ssh/mux.go#L194 . However, I still think there should be idle timeouts, we shouldn't wait forever for the remote side to answer. |
Just realized this issue doesn't really address my problem, I'll open a new issue. edit: never mind, realized you already opened a very similar one so I added a comment there |
"It would be incredibly useful for securing connections everywhere if ssh.Channel implemented the full net.Conn interace. The only missing methods are for time-outs based on setting read and write deadlines." what does 'securing' connections mean? OpenChannel() is a global request, so if we elect to ignore the answer (if we think it takes too long), it will upset the state machine, meaning that you have close the whole ssh.Conn. |
I meant that all the Go programs everywhere that use net.Conn to do TCP... could transparently change to use ssh. And those would be much more secure connections.
I don't see why the upset would be intrinsic. Is it just an artifact of the implementation status quo? |
let me rephrase. Suppose the timeout triggers, how does your program proceed? It cannot try to open another channel, since the global request lock has been taken. (the global request lock is implied by the RFC) |
btw I love this library, and I'm so glad that you work on it. I don't mean to be critical at all. Ok, back to the conversation:
Interesting. I had been assuming I as an application programmer would only care about timeouts on channels that had been successfully established. I assumed the channel open would succeed or fail fairly quickly one way or the other as it stands. Is that not a correct assumption? i.e. can OpenChannel() hang? |
my question is specifically about the hang in l.322 . Your assumption in the last comment is correct, but I was under the impression you wanted to have a way to set a timeout on that (ie. on the OpenChannel method). My response is that this mostly useless, if OpenChannel is stuck, the protocol is organized such that you have to throw away the whole connection rather than just the (pending) channel. We could try to add timeouts to SSH to mimick TCP, but the timeouts would be on the ssh.Conn, not ssh.Channel, as there is usually no way for the channel to fail without the connection itself being broken too. |
So my use case actually sees this alot, basically because it is forwarding a stream from a third party. In a diagram:
So the connection from P2 to P1 is fine, but when the connection from P3 to P2 dies, the channel that is forwarding over P2 to P1 ends up needing an idle timeout, because P3 is gone. And in the meantime P4 to P1 traffic can be fine, so the whole connection from P2 to P1 should be preserved. I appreciate the discussion, but the long and short is that I've got this all working on my branch now; i.e. per Channel timeouts work fine (for reads, never got writes of course). So you're welcome to pull from it as you like to add features. And we can probably close these tickets. |
@glycerine - to me, your diagram is not an argument for adding timeouts to the Channel but rather that forwarding code should detect TCP timeouts and result in closing the Channel that provides the forwarded stream. |
a) nobody wants to rely upon TCP timeouts; they aren't always available or reliable, moreover a unix domain socket won't have them. And logically it doesn't work because the timeout isn't going to be channel specific. If you did get a timeout, which of many channels should it apply to? No way to tell. You could be forwarding a foward and the (inductively) the drop was before you, possibly many steps back. b) The suggestion breaks the io.Writer abstraction. Imagine you are inside io.Copy: that would require that each Write() know the Read() that its source material came from. Ugh. Currently my Channels implement net.Conn, which I am happy about. |
I still don't get it. If P3 -> P2 is gone, then the copy goroutine that forwards over p2 -> p1 will error out, presumably leading to the forwarding channel being closed. Why does that not work as expected? |
Hmm... how does that error out happen without a timeout on the Channel? |
Sorry, misphrased. P2 is the remote server that runs some flavor of sshd. If the connection P3 -> P2 dies, then the server at P2 will notice this at some point. When that happens, it should close down the channel. What sshd is on the other side? |
I thought we were discussing is the converse... Can one specific Channel need a timeout while the underlying Connection is okay? I provided one specific example, but the case is more general, and its not necessarily that an SSH server is involved at all upstream. Channel timeouts handle more general "lack of response" as well as full connection drop somewhere. i.e. someone's goroutine is hung due to logic error, waiting on user input, deadlock, livelock, a long DNS resolve, etc. |
I guess we have an impedance mismatch. I'm trying to understand problems that you have that require changes to the library to be solved. Is there a reason you cannot use a wrapper around ssh.Channel to mimick a net.Conn? |
Several. I wouldn't have forked otherwise, of course. Notice the changes in xcryptossh for details, but in summary: one needs notification or a callback at the start of every Read, every Write, every packet receive, and during continuous operation. And more importantly, one must be able to interrupt the sync.Cond based blocks without closing the connection, e.g. buffer.go (https://github.com/glycerine/xcryptossh/blob/master/buffer.go#L69) and https://github.com/glycerine/xcryptossh/blob/master/channel.go#L504 ). |
I think you're making it too complicated. See https://play.golang.org/p/o_d3T-3irc for an idea. |
That's a deadline, not an idle timeout. Surely by now you've read the discussion above and at xcyptossh for why that's insufficient. Also refer to the title of this ticket. |
Do you want functionality that is like // Read until buf is full or timeout has passed, whichever comes first sorry for being dense, but I still haven't fully grasped what you need. |
(with the function implying that the channel stays alive if the timeout is hit.) |
So I'm going to close this out. It's been a useful and informative discussion. I'm happy with my https://github.com/glycerine/xcryptossh fork and continue to experiment with it, and it makes sense to me that x/crypto/ssh should stay simpler and with a fully backwards compatible API.
The current API looks like this https://github.com/glycerine/xcryptossh/blob/master/channel.go#L94 (lines 94-164). I'm still working on the application layer that uses it, so it may evolve further. Status() is rather new, and I've currently exposed the IdleTimers as a part of the API because I was thinking that was needed. However I'm refactoring at the moment to split the application into two layers, a lower layer that simply maintains ssh.Connections, reconnecting whenever needed, and an upper layer that only deals in ssh.Channel. |
It would be incredibly useful for securing connections everywhere if ssh.Channel implemented the full net.Conn interace. The only missing methods are for time-outs based on setting read and write deadlines.
In addition, presently when using ssh.Channel, it is common to leak Goroutines. Without deadlines for read and write, those blocking Read/Write calls on ssh.Channel will never let the goroutine return. I have found that ssh.Channel.Close() from a separate goroutine will unblock Reads, but this is quite cumbersome and would still require every piece of code currently using net.Conn to need much additional logic and elaborate coordination just to have simple read and write deadlines.
A related but distinct api request: #20288
I inquired of Han-wen and over email and he suggesting filing a bug, hence this ticket.
Emerson suggested the context.Context support might be nice as well, but that's not in the net.Conn interface. However once implemented, the net.Conn interface would make adding context support easier.
@hanwen @Gh0u1L5
The text was updated successfully, but these errors were encountered: