Prevent blocking forever when transport channel fails to open #11875

Merged
merged 15 commits into from
Apr 20, 2022
Changes from 3 commits
4 changes: 3 additions & 1 deletion api/utils/sshutils/conn.go
@@ -59,10 +59,12 @@ func ConnectProxyTransport(sconn ssh.Conn, req *DialReq, exclusive bool) (*ChCon

	channel, discard, err := sconn.OpenChannel(constants.ChanTransport, nil)
	if err != nil {
		ssh.DiscardRequests(discard)
		return nil, false, trace.Wrap(err)
	}

	// DiscardRequests will return when the channel or underlying connection is closed.
	go ssh.DiscardRequests(discard)

	// Send a special SSH out-of-band request called "teleport-transport"
	// the agent on the other side will create a new TCP/IP connection to
	// 'addr' on its network and will start proxying that connection over
172 changes: 172 additions & 0 deletions api/utils/sshutils/conn_test.go
@@ -0,0 +1,172 @@
/*
Copyright 2022 Gravitational, Inc.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package sshutils

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"encoding/pem"
	"net"
	"sync"
	"testing"
	"time"

	"github.com/gravitational/teleport/api/constants"
	"github.com/stretchr/testify/require"
	"golang.org/x/crypto/ssh"
)

type server struct {
	listener net.Listener
	config   *ssh.ServerConfig
	handler  func(*ssh.ServerConn)
	t        *testing.T
Contributor
nit: this may be personal preference, but I'd remove t from here and pass it as an argument to each function that needs it. The code is easier to follow when you can see which functions need t. If you'd rather keep it here, I don't mind.

	mu       sync.RWMutex
Contributor
Currently it's hard to tell what this mutex protects. From what I can see, it guards the closed flag. Could you rename it, for example to muClosed, to make that clear?

	closed   bool

	cSigner ssh.Signer
	hSigner ssh.Signer
}

func (s *server) Run() {
	for {
		conn, err := s.listener.Accept()

		s.mu.RLock()
		if s.closed {
			s.mu.RUnlock()
			return
		}
		s.mu.RUnlock()
Contributor
Do we really need the closed flag here? After you close the listener, you will get an EOF or a connection-closed error, which you can use to detect that the listener has been closed. Example:

if utils.IsOKNetworkError(err) {


		require.NoError(s.t, err)

		go func() {
			defer conn.Close()
			sconn, _, _, err := ssh.NewServerConn(conn, s.config)
			require.NoError(s.t, err)
Contributor
Using require within a goroutine is a bad idea. If the assertion fails, it calls t.FailNow, which according to the docs must not be called from spawned goroutines:

FailNow marks the function as having failed and stops its execution by calling runtime.Goexit (which then runs all deferred calls in the current goroutine). Execution will continue at the next test or benchmark. FailNow must be called from the goroutine running the test or benchmark function, not from other goroutines created during the test. Calling FailNow does not stop those other goroutines.
https://pkg.go.dev/testing#T.FailNow

Contributor Author
Whoops, didn't realize this. Updated in a59074c.
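
One common way to address this, sketched below, is to have the spawned goroutine report its error on a channel and let the test goroutine do the asserting. This is an illustration only and not necessarily what commit a59074c does; handshake and startWorker are hypothetical names, with handshake standing in for a call such as ssh.NewServerConn.

```go
package main

import (
	"errors"
	"fmt"
)

// handshake is a hypothetical stand-in for work done in the goroutine.
func handshake(fail bool) error {
	if fail {
		return errors.New("handshake failed")
	}
	return nil
}

// startWorker runs handshake in its own goroutine. Instead of asserting
// there, it sends the result on a buffered channel, so the goroutine can
// exit even if nobody receives the value.
func startWorker(fail bool) <-chan error {
	errC := make(chan error, 1)
	go func() {
		errC <- handshake(fail)
	}()
	return errC
}

func main() {
	// In a real test, the test goroutine would call something like
	// require.NoError(t, <-errC) here, where FailNow is safe to use.
	fmt.Println(<-startWorker(false)) // prints <nil>
	fmt.Println(<-startWorker(true))  // prints handshake failed
}
```

Non-fatal reporting methods such as t.Error and t.Errorf are documented as safe to call from multiple goroutines, so they are another option when the goroutine only needs to record a failure.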

			s.handler(sconn)
		}()
	}
}

func (s *server) Stop() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.closed = true
	return s.listener.Close()
}

func generateSigner(t *testing.T) ssh.Signer {
	private, err := rsa.GenerateKey(rand.Reader, 2048)
	require.NoError(t, err)

	block := &pem.Block{
		Type:  "RSA PRIVATE KEY",
		Bytes: x509.MarshalPKCS1PrivateKey(private),
	}

	privatePEM := pem.EncodeToMemory(block)
	signer, err := ssh.ParsePrivateKey(privatePEM)
	require.NoError(t, err)

	return signer
}
Comment on lines +65 to +79
Contributor
We have a helper for that already

func GenerateKeyPair(passphrase string) ([]byte, []byte, error) {

Contributor Author
I don't think we import anything from /lib into /api, so I'm leaving this as is.


func (s *server) GetClient() (ssh.Conn, <-chan ssh.NewChannel, <-chan *ssh.Request) {
	conn, err := net.Dial("tcp", s.listener.Addr().String())
	require.NoError(s.t, err)

	sconn, nc, r, err := ssh.NewClientConn(conn, "", &ssh.ClientConfig{
		Auth:            []ssh.AuthMethod{ssh.PublicKeys(s.cSigner)},
		HostKeyCallback: ssh.FixedHostKey(s.hSigner.PublicKey()),
	})
	require.NoError(s.t, err)

	return sconn, nc, r
}

func newServer(t *testing.T, handler func(*ssh.ServerConn)) *server {
	listener, err := net.Listen("tcp", "localhost:0")
	require.NoError(t, err)

	cSigner := generateSigner(t)
	hSigner := generateSigner(t)

	config := &ssh.ServerConfig{
		NoClientAuth: true,
	}
	config.AddHostKey(hSigner)

	return &server{
		listener: listener,
		config:   config,
		handler:  handler,
		t:        t,
		cSigner:  cSigner,
		hSigner:  hSigner,
	}
}

// TestTransportError ensures ConnectProxyTransport does not block forever
// when an error occurs while opening the transport channel.
func TestTransportError(t *testing.T) {
	errC := make(chan error)

	server := newServer(t, func(sconn *ssh.ServerConn) {
		_, _, err := ConnectProxyTransport(sconn, &DialReq{
			Address: "test", ServerID: "test",
		}, false)
		errC <- err
	})

	go server.Run()
	defer server.Stop()
Contributor
Suggested change
defer server.Stop()
t.Cleanup(func() { require.NoError(t, server.Stop()) })


	sconn, nc, _ := server.GetClient()
	defer sconn.Close()
Contributor
Suggested change
defer sconn.Close()
t.Cleanup(func() { require.NoError(t, sconn.Close()) })

	channel := <-nc
	require.Equal(t, channel.ChannelType(), constants.ChanTransport)

	sconn.Close()
	err := timeoutErrC(t, errC, time.Second*5)
	require.Error(t, err)

	sconn, nc, _ = server.GetClient()
	defer sconn.Close()
Contributor
Suggested change
defer sconn.Close()
t.Cleanup(func() { require.NoError(t, sconn.Close()) })

	channel = <-nc
	require.Equal(t, channel.ChannelType(), constants.ChanTransport)

	err = channel.Reject(ssh.ConnectionFailed, "test reject")
	require.NoError(t, err)

	err = timeoutErrC(t, errC, time.Second*5)
	require.Error(t, err)
}

func timeoutErrC(t *testing.T, errC <-chan error, d time.Duration) error {
	timeout := time.NewTimer(d)
	select {
	case err := <-errC:
		return err
	case <-timeout.C:
		require.FailNow(t, "failed to receive on err channel in time")
	}

	return nil
}