-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
client: set TCP_USER_TIMEOUT socket option for linux #2307
Conversation
See gRPC proposal A18.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. This approach does seem pretty reasonable to me. One comment in-line.
internal/transport/http2_client.go
Outdated
@@ -252,6 +253,9 @@ func newHTTP2Client(connectCtx, ctx context.Context, addr TargetInfo, opts Conne | |||
t.channelzID = channelz.RegisterNormalSocket(t, opts.ChannelzParentID, "") | |||
} | |||
if t.kp.Time != infinity { | |||
if err = syscall.SetTCPUserTimeout(t.conn, kp.Timeout); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will need to be done before handshaking ~line 170, otherwise you won't have a TCPConn anymore.
Also, you need to only do this for TCP connections. The dialer could return a net.Conn backed by a UDS instead.
SetTCPUserTimeout is predicated upon keepaliveEnabled. SetTCPUserTimeout requires access to the underlying TCPConn and must be called before handshaking.
Sorry for the delay here. This implementation LGTM; could you please rebase and resolve any conflicts, and then see if there's a reasonable way to test this? Maybe something in internal/transport/transport_test.go that creates a transport and then reads the socket options back to make sure it has been set? |
net.TCPConn.File() was causing tests to time out. The docs indicate File() returns a different file descriptor than the connection's.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some lint errors showing in in CI:
internal/syscall/syscall_linux.go:36:5: error var GetTCPUserTimeoutNoopError should have name of the form ErrFoo
internal/syscall/syscall_linux.go:36:5: exported var GetTCPUserTimeoutNoopError should have comment or be unexported
And this fails in 1.8, which we will be supporting for another 2 weeks:
internal/syscall/syscall_linux.go:82: tcpconn.SyscallConn undefined (type *net.TCPConn has no field or method SyscallConn)
internal/syscall/syscall_linux.go:103: tcpconn.SyscallConn undefined (type *net.TCPConn has no field or method SyscallConn)
If you want to just wait 2 weeks for this, then we will delete 1.8 from CI and this won't need to be fixed. Otherwise, we'll have to deal with more version-specific files.
internal/transport/transport_test.go
Outdated
}, | ||
} | ||
for _, tt := range tests { | ||
lis, err := net.Listen("tcp", "localhost:0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should only need to make the listener once, right (i.e. outside the loop)?
You also never call lis.Accept()
, which seems like it should be necessary. You would also need to read from the conn
on the server side, because the client writes its 24-byte preface before returning (we think this is the cause of a separate test flake we're having).
A negative socket option int should never occur in actual operation. Returning a nil error is simpler and more consistent with existing functions in syscall.
No rush on this, so let's wait the 2 weeks for 1.8 support to be dropped. I addressed the linter issues and (I hope) the lack of |
Unsure if the recent CI failure in 1.11 is due to my changes or a pre-existing flake. |
@mikeraimondi the 1.11 failure is a known flake (#2380 should fix it). This recent set of changes LGTM. Hopefully there won't be many conflicts to resolve in 2 weeks when this is good to submit. Thanks again for your contribution! |
@mikeraimondi - Go 1.6 and 1.8 have been removed from our CI. Please rebase your branch to master, and I think everything should be good to go. Thanks! |
@dfawley all set! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks!!
Implements proposal A18. Not yet manually tested.
gRPC Core issue for reference: grpc/grpc#15889