Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle WebSocket handshake timeouts #422

Merged
merged 16 commits into from
Aug 13, 2020

Conversation

shilder
Copy link
Contributor

@shilder shilder commented Oct 14, 2018

Added new option handshake-timeout for websocket client

Added new option handshake-timeout for websocket client
@shilder
Copy link
Contributor Author

shilder commented Oct 14, 2018

Related to #346 More like request for discussion. Some opened questions

  1. Is it ok to use somewhat private method (.in time/*clock*) to schedule timeout handler execution ?
  2. There are already couple of specific Exception classes for different timeouts (like ReadTimeoutException and so on). Should i add another class ? Something like WebSocketHandshakeTimeoutException ?

@kachayev
Copy link
Collaborator

@shilder Regarding WebSocketHandshakeTimeoutException: 👍. It's better to keep all timeouts separate when possible, so the user can distinguish them somehow later. Subclassing works just fine for that.

@@ -609,6 +614,16 @@
([_ ctx]
(let [ch (.channel ctx)]
(reset! in (netty/buffered-source ch (constantly 1) 16))

;; Start handshake timeout timer
(reset! timeout-task
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to figure out why you used the .in accessor here, and realized that manifold.time/in doesn't allow it to be cancelled. That's my bad, I'm going to merge these changes in and then fix the underlying method so we don't have to reach under the covers.

@shilder shilder force-pushed the ws-handshake-timeout branch from 3f940ab to 075960b Compare October 31, 2018 08:48
@ztellman
Copy link
Collaborator

ztellman commented Feb 1, 2019

@kachayev it seems this no longer merges after I brought in the ping functionality, since you have more familiarity with that would you mind getting it into order?

@kachayev
Copy link
Collaborator

kachayev commented Feb 1, 2019

@ztellman Sure, I'll take care.

(fn []
;; if there was no answer after handshake was sent - close channel
(d/error! d (WebSocketHandshakeTimeoutException. "WebSocket handshake timeout"))
(netty/close ctx))))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not on I/O thread here, so we should probably close the channel, not the context. Right? @ztellman


;; Start handshake timeout timer
(reset! timeout-task
(.in time/*clock*
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One another thing... Maybe it's better to use EventLoop executor to schedule the task (ctx.executor().schedule) instead of manifold's clock? We can also just trigger user event from the runnable and check if the handshake was complete in :user-triggered. It seems like this approach would leave fewer spaces for race conditions. What do you think? @shilder @ztellman

I still wonder why this functionality is not implemented in Netty, e.g. SslHandler supports handshake timeout as a parameter. I'm going to open an issue to start a discussion there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One another thing... Maybe it's better to use EventLoop executor to schedule the task (ctx.executor().schedule) instead of manifold's clock

Yes, we can use EventExecutor and Netty is using it internally in IdleStateHandler to schedule timeout handlers.

And about :user-triggered - yes, it will work, but I'm not sure how it will help with race conditions ? AFAIU Future#cancel provides synchronization point. Or maybe you are talking about situation where context is closed by timeout handler while another thread awaits on Future#cancel ? I'm not a Netty expert, so I'm not sure how it handles serialization of events for context

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only for IdleTimeoutHandler but for all timeout tasks. Future canceling is what I want to avoid here specifically because of a synchronization: it might hurt performance when you introduce sync primitives to the pipeline running on I/O thread.

If you don’t mind, I can update your PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don’t mind, I can update your PR.

Sure, go ahead

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On it!

@kachayev
Copy link
Collaborator

kachayev commented Feb 3, 2019

The more I work with this functionality, the more I wonder why don't we use Netty's WebSocket*ProtocolHandler to deal with handshakes? It doesn't support timeouts out of the box (at least for now). Apart from that, it's a well-tested implementation that we can leverage injecting our own handler & realizing deferred with newly created netty/sink on HANDSHAKE_COMPLETE user event triggered by Netty's handler. @ztellman WDYT? Any objections?

@kachayev
Copy link
Collaborator

kachayev commented Feb 5, 2019

@shilder I've reimplemented timeouts, please double check on your side. Notable changes:

  1. Avoid .cancel sync on I/O threads
  2. Schedule timeout on I/O threads
  3. Schedule timeout only when the handshake was effectively flushed, not before
  4. Cancel timeout when processing is done, avoid 2 timeouts

I've also added a test case to check that the solution works. Should we do the same for a server handler?

src/aleph/http/client.clj Outdated Show resolved Hide resolved
@kachayev
Copy link
Collaborator

kachayev commented Feb 6, 2019

From what I can see, there's no necessary to introduce the same for server's handshake, as it's just about flushing response to the client.

@ztellman Now ready for review/merge.

@kachayev
Copy link
Collaborator

@ztellman Merged with the latest changes from master, so now it's mergeable.

@shilder I've also moved timeout cancelation to d/finally' instead of d/chain'. Does it make sense?

@ztellman
Copy link
Collaborator

This has conflicts with the other WS changes I merged, but will merge once those are resolved.

@kachayev
Copy link
Collaborator

@ztellman Done!

@kachayev
Copy link
Collaborator

@ztellman Fixed conflicts here.

@kachayev kachayev added the 1.0 label Aug 11, 2020
@kachayev
Copy link
Collaborator

I'm going to rebase to 1.0.0 to merge with latest changes there. Substantial portion of the code is intertwined with websockets pipeline changes we've made so far.

@kachayev kachayev changed the base branch from master to 1.0.0 August 13, 2020 08:18
@kachayev
Copy link
Collaborator

As of now, merging into 1.0.0 branch to accumulate pre-release changes there.

@kachayev kachayev merged commit 504146c into clj-commons:1.0.0 Aug 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants