
Library unsafe to use without manual locking #83

Closed
stevenh opened this issue Oct 8, 2015 · 8 comments

Comments


stevenh commented Oct 8, 2015

While NextReader and NextWriter state that "The Next(Reader|Writer) method and the (readers|writers) returned from the method cannot be accessed by more than one goroutine at a time", other more commonly used methods such as WriteMessage carry no such warning, so it's easy to fall into the trap of using this library in a way that randomly breaks.

Ideally the library should be updated not to have this issue, which from reading through the code all stems from the internal buffering.

In the meantime, making the documentation for each read/write method very explicit that it is not safe for concurrent use would save others from the pain we went through: debugging an issue in our code for many hours, only to find that WriteMessage is unsafe without external serialisation locking.


garyburd commented Oct 9, 2015

The documentation states that concurrent calls to WriteMessage are not supported.

Because the websocket protocol does not support interleaved messages, concurrent write support will require message buffering, blocking or forced finalization of messages written by other goroutines. Any of these choices will result in situations that are difficult to debug.

A benefit of the current design is that application errors are found easily with the race detector.


stevenh commented Oct 14, 2015

We never saw that bit; yes, it's there, but we were working off the API docs. While the restriction is mentioned in NextWriter, WriteMessage doesn't mention it, which is where all our issues stemmed from.

If all relevant methods cross referenced that section then it would be much clearer e.g.
https://godoc.org/github.com/gorilla/websocket#Conn.WriteMessage

It's very easy to get caught out even with simple setups, such as a single goroutine writer plus a pinger to keep everything alive.

While debugging our issue I found situations in the internals where the write mutex is totally defeated resulting in a very nasty mess that wasn't found by the race detector :(


stevenh commented Oct 14, 2015

For reference, our issue was that the internal code was corrupting seemingly individual, non-concurrent WriteMessage calls. The reason is that receives from buffered channels (c.mu) don't guarantee FIFO order, and hence messages were not being fully processed in order of send.

This resulted in the code trying to send messages from a buffer that had been overwritten by subsequent messages. This was due to the WriteMessage concurrency, but it was very hard to spot and non-obvious, because WriteMessage is a single call and in our case it only triggered when a "ping" happened to coincide with data being sent.

garyburd (Contributor) commented

The channel c.mu is used to coordinate concurrent control frame writers and a single message writer. The channel is not designed to coordinate concurrent message writers.

Are you saying that c.mu failed to protect a concurrent call to WriteMessage and WriteControl(websocket.PingMessage, data, deadline)?

It's interesting to hear that the race detector was not helpful. I plan to test the race detector myself. If the race detector is not helpful, then I'll consider the blocking option.


stevenh commented Oct 14, 2015

That's unclear at the moment, as we were using WriteMessage for websocket.PingMessage, as per the examples.


stevenh commented Oct 14, 2015

One thing that is clear, however, is that the use of c.mu as a gate into c.conn.Write creates potential starvation: c.mu is buffered, and buffered channels are not fair to consumers, meaning a control message could block until its timeout when it could easily have been sent.

We saw this unfair behaviour with competing WriteMessage calls, where pending messages were ignored and new messages sent in preference, reordering messages by as much as 20 places.

Unbuffered channels, in contrast, are fair.

garyburd (Contributor) commented

Starvation can occur in cases where the application is close to saturating the network. Please submit a separate issue if this is a concern for your application. An immediate workaround is to coordinate writes through a single goroutine.

Concurrent calls to WriteMessage will corrupt connection state. You may be observing the result of this corruption rather than starvation.

> Unbuffered channels in contrast are fair.

Unbuffered channels cannot be used to implement a mutex.

garyburd (Contributor) commented

The documentation in this package follows the common Go practice of documenting where concurrency is allowed. The default assumption in Go is that concurrency is not allowed.

I wrote a couple of tests and found that the race detector does detect unsynchronized calls to the write methods. I was concerned that the internal synchronization for frames might mask races, but it does not.

I have no plans to add locking to the methods used for writing messages (SetWriteDeadline, NextWriter, WriteMessage, WriteJSON). Applications typically call more than one of these methods to write a message:

  • SetWriteDeadline, WriteMessage
  • SetWriteDeadline, NextWriter, Write, Write, ..., Close

Because the application should already be doing something to synchronize the sequence of method calls, there's not much value in adding locking to the individual methods. The additional locking has the drawback that it will mask errors that are currently found by the race detector.

@gorilla gorilla locked and limited conversation to collaborators Feb 14, 2018