Fix server crashing on failed publish #201
@gkubisa The Travis build is failing. Did you run the CI check locally first? You can run it like this:
|
When I check the build details in Travis, it says the build failed only in "Node.js 0.1" - I don't believe anyone cares about that ancient version anymore. The build succeeds in all the more recent node versions.
@gkubisa Aha great point. Perhaps remove that old version from the Travis config then? Not sure if @nateps would approve of that, but seems like a reasonable thing to do. Also it's odd that CI passes for the old version in previous commits, but breaks after this change. I wonder what could be the issue...
I first fixed the So, I then updated .travis.yml to run tests on all supported nodejs versions only. The tests pass now. |
Aha great! Tricky thing it is with Mingo, totally not from your code.
The changes look good to me. Although, it would be amazing if it could be covered by a test case. Not sure if it's feasible to test this as it involves stopping Redis. Perhaps there's a way to simulate it.
Thanks @gkubisa for the PR and to both of you for updating the Node versions! Nate's starting to involve me in Derby/Share PRs, so you'll see me pop in on these more and more. Sorry for the delay on this - Rachael and I have time scheduled with Nate to go over these PRs monthly at the moment, and it happened a bit later than usual this month. Notes from the discussion:
To keep the ball rolling, I wrote up a short concrete proposal below. I'm not tied to any particulars if someone has better ideas!
Proposal for handling pub/sub errors:
|
@ericyhwang Excellent proposal! Looks good to me.
Thanks for the review @ericyhwang. I mostly agree with this proposal and already made Regarding emitting the To emit errors as above, this PR will need to be merged first and a new version of ShareDB published, so that |
Ah, good point about the other methods. Thankfully, it looks like calls to One benefit to putting error emitting in the base class: New adapter subclasses won't have to implement error emitting themselves. They just have to pass callbacks through, and it'll work. Though that doesn't necessarily help if a method gets overridden. Something else to discuss: Should all pub/sub errors get emitted, or only pub/sub errors when there's no callback? Since unhandled "error" type events can cause the process to exit, I'm leaning towards the latter, only emitting unhandled errors. |
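To make the "only emit unhandled errors" option concrete, here is a minimal sketch, not ShareDB's actual code. `SketchPubSub`, its `publish` signature, and the always-failing `_publish` are hypothetical stand-ins chosen for illustration; only the `_defaultCallback` name is borrowed from the diff in this PR.

```javascript
var EventEmitter = require('events').EventEmitter;
var util = require('util');

// Hypothetical base class for illustration; not ShareDB's real PubSub.
function SketchPubSub() {
  EventEmitter.call(this);
  var pubsub = this;
  // Created once and reused whenever the caller passes no callback.
  this._defaultCallback = function(err) {
    if (err) pubsub.emit('error', err);
  };
}
util.inherits(SketchPubSub, EventEmitter);

// Adapter hook. This stand-in always fails, as if the backend were down.
// It calls back synchronously only to keep the sketch short.
SketchPubSub.prototype._publish = function(channels, data, callback) {
  callback(new Error('publish failed'));
};

// Errors go to the caller's callback if there is one;
// otherwise they are emitted as 'error' events.
SketchPubSub.prototype.publish = function(channels, data, callback) {
  this._publish(channels, data, callback || this._defaultCallback);
};

var pubsub = new SketchPubSub();
var emitted = [];
var handled = [];
pubsub.on('error', function(err) {
  emitted.push(err.message);
});

// With a callback: the caller handles the error, nothing is emitted.
pubsub.publish(['channel'], {}, function(err) {
  handled.push(err.message);
});
// Without a callback: the error is emitted instead.
pubsub.publish(['channel'], {});
```

Under this sketch an adapter subclass only has to pass the callback through from `_publish`. An application that never calls `publish` with a callback should still register an 'error' listener, because an 'error' event with no listener makes Node throw.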
Looking forward to seeing this merged :) Tangentially related to share/sharedb-redis-pubsub#4 (comment)
@ericyhwang Great point about emitting the errors in the base class - I've just implemented this approach. Since only I believe this PR is ready for merging now. |
This LGTM, and thanks for updating/adding tests!
It looks like GitHub will let me merge this, but I'd still like Nate to at least glance over it first, if possible.
@nateps - The pubsub/index.js change is a short 20-something line diff, can you take a look and approve in the next couple days?
test/pubsub-memory.js
Outdated
it('emits an error if _publish is unimplemented and callback is not provided', function(done) {
var pubsub = new PubSub();
pubsub.on('error', function(err) {
expect(err).an(Error);
Share doesn't have official code style guidelines that I can find, but everything uses 2-space indents, so let's match that for consistency.
test/pubsub-memory.js
Outdated
});

it('can emit events', function(done) {
var pubsub = new PubSub();
Same, 2-space indents here too.
@@ -23,22 +32,29 @@ PubSub.prototype.close = function(callback) {
 map[id].destroy();
 }
 }
- if (callback) callback();
+ if (callback) process.nextTick(callback);
👍
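For readers less familiar with `process.nextTick`, a small sketch of what this change does to callback timing. The `closeSync` and `closeAsync` helpers are hypothetical stand-ins for the old and new versions of `close`, not code from the PR.

```javascript
var order = [];

// Old behaviour: the callback runs before close() returns.
function closeSync(callback) {
  if (callback) callback();
}

// New behaviour: the callback is deferred until the current call stack unwinds.
function closeAsync(callback) {
  if (callback) process.nextTick(callback);
}

closeSync(function() {
  order.push('sync callback');
});
order.push('after closeSync');

closeAsync(function() {
  order.push('async callback');
});
order.push('after closeAsync');

// At this point the deferred callback has not run yet;
// it runs once this script's synchronous code has finished.
```

With the deferred version the callback always fires after `close` has returned, which matches how a callback from a real network-backed adapter would behave.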
In response to @curran about Travis config ("Perhaps remove that old version from the Travis config then? Not sure if @nateps would approve of that"):
For future reference and possibly linking in contributing file, the list of active major Node versions appears to be: https://github.com/nodejs/Release#release-schedule. According to that, it is 4, 6, 8, and 9, and as of just now, 10.
Great PR! Thanks for the rapid turnaround on the comments and thanks for implementing the solution so well.
I had one small style comment, and I'll take care of that update real quick then merge.
@@ -13,8 +16,14 @@ function PubSub(options) {
 // isn't complete until the callback returns from Redis
 // Maps channel -> true
 this.subscribed = {};

+ var self = this;
As a style convention, the use of self as a variable totally works and that is what I use when there is not a better name to use. (You'll see this in sharedb tests even)
However, in sharedb and other repos, we do prefer to use the instance variable name (pubSub would be the instance name of the class PubSub) rather than self. I believe this makes the code and stack traces more specific, easier to understand, and easier to grep / search.
Makes sense, I'll keep that in mind in my future contributions to the project. 👍
@@ -13,8 +16,14 @@ function PubSub(options) {
 // isn't complete until the callback returns from Redis
 // Maps channel -> true
 this.subscribed = {};

+ var self = this;
+ this._defaultCallback = function(err) {
I'm impressed you repeated this pattern from QueryEmitter! Great job. 🎉
For future people reading this PR, the pattern means that we only have to create this callback function and the closure of the pubSub reference one time. The alternative would have been to create an anonymous function every time we publish. This is an ideal application of the same optimization, and it is a good use of consistency in naming and code patterns.
- this._unsubscribe(channel, function(err) {
- if (err) throw err;
- });
+ this._unsubscribe(channel, this._defaultCallback);
Thanks for spotting this inconsistency and cleaning it up!
This PR fixes the issue where the entire server crashes when ShareDB fails to publish a message to its PubSub system.
The easiest way to reproduce the problem is to use Redis for PubSub (with node-redis configured to try to restore the connection indefinitely), stop Redis and then send some operations to ShareDB server. ShareDB will then try to publish a message to Redis and crash the server.
The fix is to simply add an empty callback to the publish function. This works fine as ShareDB can gracefully recover from losing those messages by reading the missing document revisions from the database (e.g. MongoDB). For the same reason I don't think it is necessary to propagate the publishing error up, although it certainly is a possibility.
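A rough sketch of why an empty callback avoids the crash, assuming the crash comes from a callback that rethrows the error (the removed `_unsubscribe` lines in this PR's diff follow that pattern). `failingPublish`, `throwingCallback`, and `emptyCallback` are hypothetical names; the real adapter and ShareDB internals are not shown.

```javascript
// Hypothetical stand-in for a pub/sub adapter whose backend is unreachable.
// It calls back synchronously only so the sketch can observe the throw.
function failingPublish(channels, data, callback) {
  callback(new Error('publish failed: backend unreachable'));
}

// Rethrowing callback: any publish error becomes an exception.
function throwingCallback(err) {
  if (err) throw err;
}

// The fix described above: accept the error and do nothing with it.
function emptyCallback() {}

var crashed = false;
try {
  failingPublish(['doc'], {v: 1}, throwingCallback);
} catch (err) {
  // This try/catch only works because the stand-in calls back synchronously.
  // With a real asynchronous client the throw would be an uncaught
  // exception and would end the process.
  crashed = true;
}

var recovered = false;
failingPublish(['doc'], {v: 1}, emptyCallback);
recovered = true; // reached because the error was ignored rather than thrown
```

Ignoring the error is only safe because, as noted above, subscribers can catch up by reading the missing revisions from the database.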