Error: This socket has been ended by the other party #192

Closed
gx0r opened this issue Nov 28, 2015 · 42 comments

Comments

@gx0r commented Nov 28, 2015

Saw this a couple of times today and yesterday; I am not sure why though. This is on 2.2.4.

Error: This socket has been ended by the other party
     at Socket.writeAfterFIN [as write] (net.js:268:12)
     at /home/user/app/node_modules/rethinkdbdash/lib/connection.js:582:21
     at Object.tryCatch (/home/user/app/node_modules/rethinkdbdash/lib/helper.js:167:3)
     at Connection._send (/home/user/app/node_modules/rethinkdbdash/lib/connection.js:581:10)
     at /home/user/app/node_modules/rethinkdbdash/lib/term.js:199:22
     at tryCatcher (/home/user/app/node_modules/bluebird/js/release/util.js:11:23)
     at Promise._settlePromiseFromHandler (/home/user/app/node_modules/bluebird/js/release/promise.js:488:31)
     at Promise._settlePromise (/home/user/app/node_modules/bluebird/js/release/promise.js:545:18)
     at Promise._settlePromiseCtx (/home/user/app/node_modules/bluebird/js/release/promise.js:582:10)
     at Async._drainQueue (/home/user/app/node_modules/bluebird/js/release/async.js:130:12)
     at Async._drainQueues (/home/user/app/node_modules/bluebird/js/release/async.js:135:10)
     at Immediate.Async.drainQueues [as _onImmediate] (/home/user/app/node_modules/bluebird/js/release/async.js:16:14)
     at processImmediate [as _immediateCallback] (timers.js:383:17)
@neumino (Owner) commented Nov 29, 2015

@llambda -- does it crash? Or is that the error returned by the promise?

@neumino (Owner) commented Nov 29, 2015

Actually, if I read the stacktrace properly, your query is rejected.

I guess this can happen if something like this happens:

  • Receive a FIN packet.
  • Send a query before the socket has the chance to emit end.

@llambda -- how often does it happen? Do you have an unreliable connection between your server and the database?
I've tracked down the end event, and as far as I can tell, the connection is properly removed from the pool, so I have a hard time figuring out a case where this can happen often.
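For illustration, here is a minimal standalone sketch (not rethinkdbdash code) of the failure mode described above: writing to a socket after the peer has already sent a FIN. On the Node versions in use at the time (~4/5) this surfaces as the exact "This socket has been ended by the other party" error; newer Node versions may report it differently.

'use strict';
const net = require('net');

// Server half-closes the connection immediately, so the client receives a FIN.
const server = net.createServer(function(socket) {
  socket.end();
});

server.listen(0, function() {
  const client = net.connect(server.address().port);
  client.resume(); // consume the readable side so the FIN/EOF is actually read

  client.on('error', function(err) {
    console.log('socket error:', err.message);
  });

  client.on('end', function() {
    // By the time 'end' fires, net.js has already swapped write() for
    // writeAfterFIN -- a "query" sent in this window is rejected.
    client.write('late query', function(err) {
      if (err) console.log('write error:', err.message);
      client.destroy();
      server.close();
    });
  });
});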

@gx0r (Author) commented Nov 29, 2015

Did not crash. It was a local node web server hitting a local RethinkDB on my Linux desktop. Once it happened, it seemed like no queries resolved until I restarted the web app. It happened, I think, twice in the last few days, and seemed to be after returning to the computer after being away for a while. I don't have any power management on except screen blanking, so I'm not sure whether that is related.

@gx0r (Author) commented Dec 10, 2015

I have another stack trace with the latest version of rethinkdbdash. The process uptime was 75405 seconds on a production box, so no sleep or power management involved.

It seems to be occurring in a Promise rejection, as it is handled and logged for me by a bluebird unhandled-rejection handler, and therefore the process did not crash.

I am not sure whether it automatically recovers or not, but I am pretty sure that it gets into a state where no DB queries return results until the web server process is restarted. I'm not sure because it was on one server out of three, and I was upgrading Node on them, so I ended up restarting the web server processes and never noticed the issue had occurred until I was reviewing the logs today. (The servers are load balanced, so if a problem occurs on one, I don't normally notice it unless, I suppose, they all go down.) I'll keep an eye on it and try to troubleshoot further.

I don't know if it would be helpful here to log the host of the RethinkDB connection. I have 3 computers in the cluster, each running a RethinkDB instance and a node web server, so the RethinkDB connection here could potentially be to a different host than the one the web server is running on.

Error: This socket has been ended by the other party
    at Socket.writeAfterFIN [as write] (net.js:268:12)
    at /opt/app/node_modules/rethinkdbdash/lib/connection.js:587:21
    at Object.tryCatch (/opt/app/node_modules/rethinkdbdash/lib/helper.js:167:3)
    at Connection._send (/opt/app/node_modules/rethinkdbdash/lib/connection.js:586:10)
    at /opt/app/node_modules/rethinkdbdash/lib/term.js:202:22
    at tryCatcher (/opt/app/node_modules/bluebird/js/release/util.js:11:23)
    at Promise._settlePromiseFromHandler (/opt/app/node_modules/bluebird/js/release/promise.js:489:31)
    at Promise._settlePromise (/opt/app/node_modules/bluebird/js/release/promise.js:546:18)
    at Promise._settlePromiseCtx (/opt/app/node_modules/bluebird/js/release/promise.js:583:10)
    at Async._drainQueue (/opt/app/node_modules/bluebird/js/release/async.js:134:12)
    at Async._drainQueues (/opt/app/node_modules/bluebird/js/release/async.js:139:10)
    at Immediate.Async.drainQueues [as _onImmediate] (/opt/app/node_modules/bluebird/js/release/async.js:16:14)
    at processImmediate [as _immediateCallback] (timers.js:383:17)

@neumino (Owner) commented Dec 11, 2015

Hmm, I still can't figure out how this could happen, but I can add one more check in connection._send to not send anything if the socket has already emitted end or close.
I'll try to do that tomorrow evening.
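A rough sketch of that kind of guard, using a toy wrapper rather than the actual rethinkdbdash Connection internals (all names here are illustrative):

'use strict';
const net = require('net');

// Toy connection wrapper: remember whether the socket already emitted
// 'end' or 'close', and reject new writes instead of hitting writeAfterFIN.
class GuardedConnection {
  constructor(socket) {
    this.socket = socket;
    this.open = true;
    socket.once('end', () => { this.open = false; });
    socket.once('close', () => { this.open = false; });
  }

  send(data) {
    return new Promise((resolve, reject) => {
      if (!this.open) {
        // Surface a driver-style error instead of a low-level socket error.
        return reject(new Error('The connection was closed by the other party.'));
      }
      this.socket.write(data, (err) => (err ? reject(err) : resolve()));
    });
  }
}

// Example usage (28015 is RethinkDB's default driver port):
// const conn = new GuardedConnection(net.connect(28015, 'localhost'));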

neumino added a commit that referenced this issue Dec 11, 2015
@neumino (Owner) commented Dec 11, 2015

I just pushed 2.2.10, which should catch this error earlier, but I believe it doesn't address the underlying issue. My guess is that we attempt to write to a TCP socket after it's closed but before it has emitted end or close.

@llambda -- when it happens again, can you provide the stacktrace? Thanks!

neumino closed this as completed Dec 11, 2015
@gx0r (Author) commented Dec 13, 2015

Hey @neumino, I got 5 of the following logged this afternoon, all on just one server.
Times were 12:35 PM, 12:57 PM, and three times at 1:58 PM.

TypeError: Cannot read property 'reject' of undefined
    at Connection._send (/opt/app/node_modules/rethinkdbdash/lib/connection.js:561:25)
    at /opt/app/node_modules/rethinkdbdash/lib/term.js:202:22
    at tryCatcher (/opt/app/node_modules/bluebird/js/release/util.js:11:23)
    at Promise._settlePromiseFromHandler (/opt/app/node_modules/bluebird/js/release/promise.js:489:31)
    at Promise._settlePromise (/opt/app/node_modules/bluebird/js/release/promise.js:546:18)
    at Promise._settlePromiseCtx (/opt/app/node_modules/bluebird/js/release/promise.js:583:10)
    at Async._drainQueue (/opt/app/node_modules/bluebird/js/release/async.js:134:12)
    at Async._drainQueues (/opt/app/node_modules/bluebird/js/release/async.js:139:10)
    at Immediate.Async.drainQueues [as _onImmediate] (/opt/app/node_modules/bluebird/js/release/async.js:16:14)
    at processImmediate [as _immediateCallback] (timers.js:384:17)

neumino added a commit that referenced this issue Dec 13, 2015
@neumino (Owner) commented Dec 13, 2015

@llambda -- My bad, just pushed a fix in 2.2.11 for that.

@gx0r (Author) commented Dec 17, 2015

So far, I haven't seen any more unhandled promise rejections on 2.2.11.

I do see the following, repeated twice, at the end of the stderr of the web server that had the original rejections. However, since it appears to have just been logged to stderr, there is no date information, so I am not sure when it happened. The app hasn't crashed or stopped working, so I am guessing this is just normal logging for something the connection pool now handles?

ReqlDriverError: The connection was closed by the other party.
    at Connection._send (/opt/app/node_modules/rethinkdbdash/lib/connection.js:559:15)
    at /opt/app/node_modules/rethinkdbdash/lib/term.js:202:22
    at tryCatcher (/opt/app/node_modules/rethinkdbdash/node_modules/bluebird/js/release/util.js:11:23)
    at Promise._settlePromiseFromHandler (/opt/app/node_modules/rethinkdbdash/node_modules/bluebird/js/release/promise.js:489:31)
    at Promise._settlePromise (/opt/app/node_modules/rethinkdbdash/node_modules/bluebird/js/release/promise.js:546:18)
    at Promise._settlePromiseCtx (/opt/app/node_modules/rethinkdbdash/node_modules/bluebird/js/release/promise.js:583:10)
    at Async._drainQueue (/opt/app/node_modules/rethinkdbdash/node_modules/bluebird/js/release/async.js:134:12)
    at Async._drainQueues (/opt/app/node_modules/rethinkdbdash/node_modules/bluebird/js/release/async.js:139:10)
    at Immediate.Async.drainQueues [as _onImmediate] (/opt/app/node_modules/rethinkdbdash/node_modules/bluebird/js/release/async.js:16:14)
    at processImmediate [as _immediateCallback] (timers.js:384:17)

@neumino (Owner) commented Dec 17, 2015

Hmm, this seems to mean that the connection was closed (rethinkdbdash properly closed it) but somehow it's still being used to run a query.

@llambda -- do you use the connection pool to run queries? Or do you open connections yourself?

@gx0r (Author) commented Dec 17, 2015

I use the connection pool. Come to think of it, those 2 connection messages could have come at process shutdown, since right before them there was a "Draining the pool connected to host:28015".

@neumino (Owner) commented Dec 17, 2015

Ok thanks, I have a rough idea of what's happening. I probably won't have time in the next few days since I'm moving, but I should be able to take a look after that.

@mbroadst (Contributor) commented Jan 9, 2016

@neumino this is happening for me as well. It's a pretty big issue, as there is (or at least was) an expectation that rethinkdbdash will silently handle reconnection (read: we don't have to worry, as long as the remote server is up, queries will "Just Work™"). I can reproduce this just by leaving my local REST API running, closing my laptop, opening it back up, and attempting to use the API.

I think the issue should be reopened?

neumino reopened this Jan 10, 2016
@neumino (Owner) commented Jan 10, 2016

I still can't reproduce it, but I have a theory that may explain why.

@neumino (Owner) commented Jan 10, 2016

Hmm, actually no, my theory doesn't work.
@llambda, @mbroadst -- any chance one of you could create a way to reproduce this bug? I can't reproduce it. I tried putting my laptop to sleep, but that doesn't seem to do the trick.

Are you both running your script on OS X? Where is the RethinkDB instance?

@gx0r (Author) commented Jan 10, 2016

I actually haven't had the issue again myself. I run on a Fedora 23 desktop using Node 5. Production is CentOS 7, with 3 computers each running the same node.js script and a RethinkDB instance. I run clustered with socket.io and am using changefeeds.

I wonder if taking something like https://github.com/llambda/tcpslow and adding random fault-injecting socket.destroy calls might help trigger it. I might try to do that.
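Something along those lines might look like the following sketch: a fault-injecting TCP proxy placed between rethinkdbdash and RethinkDB. The ports and the kill probability are arbitrary choices, and this is not the tcpslow code itself.

'use strict';
const net = require('net');

const LISTEN_PORT = 28016;      // point rethinkdbdash at this port
const TARGET_PORT = 28015;      // real RethinkDB instance
const KILL_PROBABILITY = 0.01;  // chance per tick of killing a connection

net.createServer(function(client) {
  const upstream = net.connect(TARGET_PORT, 'localhost');
  client.pipe(upstream);
  upstream.pipe(client);

  // Randomly destroy the pair to simulate the peer going away abruptly.
  const killer = setInterval(function() {
    if (Math.random() < KILL_PROBABILITY) {
      upstream.destroy();
      client.destroy();
    }
  }, 100);

  function cleanup() { clearInterval(killer); }
  client.on('close', cleanup);
  upstream.on('close', cleanup);
  client.on('error', function() {});
  upstream.on('error', function() {});
}).listen(LISTEN_PORT);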

@gx0r (Author) commented Jan 10, 2016

I added fault injection and did manage to get a

Sun, 10 Jan 2016 03:02:37 GMT unhandledPromiseRejection: ReqlDriverError: The connection was closed by the other party.
    at Connection._send (/home/u/app/node_modules/rethinkdbdash/lib/connection.js:559:15)
    at /home/u/app/node_modules/rethinkdbdash/lib/term.js:202:22
    at tryCatcher (/home/u/app/node_modules/bluebird/js/release/util.js:11:23)
    at Promise._settlePromiseFromHandler (/home/u/app/node_modules/bluebird/js/release/promise.js:489:31)
    at Promise._settlePromise (/home/u/app/node_modules/bluebird/js/release/promise.js:546:18)
    at Promise._settlePromiseCtx (/home/u/app/node_modules/bluebird/js/release/promise.js:583:10)
    at Async._drainQueue (/home/u/app/node_modules/bluebird/js/release/async.js:130:12)
    at Async._drainQueues (/home/u/app/node_modules/bluebird/js/release/async.js:135:10)
    at Immediate.Async.drainQueues [as _onImmediate] (/home/u/app/node_modules/bluebird/js/release/async.js:16:14)
    at processImmediate [as _immediateCallback] (timers.js:383:17)

I had to sit in the browser and press Ctrl + R a lot to get that. But nothing else, just expected errors like that one and "unhandledPromiseRejection: ReqlServerError: The connection was closed before the query could be completed for ...".

@gx0r (Author) commented Jan 10, 2016

@mbroadst when it happens what do you see (if anything) for error message, and does your process exit? Which version are you on? (I am on 2.2.15)

@neumino (Owner) commented Jan 10, 2016

So this stacktrace is basically from you calling query.run() at some point.
The problem is that the connection we get from the connection pool is actually closed: the underlying TCP connection received a FIN packet.

When that happens, the TCP connection emits end, and we immediately remove it from the pool. So I don't really see how this can happen.

I'll dig a bit more, but so far I've been unable to reproduce it or find how this could happen.

@neumino (Owner) commented Jan 10, 2016

I just pushed 2.2.16.

In case a connection is being used by a query and that connection receives a FIN packet, maybe we put the connection back in the pool later? Though that seems impossible, as it would mean the query actually succeeded.

@llambda, @mbroadst -- let me know if you still see this error with 2.2.16.
If not, I'll dig into how a query can succeed when the connection is closed.
Thanks!

@neumino (Owner) commented Jan 10, 2016

One more question, are you using noreply queries?

@mbroadst (Contributor) commented:

Hey guys, sorry busy weekend over here. Let's see

  • no, these are not noreply queries
  • this seemed to happen consistently when I left my REST service running connected to a remote RethinkDB, let the laptop go into a true sleep (more than a few minutes), and then woke it back up
  • I have a sneaking suspicion that this might be related to changefeeds

I haven't had a chance to test again, but I'll try to run some test scenarios later tonight and see if I can recreate it. Thanks

@snackycracky commented:

I updated to 2.2.17 and am still having the problem, but the changefeed keeps working somehow. I will use the standard driver for now.

[screenshot attached: 2016-01-21 14:57:20]

@neumino (Owner) commented Jan 21, 2016

@snackycracky -- Your issue is not related. The connection on which you opened a changefeed was closed by something. This is probably an issue in your setup (Docker? firewall? Linux TCP settings?).
You can use the standard driver, but I doubt it will change anything.

@snackycracky commented:

Yeah, I am using Docker + ELB on AWS. Do you have an idea what this "something" could be? The idle timeout for the load balancers, maybe?

@neumino (Owner) commented Jan 22, 2016

@snackycracky -- I have never used ELB, but looking at the docs, there's a 60-second timeout for idle connections: https://aws.amazon.com/blogs/aws/elb-idle-timeout-control/

@snackycracky commented:

Yeah, there are multiple options possible with ELB, but connecting directly to RethinkDB seems to solve the problem with rethinkdbdash as well. The idle timeout is probably not the only reason; I probably also need sticky sessions, HAProxy, and to enable ProxyProtocolPolicyType. Sorry for hijacking this issue. My problem is more related to rethinkdb/rethinkdb#2956

@atis-- commented Feb 5, 2016

Same error here, all queries stopped working (2.2.17).

[2016-02-05T10:49:43.879Z] ERROR: infoswitch.pbx/14595 on atis-VirtualBox: unhandled server error
ReqlDriverError: The connection was closed by the other party.
    at Connection._send (/home/atis/WORK/infoswitch/pbx/node_modules/rethinkdbdash/lib/connection.js:559:15)
    at /home/atis/WORK/infoswitch/pbx/node_modules/rethinkdbdash/lib/term.js:202:22
    at processImmediate [as _immediateCallback] (timers.js:367:17)
From previous event:
    at Term.run (/home/atis/WORK/infoswitch/pbx/node_modules/rethinkdbdash/lib/term.js:142:15)

One other thing which may be related is that the computer changed wifi networks, and the log has quite a few of these messages:

Fail to create a new connection for the connection pool. Error:{"message":"Failed to connect to 192.168.1.10:28015\nFull error:\n{\"code\":\"ENETUNREACH\",\"errno\":\"ENETUNREACH\",\"syscall\":\"connect\"}.","isOperational":true}

In this case RethinkDB ran with --bind all and rethinkdbdash had discovery=true. So I guess even though the initial host connection param was set to localhost, it later changed to another address (in this case the computer's 'public' address, 192.168.1.10)? And then after the network switch the 192.168.1.10 address was no longer valid and things went south.

@neumino (Owner) commented Feb 9, 2016

If you switched networks, it's expected that all the connections will be dropped.
As long as the driver eventually recovers, it's working as intended.

@atis-- commented Feb 9, 2016

Sure, I just mentioned it here because it may be related. Whether it ever recovers, I don't know... it certainly didn't within the 5 or so minutes I was looking at it.

What may be the original problem reappeared today on a test server (no changing networks there) -- all queries stopped working. Looking at open file descriptors (lsof -p PID | grep 28015 | wc -l), it shows 1001 open connections to RethinkDB, so it seems it had exhausted all max=1000 connections. On a server with almost no user activity! The log file did not contain anything useful (perhaps because of NODE_ENV=production).

The process had been running for 5 days up to that point, so I guess the open connections accumulate and never get properly reused or cleaned up? Sorry I can't be of more help at the moment.

@atis-- commented Feb 9, 2016

Well, since I now know how to count open connections, I did a quick test:

  1. Restart the node + express + rethinkdbdash webapp -- 50 open connections (the buffer value).
  2. Navigate to a web page (a couple of session-related db queries + one or two changefeeds) and boom: 53 open connections to RethinkDB.
  3. Refresh the page a few times: 69 connections...

So it looks like rethinkdbdash keeps opening new connections and never even reuses or closes the old ones!

@neumino (Owner) commented Feb 10, 2016

This is working as expected. If you set a buffer of 50 connections, the driver will try to keep 50 connections available. What happens in your example is:

  1. 50 connections are open for the buffer.
  2. You open X changefeeds and run Y more queries at the same time. That means X+Y connections are in use, so rethinkdbdash has 50-X-Y connections available and will therefore open X+Y more.
  3. When the queries are done, the connections are released.
  4. You run a few more queries -- the driver will use the 50+Y connections available and will again try to keep 50 available.

If you wait a bit, the extra connections will be cleaned up. By default it's 2 hours if I'm not mistaken. Basically it's still working as intended.
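For reference, the behavior described above maps onto the driver's pool options. A hedged example follows; the values shown are illustrative, and the exact defaults (including the idle timeout) should be checked against the README for your version:

var r = require('rethinkdbdash')({
  host: 'localhost',
  port: 28015,
  buffer: 50,                  // number of connections the pool tries to keep available
  max: 1000,                   // hard cap on open connections
  timeoutGb: 60 * 60 * 1000    // ms an unused extra connection may sit in the pool before being closed
});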

@atis-- commented Feb 10, 2016

Ah, cool! Thanks for the explanation, it makes sense now.

The source of my too-many-connections problem turned out to be not closing the changefeed stream returned by .toStream(). I was erroneously assuming that the stream would close after unpiping it. Not closing it causes changefeed cursors to remain open forever and quickly fill up the connection pool, at which point all new queries just hang waiting for a free connection.

rethinkdbdash defines its own .close() method on the returned stream, which should be called whenever the changefeed stream is no longer necessary. Can this be included in the README? It seems kind of easy to miss.
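A minimal sketch of that cleanup, assuming a placeholder db/table and a throwaway writable sink (the names and the timeout are purely illustrative):

var stream = require('stream');
var r = require('rethinkdbdash')();

// Throwaway object-mode sink, just so the feed has something to pipe into.
var sink = new stream.Writable({
  objectMode: true,
  write: function(chunk, enc, cb) { cb(); }
});

var feed = r.db('test').table('events').changes().toStream();
feed.pipe(sink);

// Later, when the consumer goes away: unpiping alone is not enough; the
// stream's close() must be called to release the changefeed cursor and
// its pooled connection.
setTimeout(function() {
  feed.unpipe(sink);
  feed.close();
}, 60 * 1000);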

@hoffination commented:

I'm also running into the problem @llambda ran into:

var express = require('express');
var server = require('http').Server(express());
var config = rootRequire('config').rethinkdb;
var r = require('rethinkdbdash')({servers: [config]});
var winston = rootRequire('log');
var metric = rootRequire('metric');
var stackTrace = require('stack-trace');

module.exports = function(io) {
  io.on('connection', function(socket) {
    var now = Date.now();
    socket.on('initNotification', function(identifier) {
      winston.info('a user connected', {user: identifier, endpoint: 'socketio'});
      // Open a changefeed for events addressed to this user.
      return r.db('notification').table('events')
        .filter(function(item) {
          return item('time').gt(now).and(
            item('user2').eq(identifier));
        })
        .changes()
        .run()
        .then(function(cursor) {
          cursor.each(function(err, item) {
            if (item && item.new_val && !item.old_val) {
              socket.emit('notification', item.new_val);
            }
          });
          // Close the changefeed cursor when the socket.io client disconnects.
          socket.on('disconnect', function() {
            winston.info('a user disconnected correctly', {endpoint: 'socketio'});
            cursor.close();
          });
        })
        .then(function() {
          metric.checkDaily(function() {
            var values = {users: identifier};
            metric.updateTable('dailyMetrics', values);
          });
        })
        .catch(function(err) {
          winston.error(JSON.stringify(err),
            {user: identifier, endpoint: 'socketio', trace: stackTrace.parse(err)});
        });
    });
  });
};

When I leave my server running with these connections open for a long period of time I'll get a similar output as listed above:

{"user":"95b75690-b011-4365-88b5-0e9436dd17a6","endpoint":"socketio","trace":
  [{"fileName":"/root/chillnserver/chat_server/lib/notification/socketio/node_modules/rethinkdbdash/lib/connection.js","lineNumber":559,"functionName":"Connection._send","typeName":"Connection","methodName":"_send","columnNumber":15,"native":false},
  {"fileName":"/root/chillnserver/chat_server/lib/notification/socketio/node_modules/rethinkdbdash/lib/term.js","lineNumber":202,"functionName":null,"typeName":null,"methodName":null,"columnNumber":22,"native":false},
  {"fileName":"/root/chillnserver/chat_server/lib/notification/socketio/node_modules/rethinkdbdash/node_modules/bluebird/js/release/util.js","lineNumber":16,"functionName":"tryCatcher","typeName":"Object","methodName":null,"columnNumber":23,"native":false},
  {"fileName":"/root/chillnserver/chat_server/lib/notification/socketio/node_modules/rethinkdbdash/node_modules/bluebird/js/release/promise.js","lineNumber":497,"functionName":"Promise._settlePromiseFromHandler","typeName":"Promise","methodName":"_settlePromiseFromHandler","columnNumber":31,"native":false},
  {"fileName":"/root/chillnserver/chat_server/lib/notification/socketio/node_modules/rethinkdbdash/node_modules/bluebird/js/release/promise.js","lineNumber":555,"functionName":"Promise._settlePromise","typeName":"Promise","methodName":"_settlePromise","columnNumber":18,"native":false},
  {"fileName":"/root/chillnserver/chat_server/lib/notification/socketio/node_modules/rethinkdbdash/node_modules/bluebird/js/release/promise.js","lineNumber":592,"functionName":"Promise._settlePromiseCtx","typeName":"Promise","methodName":"_settlePromiseCtx","columnNumber":10,"native":false},
  {"fileName":"/root/chillnserver/chat_server/lib/notification/socketio/node_modules/rethinkdbdash/node_modules/bluebird/js/release/async.js","lineNumber":130,"functionName":"Async._drainQueue","typeName":"Async","methodName":"_drainQueue","columnNumber":12,"native":false},
  {"fileName":"/root/chillnserver/chat_server/lib/notification/socketio/node_modules/rethinkdbdash/node_modules/bluebird/js/release/async.js","lineNumber":135,"functionName":"Async._drainQueues","typeName":"Async","methodName":"_drainQueues","columnNumber":10,"native":false},
  {"fileName":"/root/chillnserver/chat_server/lib/notification/socketio/node_modules/rethinkdbdash/node_modules/bluebird/js/release/async.js","lineNumber":16,"functionName":"Immediate.Async.drainQueues [as _onImmediate]","typeName":"Immediate","methodName":"Async.drainQueues [as _onImmediate]","columnNumber":14,"native":false},
  {"fileName":"timers.js","lineNumber":358,"functionName":"processImmediate [as _immediateCallback]","typeName":"Object","methodName":null,"columnNumber":17,"native":false}],
"level":"error","message":"{\"message\":\"The connection was closed by the other party.\",\"isOperational\":true}"}

Is there something I'm not doing in the catch statement, or something else I'm missing? I'm running 2.2.18 with RethinkDB 2.2.5.

@neumino (Owner) commented Apr 2, 2016

Do you run that locally? What's the tcp timeout on your machine?

@hoffination commented:

I'm running it on a DigitalOcean droplet with RethinkDB on the same box. It's my dev environment. I'm not sure what the tcp timeout is on the box. How can I check my tcp timeout?

@neumino (Owner) commented Apr 3, 2016

So basically, here's what happens:

  • You open a changefeed from your node.js app to RethinkDB.
  • The connection is idle, as no traffic is sent for the changefeed.
  • Your Linux box decides to kill the connection after some time. Hence the error.

The driver does enable keep-alive on the connection:
https://github.com/neumino/rethinkdbdash/blob/master/lib/connection.js#L66

As far as I know, RethinkDB doesn't close idle connections, so it's probably Linux that decides to kill the connection. There's some documentation about TCP keepalive here:
http://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html

Maybe you have bad defaults? Or try increasing the keep-alive time?
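For context, enabling TCP keep-alive on a Node socket looks roughly like this (the actual call in rethinkdbdash is at the line linked above; the 60-second initial delay here is just an illustrative value, and how soon probes are sent also depends on the OS-level net.ipv4.tcp_keepalive_* settings covered in the HOWTO):

var net = require('net');

var socket = net.connect(28015, 'localhost', function() {
  // Send keep-alive probes once the connection has been idle for 60 seconds.
  socket.setKeepAlive(true, 60 * 1000);
});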

@rucsi commented Apr 7, 2016

We're experiencing similar issues. We have a changefeed on a table, but if the connection to the db breaks for whatever reason (db restart, network issue), we will not receive the change events anymore.
We also tried subscribing to the healthy event, which I guess would not solve the network issue, but we are unable to run a query even after the event fires, because we get the ReqlDriverError: The connection was closed by the other party. when we try. I need to create a new pool in order to make it work again. Any advice? Thanks.
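For reference, the subscription described above looks roughly like this (r.getPoolMaster() and its healthy event come from rethinkdbdash; the handler body is just an illustration):

var r = require('rethinkdbdash')();

r.getPoolMaster().on('healthy', function(healthy) {
  if (healthy) {
    console.log('pool is healthy again; changefeeds may need to be re-opened');
  } else {
    console.log('pool became unhealthy');
  }
});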

@neumino (Owner) commented Apr 9, 2016

@danielmewes -- can we add something in the protocol to just ping a connection? It looks like keepalive is not always sufficient for some people.

@danielmewes (Contributor) commented:

@neumino Could you just run something like r.expr(null)? I'm not sure if adding an additional query type is necessary.

That being said the idea has come up before. I think some routers don't work too well when it comes to TCP keepalive, or the timeouts are simply too long. Sounds worth considering, but probably not at the top of the priority list for the next few months.
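A hedged sketch of that kind of application-level ping, done from user code rather than inside the driver (the 30-second interval is arbitrary, and note this only exercises whichever pooled connection the query happens to land on):

var r = require('rethinkdbdash')();

// Run a trivial query on an interval so the connection never looks idle
// to routers or load balancers.
setInterval(function() {
  r.expr(null).run().catch(function(err) {
    console.error('keepalive ping failed:', err.message);
  });
}, 30 * 1000);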

@neumino (Owner) commented Apr 21, 2016

Ok, I just released 2.3.3 with the option pingInterval.
Set it to a positive value to make the driver ping each connection every pingInterval seconds. This should prevent routers from dropping idle connections.

Note that we use r.error("__rethinkdbdash_ping__") to ping a connection, so consider this error reserved.
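Example usage of the new option (2.3.3+); the value of 60 here is just a suggestion, picked to stay below a typical router or load-balancer idle timeout:

var r = require('rethinkdbdash')({
  host: 'localhost',
  port: 28015,
  pingInterval: 60   // ping each connection every 60 seconds
});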

@snackycracky commented:

cool

