Many connections in state CLOSE_WAIT and FIN_WAIT2 without release #1380
FYI, what worked for us: switching from Socket.io to SockJS and terminating HTTPS outside of node.js.
I've got the same problem using socket.io 0.9.16. I don't want to switch away from socket.io, and I am hoping someone can help; I'm not sure what to do from here. I can easily replicate this problem when I have a few clients connecting and then disconnecting in a loop as a load test. Even when the load test stops, the FIN_WAIT2's seem to stay there forever. The only way to clear them is to restart the node.js application.
@leopapadopoulos: I had this issue previously, but it went away after I stopped using RedisStore and upgraded to Socket.io v0.9.16 and Node > v0.10.12.
I tried to open multiple connections to the socket server at the same time, and found that some of the client sockets use the same socket id (obtained from the XHR handshake; it looks like `_nmXTMmCGNQp4EncrfHqj_`) to establish a connection. When I close the browser after all connections are established, many connections are left in CLOSE_WAIT without being released. Only a few connections close properly (one per unique socket id that was generated), because the server keys TCP/IP connections by socket id: if a connection with that socket id already exists in the connection pool, the new connection is not stored in the pool. So when such a client sends a FIN packet to close a connection that does not exist in the server's pool, the server application never closes its side, and the connection stays in CLOSE_WAIT without ever being released.

```js
var host = 'http://socket.server/';
var sockets = [];
for (var i = 0; i < 200; i++) {
  var socket = io.connect(host, { "force new connection": true });
  sockets.push(socket);
  socket.on("message", function (message) {
    console.log(message);
  });
  socket.on("disconnect", function () {
    console.log("disconnect");
  });
}
```

Fix: in _lib/manager.js_ around line 670, do not establish the TCP/IP connection for a socket id when a connection with that id already exists in the pool. See also: kejyun@8d6c02a

```js
if (!this.connected[data.id]) {
  if (transport.open) {
    if (this.closed[data.id] && this.closed[data.id].length) {
      transport.payload(this.closed[data.id]);
      this.closed[data.id] = [];
    }
    this.onOpen(data.id);
    this.store.publish('open', data.id);
    this.transports[data.id] = transport;
  }
  this.onConnect(data.id);
  this.store.publish('connect', data.id);
  //....etc
}
```

The following was the socket service connection status after running about 6 hours:

```shell
netstat -anl | grep <PORT_OF_NODE_PROCESS> | awk '/^tcp/ {t[$NF]++} END {for (state in t) {print state, t[state]}}'
FIN_WAIT2 37
LISTEN 1
TIME_WAIT 13
ESTABLISHED 295
FIN_WAIT1 20
```
@samsonradu Thank you for the prompt response. I am already running socket.io v0.9.16 and node v0.10.16. I am not knowingly using RedisStore. Is it possible that socket.io or node uses RedisStore internally? My understanding is that socket.io needs to be configured for RedisStore, and I have not done that.
@kejyun Thank you for your answer. I will try what you suggest as soon as I can. Your directions are really clear, thank you. I have not modified someone else's node.js code before. I have socket.io installed in the node_modules directory. When I modify manager.js, do I need to recompile? If so, how? Or is the file just loaded at runtime, in which case I will simply modify it and run my app again?
@kejyun Unfortunately your suggestion did not work for my problem. Using the test you describe, it does stop the CLOSE_WAIT. However, my problem is not with CLOSE_WAIT, it is with FIN_WAIT2.

```shell
netstat -anl | grep 8443 | awk '/^tcp/ {t[$NF]++} END {for (state in t) {print state, t[state]}}'
FIN_WAIT2 631
```

Node version = v0.10.16
@njam Thanks for the suggestion about switching from socket.io to SockJS. I need all the room features and the room emit features of socket.io, which is why it is difficult to switch away from socket.io.
@leopapadopoulos Can you describe your application's scenario? For example: how many users? Are there mobile devices? Is the network stable? My FIN_WAIT2 problem still exists too. I suspect it happens with mobile devices on unstable networks, but I haven't figured it out yet. I would like to know how the FIN_WAIT2 state is generated in your scenario. Thanks.
@kejyun I use node.js and socket.io as a kind of pub/sub server. I use socket.io rooms to represent topics; data is broadcast to any clients that have joined (subscribed to) the room when the data changes. In my test I have a C++ client (socket.io-poco) connect to the server via websocket. The client then subscribes to one or more rooms and makes data changes. Then the test client exits. The client does NOT exit cleanly; however, that is no reason for the FIN_WAIT2's to get stuck forever. For example, if a client crashes or loses network connectivity, socket.io should clean up after itself, I would think. The client does the above over and over again at 10-second intervals as a kind of load test, to see how this would act in production. As it repeats, the FIN_WAIT2's grow and grow, so it is clearly not ready for production in this application.
I am having the same problem. In my case I am running iOS clients and node 0.10.16 with socket.io 0.9.16. Whenever the client "connects" and "disconnects", a connection will stay in the FIN_WAIT2 state and is never released.
I have the same problem. I tried what @kejyun suggested and now get many connections in the TIME_WAIT state.
I recently tried to trace through the socket.io source code and found the socket structure to be quite complex. It looks roughly like this:

```js
Manager = {
  SocketNamespace: {
    sockets: {},
    transport: {
      'websocket': {},
      'htmlfile': {},
      'xhr-polling': {},
      'jsonp-polling': {}
    }
  }
}
```

Every structural level references the others, I guess for convenience when calling methods across levels (such as "transport" calling into "Manager"), and many of the levels keep references to the socket connections. So when a client disconnects, some structure may still hold on to the disconnected connection, which would cause the CLOSE_WAIT and FIN_WAIT2 problem. I tried SockJS as @njam recommended, and it does not show the CLOSE_WAIT and FIN_WAIT2 problem; it is running really well for me. But SockJS lacks the reconnect-on-disconnect, event message, and chat room features of socket.io, which my application needs, so I implemented those features in SockJSUtility (https://github.com/kejyun/SockJSUtility). It is working well now.
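SockJS indeed has no built-in reconnect, so a wrapper has to supply it. A minimal client-side sketch of that kind of reconnect logic might look like the following (hypothetical code, not taken from the SockJSUtility repository; the SockJS constructor is passed in as a parameter so the helper is easy to test):

```javascript
// Reconnect-on-close sketch for a SockJS-style client.
// SockJSCtor is the SockJS constructor (injected for testability).
function connectWithRetry(SockJSCtor, url, onMessage, delayMs) {
  delayMs = delayMs || 1000;
  var sock = new SockJSCtor(url);
  sock.onmessage = onMessage;
  sock.onclose = function () {
    // Schedule a fresh connection, doubling the delay up to a 30s cap.
    setTimeout(function () {
      connectWithRetry(SockJSCtor, url, onMessage, Math.min(delayMs * 2, 30000));
    }, delayMs);
  };
  return sock;
}

// In a browser one might call (assuming the SockJS client script is loaded):
//   connectWithRetry(SockJS, 'http://host/sockjs', function (e) { console.log(e.data); });
```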
@kejyun you should check out Primus, which wraps SockJS, Socket.IO, engine.io, and plain websockets with a common interface and exposes a plugin interface. There are already plugins that implement event emitter and rooms. In addition to that, reconnect is built into Primus using randomized exponential backoff.
@3rd-Eden Thank you for the suggestion to use https://github.com/primus/primus, I will check it out. There are many posts about this problem, and @kejyun seems to be the only one to have come up with his own solution. I had given up on it and was simply raising an alarm when FIN_WAIT2 reached a certain level. With node.js being so popular, it is a mystery why this is not getting fixed. I wonder if we are posting this problem to the wrong place?
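The alarm approach mentioned above can be sketched as a small shell watchdog (hypothetical; the port, threshold, and cron wiring below are made-up examples, not the poster's actual setup):

```shell
# Count netstat-style lines on stdin whose last field matches state $1.
count_state() {
  awk -v s="$1" '/^tcp/ && $NF == s { n++ } END { print n + 0 }'
}

# Live usage, e.g. from cron (requires netstat; port/threshold are examples):
#   PORT=8443; THRESHOLD=500
#   stuck=$(netstat -anl | grep ":$PORT" | count_state FIN_WAIT2)
#   [ "$stuck" -gt "$THRESHOLD" ] && echo "ALERT: $stuck sockets in FIN_WAIT2"
```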
I'm noticing this same behaviour with 1.0. Disconnected connections are still available after 60 seconds (the default TTL). |
@panuhorsmalahti how are you listing disconnected connections? |
I'm listing them with io['sockets']['adapter']['sids'] and io['sockets']['adapter']['rooms'], so the problem is at least in socket.io-adapter.
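A small sketch of how such an audit might be done, assuming (as this comment suggests) that `adapter.sids` is a plain object mapping socket id to joined rooms in socket.io 1.0 / socket.io-adapter of that era:

```javascript
// Return ids the adapter still tracks but that are no longer connected.
// The shapes of `adapter` and `connected` are assumptions based on the
// comment above, not a documented API.
function lingeringSocketIds(adapter, connected) {
  return Object.keys(adapter.sids).filter(function (id) {
    return !connected[id];
  });
}

// Against a live server one might run (hypothetical):
//   lingeringSocketIds(io.sockets.adapter, io.sockets.connected);
```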
We've been seeing a similar problem on our production servers, but we've found a fix that seems to work for us, and we wanted to post it to see if it might help others. We had the same problem as everyone else: a slowly increasing number of zombie connections stuck in FIN_WAIT2 or ESTABLISHED. After trying a bunch of different fixes, we had some luck with the node-ka-patch that @kejyun posted in a different thread. With that patch, we found that it only eliminated the connections stuck in FIN_WAIT2, but not our zombie ESTABLISHED connections. We took this as a sign we were on the right track, and that enabling keep-alive on the connections could address the problem. The node.js net.Socket API exposes a function to set keep-alive (setKeepAlive()), but we couldn't find a way to call it on a socket that we received from socket.io. So, we decided to see if we could force it, using node-ka-patch as inspiration. Here's the code snippet:
It's a bit hacky, as it spams a call to setKeepAlive() every time the socket writes out a buffer, but we haven't found any noticeable performance impact in our production environment. As I said before, we'd prefer to just set this on the socket once the connection is established, but couldn't find a way to do that through socket.io. Hopefully this helps out other folks having this problem. We'd love to hear if anyone has a way to improve on this, or finds a way to just call setKeepAlive() directly on the socket. Oh, and if it's useful, we are using node.js 0.10.25 and socket.io 0.9.16.
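The snippet itself is not shown above, but the workaround as described could be sketched like this (hypothetical; the keep-alive delay and the socket.io wiring are assumptions, not the poster's actual patch):

```javascript
// Assumed keep-alive delay; the thread does not state the value used.
var KEEPALIVE_MS = 30 * 1000;

// Wrap a socket's write() so setKeepAlive() is re-applied before every
// outgoing buffer, as the comment above describes.
function patchWriteWithKeepAlive(socket) {
  var originalWrite = socket.write;
  socket.write = function () {
    if (typeof socket.setKeepAlive === 'function') {
      socket.setKeepAlive(true, KEEPALIVE_MS);
    }
    return originalWrite.apply(socket, arguments);
  };
  return socket;
}

// Hypothetical wiring for socket.io 0.9 (internal structures may differ):
//   io.sockets.on('connection', function (client) {
//     var transport = io.transports[client.id];
//     if (transport && transport.socket) patchWriteWithKeepAlive(transport.socket);
//   });
```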
Hi, I use Ubuntu (12.04) + node.js (v0.10.22) + socket.io (v0.9.16) to transmit messages.
There are ~300 simultaneous connections. After some hours (about 1 to 2 hours; it doesn't show up immediately), some connections persist in the CLOSE_WAIT or FIN_WAIT2 state.
These undead connections grow linearly over time. Users will have a hard time connecting to the socket server once the number of connections reaches the limit (1024 by default; see also Linux TCP/IP tuning for scalability), unless some connections are released normally.
The following was the socket service connection status after running about 3 hours.
Probable solutions:
- Use the Nodemon package to run the js file: when the file's last-modified time changes, nodemon restarts the service and releases all of the previous undead connections (CLOSE_WAIT or FIN_WAIT2).
- Try to keep the number of connections from reaching the limit.
- Let the operating system close these connections automatically after a short time (I haven't tried this yet); see also: Using TCP keepalive under Linux.
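As a concrete example of the "let the OS close connections" idea, Linux exposes TCP keepalive and FIN_WAIT2 timeouts as sysctls. The values below are illustrative assumptions, not recommendations from this thread:

```shell
# Probe idle connections after 60s instead of the default 7200s:
sysctl -w net.ipv4.tcp_keepalive_time=60
# Retry 3 times, 10s apart, before declaring the peer dead:
sysctl -w net.ipv4.tcp_keepalive_probes=3
sysctl -w net.ipv4.tcp_keepalive_intvl=10

# Cap how long the kernel holds orphaned FIN_WAIT2 sockets:
sysctl -w net.ipv4.tcp_fin_timeout=30
```

Note that keepalive only applies to sockets that enable SO_KEEPALIVE (e.g. socket.setKeepAlive(true) in node.js).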
Question:
I found some probable solutions, but none of them really solves the problem of connections persisting in the CLOSE_WAIT or FIN_WAIT2 state. I can see this is the result of the server (CLOSE_WAIT) or the clients (FIN_WAIT2) not closing connections correctly. I thought socket.io would force-close these incorrectly closed connections after some timeout, but that does not seem to work correctly.
I tried to reproduce the CLOSE_WAIT or FIN_WAIT2 problem in my test environment, but it never shows up there.
I found @njam asked a related question before (Many stale connections in state CLOSE_WAIT and FIN_WAIT2), but still can't find a solution. Does anyone know how to solve this problem?
Thanks.