Disconnection Event Lost #2852

Closed
1 of 2 tasks
sgrtho opened this issue Feb 7, 2017 · 7 comments

Comments

@sgrtho

sgrtho commented Feb 7, 2017

I want to:

  • [x] report a bug
  • [ ] request a feature

This report may or may not be related to issues #2544 and/or #2565.

Current behaviour

Sometimes connected sockets (which received a connection event) never receive the disconnect event, although they are removed from the sockets object and (most likely) disappear from the socket.io server.

Steps to reproduce / Side effects

I have been unable to reproduce this issue reliably, and it only occurs in non-debug environments (like our beta servers). But I can elaborate on the observed side effects, which may help to pinpoint the source of the problem (or of my confusion...).

We handle a namespaced connection:

ioServer
  .of(coreSettings.core.ioNsp)
  .on("connection", handleNewSocket)
  ;

Where the handler looks like this:

export default function handleNewSocket(socket: SocketIO.Socket) {
  let
    iv : NodeJS.Timer = setInterval(intervalGuard, 120000)
    ;

  // [...] report connect exception

  socket.on("disconnect", () => {
    // [...] report disconnect exception

    clearInterval(iv);
  });

  socket.on("error", (error: any) => {
    // [...] report error exception
  });

  function intervalGuard() {
    if(!(socket.id in io.of(coreSettings.core.ioNsp).sockets)) {
      // [...] report disconnect-lost exception

      clearInterval(iv);
    }
  }
}

Using this code I regularly receive the disconnect-lost exception detected by the interval workaround. No error events occur, no other exceptions show up.
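
For readers who want to test the detection logic without a running server, here is a minimal, dependency-free sketch of the same check the interval guard performs. It is only a model: `SocketRecord` and `findLostSockets` are hypothetical names, and the plain `registry` object stands in for the namespace's `sockets` map.

```typescript
// Hypothetical model of the interval guard; none of these names are socket.io API.
interface SocketRecord {
  id: string;
  sawDisconnect: boolean; // would be set to true inside the "disconnect" handler
}

/**
 * Returns the ids of sockets that were tracked at connect time, never
 * reported a disconnect, and are no longer present in the registry
 * (the stand-in for `namespace.sockets`).
 */
function findLostSockets(
  tracked: SocketRecord[],
  registry: { [id: string]: unknown }
): string[] {
  return tracked
    .filter((rec) => !rec.sawDisconnect && !(rec.id in registry))
    .map((rec) => rec.id);
}

// Made-up example data: "b" vanished from the registry without a disconnect.
const tracked: SocketRecord[] = [
  { id: "a", sawDisconnect: false }, // still connected
  { id: "b", sawDisconnect: false }, // lost
  { id: "c", sawDisconnect: true },  // disconnected normally
];
const registry = { a: {} };
const lost = findLostSockets(tracked, registry);
```

In this example only socket "b" is flagged, which corresponds to the disconnect-lost case in the handler above.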

The only significant detail I could find occurs when I also listen on the default namespace:

ioServer
  .on("connection", handleNewSocketInDefaultNamespace)
  ;

There I also report connection, disconnect and error events. It seems that when a disconnect is lost, the socket enters the namespaced listener but never reaches the default listener, which I infer from the exceptions I recorded:

> exceptions.explore();
===================
List of exceptions:
-> at-ex-control (1)
-> at-io-connect-global-socket-connect-event (63) (sampling disabled)
-> at-io-connect-global-socket-disconnect-event (57) (sampling disabled)
-> at-io-connect-namespaced-socket-connect-event (64) (sampling disabled)
-> at-io-connect-namespaced-socket-disconnect-event (57) (sampling disabled)
-> at-io-connect-namespaced-socket-lost (1) (sampling disabled)

Here you can see that the counter for global-connect is smaller than namespaced-connect, 63 vs. 64. I think it is exactly the same socket which drops the disconnect event. More on that:

> let
>  lostSocket = exceptions.explore("at-io-connect-namespaced-socket-lost").annotations[0]
>  ;

> lostSocket.disconnect();
> exceptions.explore();

List of exceptions:
===================
-> at-ex-control (1)
-> at-io-connect-global-socket-connect-event (63) (sampling disabled)
-> at-io-connect-global-socket-disconnect-event (57) (sampling disabled)
-> at-io-connect-namespaced-socket-connect-event (64) (sampling disabled)
-> at-io-connect-namespaced-socket-disconnect-event (58) (sampling disabled)
-> at-io-connect-namespaced-socket-lost (1) (sampling disabled)

You can see that the disconnect actually fires once I manually disconnect the socket. I performed all these actions in a single synchronous run, and the namespaced-disconnect counter increased by one. Any ideas?

Expected behaviour

If a socket receives a connection event it should always receive a disconnect event, no matter if in a namespace or not.

Setup

  • OS: Debian 8.7
  • Node: 6.9.4
  • Browsers: Android Chrome (WebView), iOS Safari (UIWebView)
  • socket.io version: 1.7.2 (server and -client)
@sgrtho
Author

sgrtho commented Feb 9, 2017

I'd like to update my observations: recording over a longer period yields a result indicating there is no correlation between default/namespaced connects and the number of lost disconnects:

console.log("Sockets default:", Object.keys(io.sockets).length);
console.log("Sockets namespaced:", Object.keys(io.of(coreSettings.core.ioNsp).sockets).length); 
exceptions.explore();

Sockets default: 9
Sockets namespaced: 4
===================
List of exceptions:
-> at-ex-control (1)
-> at-io-connect-global-socket-connect-event (11293) (sampling disabled)
-> at-io-connect-global-socket-disconnect-event (11289) (sampling disabled)
-> at-io-connect-namespaced-socket-connect-event (11410) (sampling disabled)
-> at-io-connect-namespaced-socket-disconnect-event (11406) (sampling disabled)
-> at-io-connect-namespaced-socket-lost (206) (sampling disabled)

Please be aware that the recorded exceptions now include the interval-based workaround, which manually disconnects a lost socket. This is why the namespaced connects and disconnects match so closely.

I can see that the number of sockets in the default namespace is higher than the difference of recorded connects and disconnects. I suspect this is because some of these are waiting for re-connection.

But it is safe to say that there is no perfectly clear connection between the number of lost disconnects and the number of default/namespaced connects.
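
To make the arithmetic behind these observations explicit, here is a small sketch that compares the registry size with what the event counters predict. `NamespaceCounters` and `unexplainedSockets` are hypothetical helper names; the figures are copied from the output above.

```typescript
// Illustrative helper; the names are hypothetical and not part of socket.io.
interface NamespaceCounters {
  connects: number;
  disconnects: number;
  registered: number; // Object.keys(namespace.sockets).length
}

/**
 * Difference between the sockets currently registered and the number the
 * event counters say should still be open. Zero means the counters fully
 * explain the registry; a non-zero value means some sockets are unaccounted for.
 */
function unexplainedSockets(c: NamespaceCounters): number {
  const expectedOpen = c.connects - c.disconnects;
  return c.registered - expectedOpen;
}

// Figures taken from the recorded output above.
const defaultNsp = unexplainedSockets({ connects: 11293, disconnects: 11289, registered: 9 });
const customNsp = unexplainedSockets({ connects: 11410, disconnects: 11406, registered: 4 });
```

With these numbers the default namespace has 5 sockets more than its counters explain, while the custom namespace has 0.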

@Azzaroff

Azzaroff commented Sep 7, 2017

I noticed the same thing and was also unable to reproduce the cause. The workaround seems a good way to at least detect this. Thanks for your insight!

@brenc

brenc commented Nov 4, 2017

I believe I am seeing this too. The only additional information I can provide is that it seems to be happening to users with really spotty internet connections i.e. users that connect and disconnect a lot (usually with a 'ping timeout').
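
One way to check this hunch would be to record the reason string passed to each "disconnect" handler and tally the results. The sketch below only shows the tallying step and assumes the reasons have already been collected as strings; `tallyReasons` is a hypothetical helper and the sample data is made up for illustration.

```typescript
// Hypothetical helper: counts how often each disconnect reason occurred.
function tallyReasons(reasons: string[]): { [reason: string]: number } {
  const counts: { [reason: string]: number } = {};
  for (const reason of reasons) {
    counts[reason] = (counts[reason] || 0) + 1;
  }
  return counts;
}

// Sample data invented for illustration only, not measured values.
const sample = ["ping timeout", "transport close", "ping timeout"];
const reasonCounts = tallyReasons(sample);
```

If lost disconnects cluster around users whose tally is dominated by 'ping timeout', that would support the spotty-connection theory; this thread does not confirm it either way.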

@martinzwirner

@sgrtho & others: Does this issue still exist in the latest version of Socket.IO (2.0.4)? We are currently using version 1.7.2 and may experience this problem as well.

@sgrtho
Author

sgrtho commented Dec 4, 2017

Hey, I can't really say. We have not migrated to 2.x yet (and probably never will).

@brenc

brenc commented Dec 4, 2017

I don't think my issue was with socket.io but I ended up removing it from all my projects. I use ws on its own now.

@darrachequesne
Member

Closed due to inactivity, please reopen if needed.
