
socket.io redis store leaks channels #663

Closed
brettkiefer opened this issue Nov 23, 2011 · 15 comments
Comments

@brettkiefer

Repro (Socket.io 0.8.7):

1. Run socket.io with the Redis backing store.
2. Run 'info' on your Redis instance, noting the number of pubsub_channels.
3. Create a websocket connection to socket.io.
4. Run 'info' again: you should now have 5 more pubsub_channels than before.
5. Close the websocket connection.
6. Run 'info' again: you now have 3 more pubsub_channels than when you started.

Repeat -- it leaks 3 channels each time.

Is this intended?

This line causes it to leak 2 channels, because we're not unsubscribing in the non-local case: https://github.com/LearnBoost/socket.io/blob/master/lib/manager.js#L516

But even if I remove the if (local) logic, I still leak one channel per opened-and-closed connection.
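The leak pattern described above can be sketched with a toy in-memory store (all names here are illustrative, not socket.io's actual API): each client gets several per-client channels on connect, but cleanup is gated on a `local` flag, so the non-local path leaves every channel subscribed.

```javascript
// Toy model of the leak: subscribe on connect, but only unsubscribe
// on the "local" disconnect path.
class ToyStore {
  constructor() {
    this.channels = new Map(); // channel name -> subscriber count
  }
  subscribe(channel) {
    this.channels.set(channel, (this.channels.get(channel) || 0) + 1);
  }
  unsubscribe(channel) {
    const n = (this.channels.get(channel) || 0) - 1;
    if (n <= 0) this.channels.delete(channel);
    else this.channels.set(channel, n);
  }
}

const store = new ToyStore();
const suffixes = ['dispatch', 'disconnect', 'message']; // illustrative names

function onConnect(id) {
  for (const suffix of suffixes) store.subscribe(id + ':' + suffix);
}

function onClientDisconnect(id, local) {
  // Bug pattern: cleanup gated on `local`, so a manager handling a
  // remote disconnect never drops its subscriptions for this client.
  if (local) {
    for (const suffix of suffixes) store.unsubscribe(id + ':' + suffix);
  }
}

onConnect('abc');
onClientDisconnect('abc', false); // non-local path: nothing is cleaned up
console.log(store.channels.size); // 3 channels leaked
```

With the `if (local)` guard removed, the same run ends with zero channels, matching the "2 leaked channels" attributed to that line above.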

@3rd-Eden
Contributor

ping @dshaw ^

@dshaw
Contributor

dshaw commented Nov 24, 2011

On it.

@brettkiefer
Author

The third leaked subscription is this one: https://github.com/LearnBoost/socket.io/blob/master/lib/manager.js#L433

It looks like this is because the other processes never get a disconnect message, and that in turn seems to be because the connected process's manager never publishes a message on the 'disconnect' channel -- so we're probably leaking memory in the client processes as well as channels in the Redis instance. I'll issue a pull; if you want it against some repo other than this one, let me know which.

@dshaw
Contributor

dshaw commented Nov 26, 2011

@brettkiefer Pull Request #665 is not going to handle it correctly. Sorry, I've been tied up with the Thanksgiving holidays. Landing patch(es) now.

dshaw added a commit to dshaw/socket.io that referenced this issue Nov 26, 2011
…s not getting passed through onClientDisconnect().
rauchg added a commit that referenced this issue Nov 26, 2011
…s not getting passed through onClientDisconnect()."

This reverts commit d5ab46d.
@brettkiefer
Author

@dshaw Cool -- I see that you picked up the disconnect publish from c742735 in c110036 to get rid of the other leaked channel, so that ought to fix our issue. I'll give your changes a try. Thanks! I did it the way I did because it looked to me as though onClientDisconnect would only ever be called when 'local' should be true, so there was no need for the param -- can you explain where that reasoning went wrong?

@guille Uh oh, looks like this broke something for you, hence the reverts?

@rauchg
Contributor

rauchg commented Nov 27, 2011

It might or might not have broken our tests :D

@rauchg rauchg closed this as completed Nov 27, 2011
@rauchg rauchg reopened this Nov 27, 2011
@dshaw
Contributor

dshaw commented Nov 27, 2011

Tests run clean for me.

@dshaw
Contributor

dshaw commented Nov 27, 2011

@brettkiefer onClientDisconnect is called in 3 places, only one of which is with the local flag.
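The point above can be sketched as follows (illustrative names, not socket.io's actual code): the same handler runs at every call site, but only the owning process passes the local flag, which should gate the announcement rather than the cleanup.

```javascript
// Sketch of the fixed shape: every call site frees local state, but
// only the owning process (local === true) publishes the disconnect
// for the rest of the cluster.
function onClientDisconnect(id, local, state, announce) {
  delete state.clients[id];  // runs everywhere the handler is called
  if (local) announce(id);   // runs only in the owning process
}

const published = [];
const state = { clients: { abc: {}, def: {} } };

onClientDisconnect('abc', true, state, (id) => published.push(id));  // owner
onClientDisconnect('def', false, state, (id) => published.push(id)); // remote

console.log(Object.keys(state.clients).length, published); // 0 [ 'abc' ]
```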

@dshaw
Contributor

dshaw commented Nov 27, 2011

Issue is with Travis, not dshaw. ;-P

@brettkiefer
Author

@dshaw Okay, got it -- so Socket.prototype.disconnect and Manager.prototype.handleClient can run in processes where the client isn't 'local'?

@brettkiefer
Author

Running with dshaw's patch applied to 0.8.7, it's certainly not leaking 3 channels per closed connection (thanks!). It's still leaking channels, but very slowly, and I haven't yet figured out when or why.

Also, just in case this rings any bells, it also LOOKS like we're seeing an issue in production where one of our processes will start creating socket.io pubsub channels so quickly that it will flood all of the space available for Redis in under a minute, so I've added logging to try to figure that out, too. It looks like it happens about once per 10000 open websockets per hour.

@dshaw
Contributor

dshaw commented Dec 5, 2011

If you have reproducible steps for the additional leaks, please post them. I'd like to squash that too.

I've encountered an issue like that intermittently myself. Seems to be related to XHR transports getting in a weird state. Are you only using pure websockets?

@brettkiefer
Author

Thanks, and of course I'll post more info on the other issues as I get it. Yes, pure websockets only. We use a homegrown and fairly efficient short xhr poll as the fallback for Trello, because that approach ALWAYS works, and we had to turn Socket.io websockets off right after launch due to some difficult-to-diagnose, sudden, and catastrophic problems we were having with it under load. So we're trying to get all of our websocket-capable browsers back on websockets now.

@brettkiefer
Author

Okay, it looks like the issue with Redis consuming all of its available memory in a very short time (issuecomment 3010815, three comments up) is NOT due to a bunch of pubsub channels being opened -- measurement error on my part, my apologies.

@brettkiefer
Author

The issue with Redis going down the tubes and evicting all of its keys turned out to be redis/redis#91, see that case for more info. It LOOKED like it was a problem with socket.io because without socket.io running the offending process would die quickly (and was almost always the first to err on a SET), whereas with socket.io we try to close all of our websocket connections on process exit to work around issue #495, and the extra time doing that kept the Redis connection (and thus its output_list) around long enough to do more damage.

pitr pushed a commit to uken/socket.io that referenced this issue Jun 26, 2013
…s not getting passed through onClientDisconnect().
darrachequesne pushed a commit that referenced this issue Jul 8, 2024
…663)

Bumps [engine.io](https://github.com/socketio/engine.io) from 4.1.2 to 6.2.1.
- [Release notes](https://github.com/socketio/engine.io/releases)
- [Changelog](https://github.com/socketio/engine.io/blob/main/CHANGELOG.md)
- [Commits](socketio/engine.io@4.1.2...6.2.1)

---
updated-dependencies:
- dependency-name: engine.io
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>