Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sentinel thread-safe during master failover #909

Closed
wants to merge 1 commit into from

Conversation

lalinsky
Copy link

In the current version, the sentinel code tries to close all
connections immediately after discovering there is a new master.
This is a problem in multi-threaded environment, because
neither ConnectionPool.disconnect nor Connection.disconnect are
thread-safe. If you call SentinelConnectionPool.disconnect after master
failover, that will close all connections that are potentially used
from other threads, causing all kinds of errors.

This change avoids that behavior by adding acquire/release checks, so
connections that don't belong the current master are never returned
to the pool and they are closed instead.

@@ -59,9 +59,6 @@ def read_response(self):
# When talking to a master, a ReadOnlyError when likely
# indicates that the previous master that we're still connected
# to has been demoted to a slave and there's a new master.
# calling disconnect will force the connection to re-query
# sentinel during the next connect() attempt.
self.disconnect()
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The connection will get closed when it's returned to the pool.

self.master_address = master_address
elif master_address != self.master_address:
# Master address changed, disconnect all clients in this pool
self.disconnect()
Copy link
Author

@lalinsky lalinsky Sep 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All old connections will get closed when attempting to fetch them from the pool. I could theoretically close connections from _available_connections here, but I don't know what would the thread-safety implications be. The fact that this code depends on atomic operations in Python is confusing enough.

@@ -125,16 +120,36 @@ def rotate_slaves(self):
pass
raise SlaveNotFoundError('No slave found for %r' % (self.service_name))

def _checkpid(self):
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation in ConnectionPool is thread-safe and the __init__() call is not needed, reset() does everything necessary.

master_address = self.sentinel_manager.discover_master(
self.service_name)
if self.is_master:
if self.master_address is None:
if master_address != self.master_address:
self.master_address = master_address
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related pull request - #847

In the current version, the sentinel code tries to close all
connections immediately after discovering there is a new master.
This is a problem in multi-threaded environment, because
neither `ConnectionPool.disconnect` nor `Connection.disconnect` are
thread-safe. If you call `SentinelConnectionPool.disconnect` after master
failover, that will close all connections that are potentially used
from other threads, causing all kinds of errors.

This change avoids that behavior by adding acquire/release checks, so
connections that don't belong the current master are never returned
to the pool and they are closed instead.
@lalinsky lalinsky force-pushed the thread-safe-sentinel branch from 38a0157 to bbf553c Compare October 2, 2017 15:26
@github-actions
Copy link
Contributor

This pull request is marked stale. It will be closed in 30 days if it is not updated.

@github-actions github-actions bot added the Stale label Jun 30, 2020
@github-actions github-actions bot closed this Jul 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant