Add lifetime to the distributed locks #136
* refactor: Change TimeoutError to asyncio.TimeoutError
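For context on this commit: before Python 3.11, `asyncio.TimeoutError` was a separate class from the builtin `TimeoutError`, so `except TimeoutError` would not catch a timeout raised by `asyncio.wait_for()`. The following is a minimal sketch of the pattern, not code from this PR; `acquire_with_timeout` is a hypothetical helper name.

```python
import asyncio


async def acquire_with_timeout(lock: asyncio.Lock, timeout: float) -> bool:
    """Try to acquire `lock` within `timeout` seconds (hypothetical helper)."""
    try:
        await asyncio.wait_for(lock.acquire(), timeout)
    except asyncio.TimeoutError:
        # On Python < 3.11 this is NOT the builtin TimeoutError,
        # so it has to be named explicitly to be caught on every version.
        return False
    return True
```

On Python 3.11 and later the two names refer to the same class, so catching `asyncio.TimeoutError` works on both old and new interpreters.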
Codecov Report
@@            Coverage Diff             @@
##             main     #136      +/-   ##
==========================================
+ Coverage   76.82%   76.93%   +0.11%
==========================================
  Files          26       26
  Lines        3301     3317      +16
==========================================
+ Hits         2536     2552      +16
  Misses        765      765
Continue to review full report at Codecov.
- etcetra: Please revoke the grant when unlocking.
Let's skip the watchdog for |
src/ai/backend/common/lock.py
super().__init__(lifetime=lifetime)
self.lock_name = lock_name
self.etcd = etcd
self.lifetime = lifetime
There is self._lifetime inherited already. Please use it!
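The suggestion in this review comment could look roughly like the sketch below. The base class name `AbstractDistributedLock` and the constructor signatures are assumptions made for illustration; only `self._lifetime`, `lock_name`, and `etcd` come from the snippet under review.

```python
from typing import Any, Optional


class AbstractDistributedLock:
    """Assumed base class: it already stores the lifetime as `_lifetime`."""

    def __init__(self, *, lifetime: Optional[float] = None) -> None:
        self._lifetime = lifetime


class EtcdLock(AbstractDistributedLock):
    def __init__(
        self,
        lock_name: str,
        etcd: Any,  # placeholder type for the etcd client wrapper
        *,
        lifetime: Optional[float] = None,
    ) -> None:
        super().__init__(lifetime=lifetime)
        self.lock_name = lock_name
        self.etcd = etcd
        # No `self.lifetime = lifetime` here: the value is already
        # available as the inherited `self._lifetime`.
```

Keeping a single attribute avoids two copies of the same value drifting apart if one of them is changed later.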
…://github.com/lablup/backend.ai-common into feature/add-lock-timeout-to-distributed-locks
This will auto-release the locks if a manager process holding the lock hangs or is abruptly killed in an HA setup.
FileLock and PgAdvisoryLock (in the manager) auto-release the lock when the manager process gets terminated, because the OS will close the relevant file descriptors.
EtcdLock requires an explicit unlock, so the lock may not be released in such cases. We could avoid this problem by adding an explicit lifetime that automatically releases the lock on the server side.
Let's set the default lifetime to be min(interval + 30, interval * 2) and implement:
- EtcdLock: lease-based lock lifetime
- FileLock: add a watchdog task to auto-release the lock in case of hang
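The default-lifetime formula and the watchdog idea above can be sketched as follows. This is an illustration under stated assumptions, not the implementation in this PR: `default_lifetime` and `WatchdogLock` are hypothetical names, and an in-process `asyncio.Lock` stands in for the real file-based lock.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator, Optional


def default_lifetime(interval: float) -> float:
    """Default lifetime from the description: min(interval + 30, interval * 2)."""
    return min(interval + 30, interval * 2)


class WatchdogLock:
    """Hypothetical stand-in for a lock guarded by a watchdog task."""

    def __init__(self, lifetime: Optional[float] = None) -> None:
        self._lifetime = lifetime
        self._lock = asyncio.Lock()
        self._generation = 0  # bumped on every release

    def locked(self) -> bool:
        return self._lock.locked()

    def _release(self, generation: int) -> None:
        # Release only if the acquisition identified by `generation`
        # still holds the lock; otherwise it was already released.
        if self._generation == generation and self._lock.locked():
            self._generation += 1
            self._lock.release()

    async def _expire(self, generation: int) -> None:
        # Watchdog: force-release once the lifetime has elapsed.
        await asyncio.sleep(self._lifetime)
        self._release(generation)

    @asynccontextmanager
    async def hold(self) -> AsyncIterator[None]:
        await self._lock.acquire()
        generation = self._generation
        watchdog = None
        if self._lifetime is not None:
            watchdog = asyncio.create_task(self._expire(generation))
        try:
            yield
        finally:
            if watchdog is not None:
                watchdog.cancel()
            self._release(generation)
```

With this formula, an interval of 10 seconds gives a lifetime of 20 seconds and an interval of 60 seconds gives 90 seconds: short intervals are doubled, while long intervals get a fixed 30-second margin. The generation counter is a design choice of this sketch: it keeps a late exit from an expired holder from releasing a lock that another task has since acquired.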