-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-28113 Modify the way of acquiring the RegionStateNode lock in checkOnlineRegionsReport to tryLock #5442
HBASE-28113 Modify the way of acquiring the RegionStateNode lock in checkOnlineRegionsReport to tryLock #5442
Conversation
… modify the way of acquiring the RegionStateNode lock in checkOnlineRegionsReport to tryLock.
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make the tryLock timeout configurable? And is 10 seconds a suitable default value? For me I think it is too long...
serverName, regionNode, diff); | ||
// It is likely that another thread is currently holding the lock. To avoid deadlock, use | ||
// tryLock waiting for a specified period of time | ||
if (regionNode.tryLock(10, TimeUnit.SECONDS)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tryLock should be placed outside the try block, as we do not need to unlock if we get a false here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Apache9 Thank you for your suggestion. The position of the tryLock has been adjusted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could just use tryLock without any timeout? Then it will not throw any exceptions so we do not need this two level try catch block. For me I think it is OK as this is not a critical check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a very good suggestion, and it has been adjusted accordingly.
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
if (diff > lag) { | ||
LOG.warn("Reporting {} state does not match {} (time since last update={}ms)", | ||
serverName, regionNode, diff); | ||
// It is likely that another thread is currently holding the lock. To avoid deadlock, use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is incorrect now? And better mention that this is just a fallback check for checking unexpected incosistency. This is what ChatGPT suggested
This is just a fallback check designed to identify unexpected data inconsistencies, so we use tryLock to attempt to acquire the lock, and if the lock cannot be acquired, we skip the check. This will not cause any additional problems and also prevents the regionServerReport call from being stuck for too long which may cause deadlock on region assignment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your advice and the suggestions from ChatGPT are very valuable and worth considering.
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
Hi, @Apache9 Based on my assessment, the failure of the unit test is not related to this modification. I can manually run the unit test and still pass. |
…heckOnlineRegionsReport to tryLock (#5442) * To prevent threads from being blocked by the lock of RegionStateNode, modify the way of acquiring the RegionStateNode lock in checkOnlineRegionsReport to tryLock. Co-authored-by: lvhaiping.lhp <[email protected]> Signed-off-by: Duo Zhang <[email protected]> (cherry picked from commit e07d1fe)
…heckOnlineRegionsReport to tryLock (#5442) * To prevent threads from being blocked by the lock of RegionStateNode, modify the way of acquiring the RegionStateNode lock in checkOnlineRegionsReport to tryLock. Co-authored-by: lvhaiping.lhp <[email protected]> Signed-off-by: Duo Zhang <[email protected]> (cherry picked from commit e07d1fe)
…heckOnlineRegionsReport to tryLock (#5442) * To prevent threads from being blocked by the lock of RegionStateNode, modify the way of acquiring the RegionStateNode lock in checkOnlineRegionsReport to tryLock. Co-authored-by: lvhaiping.lhp <[email protected]> Signed-off-by: Duo Zhang <[email protected]> (cherry picked from commit e07d1fe)
…heckOnlineRegionsReport to tryLock (#5442) * To prevent threads from being blocked by the lock of RegionStateNode, modify the way of acquiring the RegionStateNode lock in checkOnlineRegionsReport to tryLock. Co-authored-by: lvhaiping.lhp <[email protected]> Signed-off-by: Duo Zhang <[email protected]> (cherry picked from commit e07d1fe)
…heckOnlineRegionsReport to tryLock (apache#5442) * To prevent threads from being blocked by the lock of RegionStateNode, modify the way of acquiring the RegionStateNode lock in checkOnlineRegionsReport to tryLock. Co-authored-by: lvhaiping.lhp <[email protected]> Signed-off-by: Duo Zhang <[email protected]>
…heckOnlineRegionsReport to tryLock (apache#5442) * To prevent threads from being blocked by the lock of RegionStateNode, modify the way of acquiring the RegionStateNode lock in checkOnlineRegionsReport to tryLock. Co-authored-by: lvhaiping.lhp <[email protected]> Signed-off-by: Duo Zhang <[email protected]> (cherry picked from commit e07d1fe) (cherry picked from commit cd3a83f) Change-Id: Ib90080d08c703c30c25e52faa0a597fc3970bc62
To prevent threads from being blocked by the lock of RegionStateNode, modify the way of acquiring the RegionStateNode lock in checkOnlineRegionsReport to tryLock.