HBASE-27551 Add config options to delay assignment to retain last region location #4945
Conversation
To clarify: we try to assign the region with queueAssign before we find out the region server is dead, so it will not issue the SCP, right?
long retries = 0;
while (!isOnline && retries < forceRegionRetainmentRetries) {
  try {
    synchronized (this) {
Does it mean there may be multiple procedures trying to use this checkAndWaitForOriginalServer?
Not really, this would only be called by each TRSP individually. Removing this synchronized block in the next commit.
Not really, in fact an SCP triggers a TRSP, which will call checkAndWaitForOriginalServer.
Force-pushed from 760d29a to 87fc143
public static final String FORCE_REGION_RETAINMENT_WAIT =
  "hbase.master.scp.retain.assignment.force.wait";

public static final long DEFAULT_FORCE_REGION_RETAINMENT_WAIT = 500;
I first read this config as how long we should wait for this server to come online. However, the code uses it to sleep for this period between checks, so the overall wait time is FORCE_REGION_RETAINMENT_WAIT * FORCE_REGION_RETAINMENT_RETRIES.
From the operator's perspective, I think it is better to manage this with an overall wait time.
A single overall wait time would be less efficient: we would forcibly wait for the whole of the given timeout even if the RS comes back online much sooner than that. Maybe it is better to leave it as is, but put a comment on this constant to avoid confusion?
Documentation/comment is definitely good to have here. Is it costly to check the RS's availability? Maybe hbase.master.scp.retain.assignment.force.wait can be set to a small value so configuring it might not be that important.
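To make the interval-vs-total-time distinction concrete, here is a rough sketch of how the effective upper bound on the wait derives from the two settings; the retries default shown is hypothetical and not taken from this PR:

// Sketch only: assumes conf is the master's Hadoop Configuration.
long waitIntervalMillis = conf.getLong(FORCE_REGION_RETAINMENT_WAIT,
  DEFAULT_FORCE_REGION_RETAINMENT_WAIT); // per-check sleep, 500 ms by default
long maxRetries = conf.getLong(FORCE_REGION_RETAINMENT_RETRIES, 600); // 600 is a hypothetical default
// Worst case the TRSP waits interval * retries; it exits early as soon as the RS reports in.
long maxTotalWaitMillis = waitIntervalMillis * maxRetries;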
Will take a look in the next few days. Please give me some time~
Change-Id: I58eba4f382974f73167fc8cd2bf1bb4f3d5be0a9
Force-pushed from 1ff0b08 to 69db293
throws ProcedureSuspendedException {
  boolean isOnline = serverManager.findServerWithSameHostnamePortWithLock(lastHost) != null;
  long retries = 0;
  while (!isOnline && retries < forceRegionRetainmentRetries) {
In this way the procedure will hang here for a very long time without releasing the procedure worker. Better to suspend the procedure and reschedule it again later.
Yes. Thanks for the suggestion, I have changed this in the last commit.
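For context, a minimal sketch of the suspend-and-reschedule pattern commonly used by HBase procedures; the method names reflect my reading of the procedure framework rather than the exact code in this PR:

// In execute(): arm a timeout and suspend instead of sleeping on the worker thread.
setTimeout(waitIntervalMillis);
setState(ProcedureProtos.ProcedureState.WAITING_TIMEOUT);
skipPersistence();
throw new ProcedureSuspendedException();

// When the timeout fires, wake the procedure instead of treating it as a failure.
@Override
protected synchronized boolean setTimeoutFailure(MasterProcedureEnv env) {
  setState(ProcedureProtos.ProcedureState.RUNNABLE);
  env.getProcedureScheduler().addFront(this);
  return false;
}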
@@ -126,6 +144,14 @@
  private boolean isSplit;

  private boolean forceRegionRetainment;

  private ServerManager serverManager;
ServerManager can be obtained from MasterProcedureEnv, so we do not need to store it here.
The problem here is that we don't keep any MasterProcedureEnv reference as a class attribute.
When executing you will always have a MasterProcedureEnv...
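In other words, something like the following; a sketch of the suggestion, using the accessors I believe MasterProcedureEnv exposes:

// No ServerManager field needed: fetch it from the env passed into the execute path.
ServerManager serverManager = env.getMasterServices().getServerManager();
boolean isOnline =
  serverManager.findServerWithSameHostnamePortWithLock(lastHost) != null;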
@@ -126,6 +144,14 @@
  private boolean isSplit;

  private boolean forceRegionRetainment;
None of the fields below should be stored here. When reloading, we will use the default constructor to create the procedure and the deserialize method to restore its fields, so if you want to store them here you need to serialize them, or you should implement the afterReplay method to initialize them...
Thanks for pointing this out. Since these are all config readings, I decided to go with the afterReplay overriding option.
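A minimal sketch of that option; the helper name is illustrative, not the PR's actual method:

@Override
protected void afterReplay(MasterProcedureEnv env) {
  // Config-derived fields are not part of the serialized procedure state,
  // so re-read them when the procedure is reloaded from the procedure store.
  initForceRetainmentConfig(env.getMasterConfiguration()); // hypothetical helper
}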
Looking at other configs in TRSP, I prefer we store these configs in AssignmentManager, just like
private final boolean shouldAssignRegionsWithFavoredNodes;
private final int assignDispatchWaitQueueMaxSize;
private final int assignDispatchWaitMillis;
private final int assignMaxAttempts;
private final int assignRetryImmediatelyMaxAttempts;
Change-Id: Ie13da5dfad3969c1477eac0e37f808051c884bff
Hi @Apache9 @petersomogyi @taklwu, any other comments/suggestions?
LGTM
retries++;
LOG.info("Suspending the TRSP PID={} because {} is true and previous host {} "
  + "for region is not yet online.", this.getProcId(), FORCE_REGION_RETAINMENT, lastHost);
setTimeout(forceRegionRetainmentWait);
So the forceRegionRetainmentWait here is an interval, not the total time?
Yes, the wait interval between attempts, not the total time.
Thanks for the comments, @Apache9, I have applied your latest suggestions. Let me know if this is good to go.
public static final boolean DEFAULT_FORCE_REGION_RETAINMENT = false;

/** The wait time in millis before checking again if the region's previous RS is back online */
public static final String FORCE_REGION_RETAINMENT_WAIT =
Better to call it a wait interval?
Ack
retries++;
LOG.info("Suspending the TRSP PID={} because {} is true and previous host {} "
  + "for region is not yet online.", this.getProcId(), FORCE_REGION_RETAINMENT, lastHost);
setTimeout(env.getAssignmentManager().getForceRegionRetainmentWait());
So here we do not want to use exponential backoff?
Switched to use RetryCounter and exponential backoff.
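For reference, a sketch of that change, assuming the RetryCounter helpers used elsewhere in the procedure code (ProcedureUtil.createRetryCounter, getBackoffTimeAndIncrementAttempts); treat the exact calls as an approximation rather than the committed code:

if (retryCounter == null) {
  // Backoff policy (base sleep, max sleep, etc.) comes from the master configuration.
  retryCounter = ProcedureUtil.createRetryCounter(env.getMasterConfiguration());
}
// Each suspension waits longer than the previous one, instead of a fixed interval.
long backoffMillis = retryCounter.getBackoffTimeAndIncrementAttempts();
setTimeout(Math.toIntExact(backoffMillis)); // then suspend as in the earlier sketch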
Thanks for the latest suggestions @Apache9, I have pushed a new commit addressing those.
Hi @Apache9, just checking if the latest commit is good to go.
…ion location (#4945) Signed-off-by: Tak Lon (Stephen) Wu <[email protected]> Signed-off-by: Duo Zhang <[email protected]>
…ion location (apache#4945) Signed-off-by: Tak Lon (Stephen) Wu <[email protected]> Signed-off-by: Duo Zhang <[email protected]> Change-Id: Ie92e5f00e9c9efacb939f9810002a9a1edeeb3ff