-
Notifications
You must be signed in to change notification settings - Fork 15
[TS | LIP-164000] Reset Sweep Progress #5277
Conversation
Generate changelog in
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine, but I'm not really sure why we want to stop sweep in addition to resetting. Though it's a good forcing function to make people remove the flag so I guess that's the reason.
@@ -135,6 +135,32 @@ private long increaseValueFromToAtLeast(ShardAndStrategy shardAndStrategy, long | |||
return currentValue; | |||
} | |||
|
|||
public void resetProgressForShard(ShardAndStrategy shardAndStrategy) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh, not a fan of the code duplication, but it's different enough to make it awkward to reuse. We could just delete the entry to make this much simpler but I assume we want HA -- not that sweep will work if we don't have delete consistency anyway...
conservativeScheduler.scheduleBackgroundThreads(); | ||
thoroughScheduler.scheduleBackgroundThreads(); | ||
if (shouldResetAndStopSweep) { | ||
log.warn("This AtlasDB node is operating in a mode where it is attempting to reset the progress of " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this really what we want? We could just allow it to start sweeping. Also the nodes on old versions will still be attempting to sweep anyway
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline: this centers around the behaviour of targeted sweep where nodes CAS the bound from (thing I read -> my progress) repeatedly. We need to wait for all nodes to report that they're done with this.
doThrow(new CheckAndSetException("sadness")).when(mockKvs).checkAndSet(any()); | ||
ShardProgress instrumentedProgress = new ShardProgress(mockKvs); | ||
|
||
assertThatCode(() -> instrumentedProgress.resetProgressForShard(CONSERVATIVE_TEN)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably want to verify that it only throws once the value actually repeats
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think this is tested by stopsTryingToResetIfSomeoneElseDid()
above - in practice you could check the CheckAndSetException's actual values but we don't get that here because of mocks, so I don't see another way of easily doing this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OH I meant just verify that the method was called 4 times
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added!
Discussed offline: document why we need to stop running sweep and wait for all nodes to reset |
Released 0.302.5 |
Goals (and why):
Implementation Description (bullets):
Testing (What was existing testing like? What have you done to improve it?): Added some unit tests.
Concerns (what feedback would you like?):
Where should we start reviewing?: ShardProgress
Priority (whenever / two weeks / yesterday): this week