RFC: quiesce ranges [postponed] #8811
Review status: 0 of 1 files reviewed at latest revision, 3 unresolved discussions, some commit checks failed.
docs/RFCS/quiesce_ranges.md, line 16 [r1] (raw file):
Consider dropping this second "potentially"
docs/RFCS/quiesce_ranges.md, line 59 [r1] (raw file):
theoretically
docs/RFCS/quiesce_ranges.md, line 62 [r1] (raw file):
What happens on the raft layer when a killed node is restarted? How does it figure out that a range is quiesced, and how does this interact with the lazy replica initialization? I think it could be useful to include these scenarios in the explanation.
more than the election timeout, it was going to call the election anyway, thus
ending the term that quiesced.

Once quiesced, any replica receiving a request should restart raft and trigger
s/receiving a request/asked to propose a command/
Consider also adding some discussion on the urgency of this proposal - I think it's fairly low compared to some other, more immediate problems we have with Raft performance: we bottleneck when many replicas make substantial progress simultaneously, in particular when snapshots are involved (#8638).
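To make the suggested wording concrete, here is a minimal sketch of "wake on proposal" rather than "wake on any request". The `replica` and `request` types and the `needsProposal` flag are hypothetical and exist only for illustration; they are not CockroachDB or raft APIs, and the sketch assumes that a request which needs no proposal can leave the group quiesced.

```go
package main

// request is a hypothetical incoming request. needsProposal marks requests
// that must go through raft (an assumption made for this sketch).
type request struct {
	cmd           string
	needsProposal bool
}

// replica is a hypothetical stand-in for a range replica.
type replica struct {
	quiesced bool     // true while the raft group is neither ticking nor heartbeating
	wakeups  int      // how many times raft was restarted
	pending  []string // commands queued for proposal
}

// handle wakes the raft group only when a command has to be proposed.
// Requests that need no proposal leave the quiesced state untouched.
func (r *replica) handle(req request) {
	if !req.needsProposal {
		return
	}
	if r.quiesced {
		r.quiesced = false
		r.wakeups++
	}
	r.pending = append(r.pending, req.cmd)
}
```

Under these assumptions, a quiesced replica stays quiesced across any number of non-proposing requests and wakes exactly once on the first command it has to propose.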
This almost certainly needs to be implemented at least partially upstream in
raft -- inspecting the follower state to determine when it is safe to stop
heartbeating involves inspecting internal raft state.
Could add here that this feature may not be received enthusiastically in upstream Raft, as we're the only multi-range use case and the cost of letting a single Raft group live is marginal.
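For readers wondering what "inspecting the follower state" might involve, here is a rough sketch of one plausible check a leader could run before it stops heartbeating. The `groupState` type and `canQuiesce` function are hypothetical, and the condition (everything committed, every follower caught up to the leader's last index) is an assumption for illustration, not something the RFC or upstream raft specifies.

```go
package main

// groupState is a hypothetical snapshot of what a leader knows about its
// raft group. It is not a type from any raft library.
type groupState struct {
	lastIndex uint64            // leader's last log index
	committed uint64            // leader's commit index
	match     map[uint64]uint64 // follower ID -> highest index known to be replicated there
}

// canQuiesce reports whether the leader could stop heartbeating under the
// assumed condition: nothing is uncommitted and no follower is behind.
func canQuiesce(s groupState) bool {
	if s.committed != s.lastIndex {
		return false // uncommitted entries are still in flight
	}
	for _, m := range s.match {
		if m != s.lastIndex {
			return false // at least one follower still needs entries
		}
	}
	return true
}
```

A real check would likely need more than this (for example, making sure followers have also learned the commit index), which is part of why it reaches into internal raft state.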
Review status: 0 of 1 files reviewed at latest revision, 10 unresolved discussions, some commit checks failed.
docs/RFCS/quiesce_ranges.md, line 65 [r1] (raw file):
Reviewed 1 of 1 files at r1.
docs/RFCS/quiesce_ranges.md, line 29 [r1] (raw file):
being able to assume
Context for people who missed discussion with @mjibson and @tschotdorf last week: So while the potential resource usage savings haven't yet seemed like enough, on their own, to make this a priority, its potential in building other things is why we wanted to explore it further now (though someone mentioning an idle cluster burning hundreds of gigs of network recently seems to suggest those savings might be worth something on their own too).
Review status: 0 of 1 files reviewed at latest revision, 11 unresolved discussions, some commit checks pending.
docs/RFCS/quiesce_ranges.md, line 16 [r1] (raw file):
The document is written from the perspective that this is an optimization; if the primary motivation is now bulk ingestion, that should be reflected in the doc. Quiescing-as-optimization and quiescing-for-bulk-ingestion have rather different needs. As discussed below, raft-level quiescing is fragile since there's no great way to synchronize it across all nodes and the slightest breeze will bring the raft group back. That's fine if it's an optimization, but not if you're semantically relying on it. I think for bulk ingestion it would be better to base things on the Freeze command and avoid low-level raft tinkering.
Review status: 0 of 1 files reviewed at latest revision, 9 unresolved discussions, some commit checks failed.
To some degree I was hoping quiescing could be used both by bulk ingestion and as a resource usage optimization -- the machinery discussed herein for how to quiesce a range is hopefully general enough that it could be used either by a "looks like this range is inactive so I can save some cycles" routine or by an explicit "I need this range to freeze its raft state so I can do something" bulk ingestion. The current freeze doesn't quite work for some of the current proposals for the latter, as they involve simple snapshots that happen to correspond to an applied index in the future, and thus need to be sure raft is inactive and that index stays in the future -- but all of that I believe is covered in more detail in the forthcoming bulk-ingestion-of-ranges RFC that @paperstreet is writing -- this just got broken into a standalone RFC that that one could reference and so we could gauge scope, particularly since it might have utility outside bulk ingestion as well.
Review status: 0 of 1 files reviewed at latest revision, 11 unresolved discussions, some commit checks failed.
Having discussed this a fair bit, I think that quiescing for restore purposes is a long shot (or at least so complicated that we're better off pondering alternatives), so I think that we're best off focusing here on the merit quiescing could have outside of that context.
Review status: 0 of 1 files reviewed at latest revision, 11 unresolved discussions, some commit checks failed.
Ok, sounds like we've gathered more or less what we wanted to know for backup/restore -- quiescing active ranges seems tricky enough and the other benefits questionable enough that we might be better off exploring other ways for bulk ingestion first. Should I change status to rejected and merge, or just close unmerged?
Merge as rejected, or maybe as "postponed" or "deferred". On Wed, Aug 31, 2016, 10:10 PM David Taylor [email protected]
done.