kvserver,cli,roachtest,sql: introduce a fully decommissioned bit #50329
Conversation
Force-pushed from 579ab5f to eb323ef.
Thanks Irfan! Just got through the first round of review. Summarizing the main work items that I see (they're also in individual comments):
- It seems better to reuse the `decommissioning` bool (make it an enum). I saw all of this compatibility logic and it looks way more complicated than I hope it needs to be. Reusing the field, all you have to do is put a cluster version check that we're never writing enum values other than `0` and `1` until we have the new version active, and none of the consumers of liveness statuses really need special logic.
- Worried that `commission` is not an intuitive choice. My strawman is to name this the "membership status", which is either "active", "decommissioning", or "decommissioned". I think that users seeing that in their output would intuitively know what that is more than "commission status: commissioning".
Also, in the diagram, it looks like there's an arrow from "decommissioned" to "decommissioning" which should not be there.
It may also be a good time to write a mixed-version decommissioning test, perhaps as a replacement for the acceptance test which dropped out due to needing (why?) more nodes. This could be kept relatively simple: set up a four node cluster in which two nodes are new and two are old, then pick a random node to decommission and check that it goes through without a hitch (using the old binary as cli; I think you'll have a hard time getting the new one to work since it expects the new column to exist). `runVersionUpgrade` is a good template to follow for mixed-version stuff. (I'd be fine with not doing all of the cli output parsing, which is super nasty, but I don't see an obvious better way.)
You may want to pull out the deprecation of the nodelivenessstatus proto into a separate PR so that it's out of your hair here. I would prefer to keep this minimal (just a rename) and to cut down on its uses only when you are ready to focus on it (i.e. probably not now).
Reviewed 11 of 11 files at r1, 27 of 27 files at r2, 2 of 2 files at r3, 19 of 19 files at r4, 4 of 4 files at r5.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @irfansharif and @knz)
pkg/cli/demo_cluster.go, line 375 at r4 (raw file):
```go
req := &serverpb.CommissionRequest{
	NodeIDs:          []roachpb.NodeID{nodeID},
	CommissionStatus: kvserverpb.CommissionStatusFromBooleanForm(decommissioning),
```

Why `Form`? Isn't `CommissionStatusFromBool` good enough?
pkg/cli/node.go, line 66 at r4 (raw file):
```go
conn, makeQuery(`SELECT node_id FROM crdb_internal.gossip_liveness WHERE commission_status = 'commissioned' OR split_part(expiration,',',1)::decimal > now()::decimal`),
```
Add a release note that the 20.2 cli `node` family of commands will not work with older versions, but that older versions will work against 20.2.
pkg/cli/node.go, line 340 at r4 (raw file):
```go
req := &serverpb.CommissionRequest{
	NodeIDs:          nodeIDs,
	CommissionStatus: kvserverpb.CommissionStatus_DECOMMISSIONING,
```
If a 20.2 cli were used against a 20.1 node, it would always recommission, which is bad.
I think this problem goes away if you re-use the field as I suggested in another comment.
pkg/cli/node.go, line 361 at r4 (raw file):
```go
for _, status := range resp.Status {
	replicaCount += status.ReplicaCount
	allDecommissioning = allDecommissioning && status.CommissionStatus.Decommissioning() ||
```

Uh, oh. `false && false || true == true`, so this is a bug. You want `false && (false || true)`.
pkg/cli/node.go, line 364 at r4 (raw file):
```go
		status.CommissionStatus.Decommissioned()
}
if replicaCount == 0 {
```

You would never want to enter this if `!allDecommissioning`, no?
pkg/cli/node.go, line 365 at r4 (raw file):
```go
}
if replicaCount == 0 {
	// We now mark the node as fully decommissioned.
```

node(s)
pkg/cli/node.go, line 389 at r4 (raw file):
```go
}
if wait == nodeDecommissionWaitNone {
	// We don't bother with marking the node as fully decommissioned.
```
But you may already have done it? I think you need to make sure you never do it in that case.
Let's give this method a real close look, since it seems really easy to introduce bugs here.
pkg/cli/node.go, line 419 at r4 (raw file):
```go
strconv.FormatBool(node.IsLive),
strconv.FormatInt(node.ReplicaCount, 10),
strconv.FormatBool(node.CommissionStatus.Decommissioning() || node.CommissionStatus.Decommissioned()),
```
All of these too will be better off with the enum reusing the bool field. Right now if you use a new cli against old node you'll just get wrong answers. I'm ok breaking compatibility here but we shouldn't silently return wrong data. Problem will disappear with enum reuse, luckily.
pkg/cli/node.go, line 446 at r4 (raw file):
```go
// messages generated by cockroach.
func extractCommissionErr(errMsg string) string {
	idx := strings.Index(errMsg, kvserverpb.ErrIllegalCommissionStatusTransition.Error())
```

This returns -1 if not found, in which case you'll explode below. I see now that you guard against that at the caller, but move that check in here. If you can't match, return "".
pkg/cli/node.go, line 474 at r4 (raw file):
```go
// If it's a specific illegal commission transition error, we try to
// surface a more readable message to the user.
if strings.Contains(err.Error(), kvserverpb.ErrIllegalCommissionStatusTransition.Error()) {
```

Isn't there a better way to solve this? You can return a structured grpc error:

```go
err := status.Error(codes.FailedPrecondition, "something something blah blah")
```

and then later

```go
if s, ok := status.FromError(err); ok && s.Code() == codes.FailedPrecondition {
	fmt.Println(s.Message())
}
```
pkg/cli/interactive_tests/test_demo_node_cmds.tcl, line 70 at r4 (raw file):
```tcl
send "select node_id, draining, decommissioning, commission_status from crdb_internal.gossip_liveness ORDER BY node_id;\r"
eexpect "1 | false | false | commissioned"
```

It's been lingering in my mind throughout the review, and I think for a good reason - I'm not sure `commissioned` really is the most intuitive choice here. It looks like a verb - like we commissioned a node, i.e. we ordered one, it's bound to arrive any time soon. I feel that this might be confusing to users, especially when we have some very obvious choices available, like `active`. Consider switching to that.
pkg/cli/interactive_tests/test_multiple_nodes.tcl, line 24 at r3 (raw file):
start_test "Check that a double decommission prints out a warning"
Do we want to maintain these warnings though? In that case it seems wise to remove the code only when we're ready to replace it with new code. It seems that this PR is trying to do two things at once - add the decommission bit and deprecating the liveness status thing, but it's not really following through on the latter, so I would prefer if we kept that to the minimum and followed through in a separate PR.
pkg/cmd/roachtest/acceptance.go, line 38 at r4 (raw file):
```go
{name: "build-analyze", fn: runBuildAnalyze},
{name: "cli/node-status", fn: runCLINodeStatus},
{name: "decommission", fn: runDecommissionAcceptance},
```
?
pkg/cmd/roachtest/decommission.go, line 253 at r4 (raw file):
```go
r.Add(testSpec{
	// This test was originally an acceptance test, we preserve its name.
	Name: "acceptance/decommission",
```
There's no issue open for it, so let's rename it now.
pkg/cmd/roachtest/decommission.go, line 279 at r4 (raw file):
```go
// runDecommissionAcceptance tests a bunch of node
// decommissioning/recommissioning procedures, all the while checking for
// replica movement and appropriate commission status detection behavior.
```
It's great that you're updating this test, but I got a little lost in what it's actually doing now (and why it needs so many nodes). Could you provide a high-level overview of what operations it carries out here in the top comment? I know this may go stale at some point but it will still be better than having to trawl through the code every time.
In particular, are we at any point decommissioning multiple nodes at once (i.e. with one cli invocation)? If not that would be good to add now that we can.
Did you verify that this still works against <= 20.1? I'd be sort of surprised if it did, but we only have one decommissioning test and so it will have to.
pkg/kv/kvserver/node_liveness.go, line 49 at r4 (raw file):
// conditionally write the target commissioning status. It's safe to retry // when encountering this error. errChangeCommissionStatusFailed = errors.New("failed to change the commissioning status")
Similar concern about commissioning status
(which pops up in a number of column headers). What about membership
status? After all, decommissioning is all about removing members from a cluster and in the context of long-running operations, we are trying to reason about cluster membership in an even stricter sense (who can access the cluster as opposed to who is tracked as a member)
pkg/kv/kvserver/node_liveness.go, line 360 at r4 (raw file):
```go
if errors.Is(err, errChangeCommissionStatusFailed) {
	// Expected when epoch incremented, or when running in mixed version
	// clusters. It's safe to retry.
```

This shouldn't fail in mixed-version clusters. I don't think it would when you take my suggestion to merge the fields. The only caveat is that we are not allowed to mark as `decommissioned`, but the `cli` should just special-case the mixed version error there and print a warning to the client (but succeed anyway).
pkg/kv/kvserver/node_liveness.go, line 382 at r4 (raw file):
```go
}()

// Lets compute what our new liveness record should be.
```

Let's
pkg/kv/kvserver/node_liveness.go, line 395 at r4 (raw file):
// We may have read a liveness record written by a v20.1 node, so we // reconcile. newLiveness.EnsureCompatible()
Just a heads up that I'm going to skip lightly over this mixed version stuff for now since I expect it to change.
pkg/kv/kvserver/node_liveness.go, line 438 at r4 (raw file):
```go
// liveness record. It has both the new and the old version of the proto.
type livenessUpdate struct {
	new kvserverpb.Liveness
```

`new` is also a keyword, let's avoid that.
pkg/kv/kvserver/kvserverpb/liveness.go, line 71 at r4 (raw file):
```go
// authoritative if not.
//
// TODO(irfansharif): Remove this once v20.2 is cut.
```
Could this really be removed? Nodes that never join the cluster with 20.2 will never update their liveness, so you'd have to retain some understanding of the old format ~forever (or write an explicit migration).
I think this problem will go away when you merge the fields.
pkg/kv/kvserver/kvserverpb/liveness.proto, line 51 at r2 (raw file):
```protobuf
// fully decommissioned. It is deprecated in favor of CommissionStatus, and is
// slated for removal in v21.1.
bool deprecated_decommissioning = 5;
```

The protobuf encoding uses varint encoding for both bools and enums, so you don't need to add a new field. You can change the type and make sure that the enum has `0` as `Commissioned` and `1` as `decommissioning`. And of course you cannot use any other value unless the cluster version is active.
pkg/kv/kvserver/kvserverpb/liveness.proto, line 128 at r2 (raw file):
```protobuf
// across these states (LivenessStatus for e.g.) seems wholly arbitrary.
enum NodeLivenessStatus {
	DEPRECATED_UNKNOWN = 0;
```

Next time, make sure you put pure renames like that in a separate commit. There were three things going on in this one, which was mildly unfortunate.
pkg/server/server.go, line 1807 at r4 (raw file):
```go
	eventType = sql.EventLogNodeDecommissioned
} else if targetStatus.Commissioned() {
	eventType = sql.EventLogNodeCommissioned
```
And here you'd check the cluster version.
pkg/server/server.go, line 1809 at r4 (raw file):
```go
	eventType = sql.EventLogNodeCommissioned
} else {
	panic("unexpected target commission status")
```
We need to avoid panicking whenever we can. Here it's easy to return an error.
pkg/server/serverpb/admin.proto, line 879 at r2 (raw file):
```protobuf
//
// The name of this method, Decommission, is a misnomer for legacy
// reasons. It should be thought of as `Commission` instead.
```

I would prefer if we just stuck with the old name throughout. `CommissionRequest` sounds like a request to "commission" a node, i.e. to mark it as commissioned, and thus doesn't really seem preferable to `DecommissionRequest` (which at least will be used for its name most of the time).
pkg/sql/crdb_internal.go, line 2742 at r4 (raw file):
```sql
draining BOOL NOT NULL,
decommissioning BOOL NOT NULL,
commission_status STRING NOT NULL,
```

Looking at this, I think `membership: active` would have a good intuitive ring to it.
pkg/sql/event_log.go, line 90 at r4 (raw file):
```go
// EventLogNodeDecommissioned is recorded when a node is marked as
// decommissioned.
EventLogNodeDecommissioned EventLogType = "node_decommissioned"
```
Is there a migration concern with this change? Not really right? They're only ever used as strings.
We mark the NodeLivenessStatus proto as deprecated. It's unclear if the enum was ever well considered. It enumerates across two distinct sets of things: the "membership" status (live/active, decommissioning, decommissioned), and the node "process" status (live, unavailable, available). It's possible for two of these "states" to be true simultaneously (consider a decommissioned, dead node). It makes for confusing semantics, and the code attempting to disambiguate across these states (kvserver.LivenessStatus() for e.g.) seems wholly arbitrary.

It's currently used by the Allocator to detect replicas placed on dead nodes. But there's no good reason for the "deadness" to be proto-backed. It should suffice to capture the "membership" status and a view of the node's liveness (when the record was last heartbeated) + current time to inform allocator decisions.

This was pulled out of cockroachdb#50329. Release note: None.
Force-pushed from 2f797ba to 56f6c85.
Just an update here: I've addressed everything brought up so far (pushing the raw form to run CI) save for the mixed version test. I tried it out manually and it seems to work cleanly now that we've upgraded the bool instead of introducing a new enum. I'm going to do some leg work tomorrow to clean it back up to be reviewable again (+ add the mixed version test), in addition to rebasing it atop #50478. I'll ping back on both issues once done.
...and discourage further use of it altogether. This came up in trying to introduce a new enum type for node membership (active, decommissioning, decommissioned), and realizing the symbols were already used by `NodeStatusLiveness`. On investigating further, it doesn't seem like we should be using NodeStatusLiveness anymore. It's unclear if the enum was ever well considered. It enumerates across two distinct sets of things: the "membership" status (live/active, decommissioning, decommissioned), and the node "process" status (live, unavailable, available). It's possible for two of these "states" to be true simultaneously (consider a decommissioned, dead node). It makes for confusing semantics, and the code attempting to disambiguate across these states (kvserver.LivenessStatus() for e.g.) seems wholly arbitrary. See cockroachdb#50707 for greater detail. Short of replacing the usage of NodeStatusLiveness, we're adding some text pointing to cockroachdb#50707 to discourage further usage. We also rename the enums to be able to reuse those symbols in introducing the node membership enum. This was pulled out of cockroachdb#50329. Release note: None.
Force-pushed from 56f6c85 to 607045f.
50478: kvserver,server: discourage further use of NodeStatusLiveness r=irfansharif a=irfansharif We mark the NodeLivenessStatus proto as deprecated. It's unclear if the enum was ever well considered. It enumerates across two distinct sets of things: the "membership" status (live/active, decommissioning, decommissioned), and the node "process" status (live, unavailable, available). It's possible for two of these "states" to be true simultaneously (consider a decommissioned, dead node). It makes for confusing semantics, and the code attempting to disambiguate across these states (kvserver.LivenessStatus() for e.g.) seems wholly arbitrary. It's currently used by the Allocator to detect replicas placed on dead nodes. But there's no good reason for the "deadness" to be proto-backed. It should suffice to capture the "membership" status and a view of the node's liveness (when the record was last heartbeated) + current time to inform allocator decisions. This was pulled out of #50329. Release note: None. Co-authored-by: irfan sharif <[email protected]>
Force-pushed from 607045f to 294c695.
Thanks for the review, Tobi. I ended up re-writing most of it, so I'd just look at the last few commits. (It's basically a new PR, and partly why I've been dragging my feet with this one.)
Also added randomized multi-version and regular decomm/recomm tests.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @knz and @tbg)
pkg/cli/demo_cluster.go, line 375 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
why `Form`? Isn't `CommissionStatusFromBool` good enough?
Done (ended up removing entirely).
pkg/cli/node.go, line 66 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Add a release note that the 20.2 cli `node` family of commands will not work with older versions, but that older versions will work against 20.2.
Done.
pkg/cli/node.go, line 340 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
If a 20.2 cli were used against a 20.1 node, it would always recommission, which is bad.
I think this problem goes away if you re-use the field as I suggested in another comment.
Done.
pkg/cli/node.go, line 361 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Uh, oh. `false && false || true == true`, so this is a bug. You want `false && (false || true)`.
Done (re-wrote to be more readable).
pkg/cli/node.go, line 364 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
You would never want to enter this if `!allDecommissioning`, no?
Done.
pkg/cli/node.go, line 365 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
node(s)
Done.
pkg/cli/node.go, line 389 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
But you may already have done it? I think you need to make sure you never do it in that case.
Let's give this method a real close look, since it seems really easy to introduce bugs here.
The comment below was off. I've reworked this method a bit to make it clear what the intent was, PTAL.
pkg/cli/node.go, line 419 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
All of these too will be better off with the enum reusing the bool field. Right now if you use a new cli against old node you'll just get wrong answers. I'm ok breaking compatibility here but we shouldn't silently return wrong data. Problem will disappear with enum reuse, luckily.
Done (yikes, didn't even notice this one).
pkg/cli/node.go, line 446 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
This returns -1 if not found, in which case you'll explode below.
I see now that you guard against that at the caller, but move that check in here. If you can't match, return "".
Removed entirely.
pkg/cli/node.go, line 474 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Isn't there a better way to solve this? You can return a structured grpc error:

```go
err := status.Error(codes.FailedPrecondition, "something something blah blah")
```

and then later

```go
if s, ok := status.FromError(err); ok && s.Code() == codes.FailedPrecondition {
	fmt.Println(s.Message())
}
```
Done (TIL; my first attempt here was to look at the cockroachdb/errors package, but it seems that we never unflatten errors off the wire, so I didn't know what else to try).
pkg/cli/interactive_tests/test_demo_node_cmds.tcl, line 70 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
It's been lingering in my mind throughout the review, and I think for a good reason - I'm not sure `commissioned` really is the most intuitive choice here. It looks like a verb - like we commissioned a node, i.e. we ordered one, it's bound to arrive any time soon. I feel that this might be confusing to users, especially when we have some very obvious choices available, like `active`. Consider switching to that.
Done.
pkg/cli/interactive_tests/test_multiple_nodes.tcl, line 24 at r3 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Do we want to maintain these warnings though? In that case it seems wise to remove the code only when we're ready to replace it with new code. It seems that this PR is trying to do two things at once - add the decommission bit and deprecating the liveness status thing, but it's not really following through on the latter, so I would prefer if we kept that to the minimum and followed through in a separate PR.
Separated it out in #50478.
pkg/cmd/roachtest/acceptance.go, line 38 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
?
The test was moved into decommission.go, and renamed.
pkg/cmd/roachtest/decommission.go, line 253 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
There's no issue open for it, so let's rename it now.
Done.
pkg/cmd/roachtest/decommission.go, line 279 at r4 (raw file):
Reworked this test to be a bit clearer, and added multi-node decommissioning checks (+ comments).
Did you verify that this still works against <= 20.1?
I don't follow, why would this need to? Wouldn't we run the version of this roachtest present in 20.1 code to run against 20.1 code?
pkg/kv/kvserver/node_liveness.go, line 49 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Similar concern about `commissioning status` (which pops up in a number of column headers). What about `membership` status? After all, decommissioning is all about removing members from a cluster, and in the context of long-running operations, we are trying to reason about cluster membership in an even stricter sense (who can access the cluster as opposed to who is tracked as a member).
Done.
pkg/kv/kvserver/node_liveness.go, line 360 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
This shouldn't fail in mixed-version clusters. I don't think it would when you take my suggestion to merge the fields. The only caveat is that we are not allowed to mark as `decommissioned`, but the `cli` should just special-case the mixed version error there and print a warning to the client (but succeed anyway).
Merged the fields, and succeed anyway now in mixed version cluster settings.
pkg/kv/kvserver/node_liveness.go, line 382 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Let's
Done.
pkg/kv/kvserver/node_liveness.go, line 395 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Just a heads up that I'm going to skip lightly over this mixed version stuff for now since I expect it to change.
Done.
pkg/kv/kvserver/node_liveness.go, line 438 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
`new` is also a keyword, let's avoid that.
Done.
pkg/kv/kvserver/kvserverpb/liveness.go, line 71 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Could this really be removed? Nodes that never join the cluster with 20.2 will never update their liveness, so you'd have to retain some understanding of the old format ~forever (or write an explicit migration).
I think this problem will go away when you merge the fields.
Done.
pkg/kv/kvserver/kvserverpb/liveness.proto, line 51 at r2 (raw file):
Done. This cleaned a lot of things up; I didn't realize it was possible. I was looking at https://developers.google.com/protocol-buffers/docs/proto#updating:

"enum is compatible with int32, uint32, int64, and uint64 in terms of wire format"

And took it at face value, given that bool was not explicitly mentioned. But I see right above:

"int32, uint32, int64, uint64, and bool are all compatible"
Would've saved myself a lot of headache.
pkg/kv/kvserver/kvserverpb/liveness.proto, line 128 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Next time, make sure you put pure renames like that in a separate commit. There were three things going on in this one, which was mildly unfortunate.
Did you mean separate PR? I had pulled in the proto renames in the first commit, but agreed that the NodeLivenessStatus deprecation was entirely orthogonal (#50478).
pkg/server/server.go, line 1807 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
And here you'd check the cluster version.
Done.
pkg/server/server.go, line 1809 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
We need to avoid panicking whenever we can. Here it's easy to return an error.
This seems like a panic-worthy assertion, no? There are only three possible states defined for the targetStatus, and if we were to introduce more, we'd want to think about the events those would correspond to (so, a programming error if this was missed).
pkg/server/serverpb/admin.proto, line 879 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
I would prefer if we just stuck with the old name throughout. `CommissionRequest` sounds like a request to "commission" a node, i.e. to mark it as commissioned, and thus doesn't really seem preferable to `DecommissionRequest` (which at least will be used for its name most of the time).
Done.
pkg/sql/crdb_internal.go, line 2742 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Looking at this, I think `membership: active` would have a good intuitive ring to it.
Done.
pkg/sql/event_log.go, line 90 at r4 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Is there a migration concern with this change? Not really right? They're only ever used as strings.
Not as far as I can tell.
@knz, this is ready for you to take a look as well.
Force-pushed from 294c695 to 4115d75.
I just learned that roachtests need to be compatible across multiple versions of crdb, since they're built from master in our CI and run against earlier version builds. The original
in there for now given the test was written with the new recommissioning behavior in mind, but now we've lost decommission test coverage against v20.1/v19.2/v19.1 builds. Is that fine? I can revive the earlier version of
Force-pushed from 4115d75 to 33b01e1.
Fixes cockroachdb#51097. Fixes cockroachdb#51367. This is fallout from cockroachdb#50329: this test previously attempted to recommission a fully decommissioned node. It seems we relied on the decomm/recomm subsystems to trigger the replica gc operations that the test was then asserting on. It suffices to simply mark the nodes as decommissioning instead of fully decommissioning them. While here, I've re-written this test in the more stateful style of the `decommission-recommission` roachtest. Release note: None
51394: roachtest: deflake+improve `replicagc-changed-peers` test r=irfansharif a=irfansharif Fixes #51097. Fixes #51367. This is fallout from #50329: this test previously attempted to recommission a fully decommissioned node. It seems we relied on the decomm/recomm subsystems to trigger the replica gc operations that the test was then asserting on. It suffices to simply mark the nodes as decommissioning instead of fully decommissioning them. While here, I've re-written this test in the more stateful style of the `decommission-recommission` roachtest. Release note: None Co-authored-by: irfan sharif <[email protected]>
The usual problem with only the pointer implementing Stringer. Particularly important since NodeLiveness.GetLiveness() returns a value Liveness. It used to be implemented on values, but that changed in cockroachdb#50329 for some reason. Release note: None
55620: kvserver: implement Liveness.String() on value receiver r=andreimatei a=andreimatei The usual problem with only the pointer implementing Stringer. Particularly important since NodeLiveness.GetLiveness() returns a value Liveness. It used to be implemented on values, but that changed in #50329 for some reason. Release note: None Co-authored-by: Andrei Matei <[email protected]>
… not just availability Fixes cockroachdb#53515. We should have the autoupgrade process look at the fully decommissioned bit we added in cockroachdb#50329, instead of just looking at availability. It would avoid the hazard described in cockroachdb#53515. Previously the autoupgrade process was also looking at NodeStatusLiveness, which we've since soured upon (see cockroachdb#50478). Now that we always create a liveness record on start up (cockroachdb#53805), we can simply fetch all liveness records from KV. We add a helper to do this, which we'll also rely on in future PRs for other purposes. It's a bit unfortunate that we're further adding on to the NodeLiveness API without changing the caching structure, but fetching records from KV is the world we're hoping to move towards going forward. Release note: None
Ignore first commit, cherry-picked from @andreimatei's #49135.
This commit introduces a fully decommissioned bit to Cockroach.
Previously our Liveness schema only contained a decommissioning bool,
and consequently no ability to disambiguate between a node currently
undergoing decommissioning, and a node that was fully decommissioned. We
used some combination of store dead threshold to surface, in our UI,
"fully decommissioned" nodes, but it was never quite so. We need this
specificity for the Connect RPC.
We wire up the new CommissionStatus enum that's now part of the liveness
record. In doing so it elides usage of the decommissioning bool used in
v20.1. We're careful to maintain an on-the-wire representation of the
Liveness record that will be understood by v20.1 nodes (see
Liveness.EnsureCompatible for details).
We repurpose the AdminServer.Decommission RPC (which is now a misnomer,
should be thought of being named Commission instead) to persist
CommissionStatus states to KV through the lifetime of a node
decommissioning/recommissioning. See cli/node.go for where that's done.
For recommissioning a node, it suffices to simply persist a COMMISSIONED
state. When decommissioning a node, since it's a longer running process,
we first persist an in-progress DECOMMISSIONING state, and once we've
moved off all the Replicas in the node, we finalize the decommissioning
process by persisting the DECOMMISSIONED state.
When transitioning between CommissionStatus states, we CPut against
what's already there, disallowing illegal state transitions. The
appropriate error codes are surfaced back to the user. An example would
be in attempting to recommission a fully decommissioned node, in which
case we'd error out with the following:
Note that this is a behavioral change for `cockroach node recommission`.
Previously it was able to recommission any "decommissioned" node,
regardless of how long ago it was removed from the cluster. Now
recommission serves only to cancel an accidental decommissioning process
that wasn't finalized.
The 'decommissioning' column in crdb_internal.gossip_liveness is now
powered by this new CommissionStatus, and we introduce a new
commission_status column to it. We also introduce the same column to
the output generated by `cockroach node status --decommission`. The
is_decommissioning column still exists, but is also powered by the
CommissionStatus now.
We also iron out the events plumbed into system.eventlog: it now has a
dedicated event for "node decommissioning".
Release note (general change): `cockroach node recommission` has new
semantics. Previously it was able to recommission any decommissioned node,
regardless of how long ago it was fully removed from the cluster. Now
recommission serves only to cancel an accidental decommissioning process
that wasn't finalized.
Release note (cli change): We introduce a commission_status column to
the output generated by `cockroach node status --decommission`. It
should be used in favor of the is_decommissioning column going forward.