forked from cockroachdb/cockroach
migration: introduce primitives for below-raft migrations...

and onboard the truncated state migration. These include:

- The KV ranged `Migrate` command. This command forces all ranges overlapping with the request spans to execute the (below-raft) migrations corresponding to the specific, stated version. This has the effect of moving the ranges out of any legacy modes of operation they may currently be in. KV waits for this command to durably apply on all the replicas before returning, guaranteeing to the caller that all pre-migration state has been completely purged from the system.

- The `SyncAllEngines` RPC. This is used to instruct the target node to persist relevant in-memory state to disk. As mentioned above, KV currently waits for the `Migrate` command to have applied on all replicas before returning. With the applied state, there's no necessity to durably persist it (the representative version is already stored in the raft log). Out of an abundance of caution, and to make sure that no pre-migrated state is ever seen in the system, we provide the migration manager a mechanism to flush out all in-memory state to disk. This will let us guarantee that by the time a specific cluster version is bumped, all pre-migrated state from prior to a certain version will have been fully purged from the system. We'll also use it in conjunction with `PurgeOutdatedReplicas` below.

- The `PurgeOutdatedReplicas` RPC. This too comes up in the context of wanting to ensure that ranges over which we've executed a ranged `Migrate` command have no way of ever surfacing pre-migrated state. This can happen with older replicas in the replica GC queue and with applied state that is not yet persisted. Currently we wait for the `Migrate` to have applied on all replicas of a range before returning to the caller. This does not include earlier incarnations of the range, possibly sitting idle in the replica GC queue. These replicas can still request leases and go through the request evaluation paths, possibly tripping up assertions that check that no pre-migrated state is found. `PurgeOutdatedReplicas` lets the migration manager do exactly as the name suggests, ensuring all "outdated" replicas are processed before declaring the specific cluster version bump complete.

- The concept of a "replica state version". This is what's used to construct the migration manager's view of what's "outdated", telling us which migrations can be assumed to have run against a particular replica. When we introduce backwards incompatible changes to the replica state (for example using the unreplicated truncated state instead of the replicated variant), the version informs us whether, for a given replica, we should expect a state representation from before or after the migration (in our example this corresponds to whether or not we can assume an unreplicated truncated state).

As part of this commit, we also re-order the steps taken by the migration manager so that it executes a given migration first before bumping version gates cluster wide. This is because we want authors of migrations to ascertain that their own migrations have run to completion, instead of attaching that check to the next version.

---

This PR motivates all of the above by also onboarding the TruncatedAndRangeAppliedState migration, which lets us do the following:

  i. Use the RangeAppliedState on all ranges
  ii. Use the unreplicated TruncatedState on all ranges

In 21.2 we'll finally be able to delete holdover code that knows how to handle the legacy replicated truncated state.

Release note (general change): Cluster version upgrades, as initiated by SET CLUSTER SETTING version = <major>-<minor>, now perform internal maintenance duties that will delay how long it takes for the command to complete. The delay is proportional to the amount of data currently stored in the cluster. The cluster will also experience a small amount of additional load during this period while the upgrade is being finalized.

---

The ideas here follow from our original prototype in cockroachdb#57445.
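The ordering described in the commit message (run the migration, purge outdated replicas, and only then bump the version gate) can be illustrated with a toy model. This is a minimal sketch, not the migration manager's actual code: `Version`, `replica`, `cluster`, and the methods `migrate`, `purgeOutdated`, and `upgrade` are all hypothetical names invented for illustration, and "purging" is simplified to dropping the replica from a slice.

```go
package main

import "fmt"

// Version is a simplified, hypothetical stand-in for a cluster version.
type Version struct{ Major, Minor int32 }

// Less reports whether v sorts strictly before o.
func (v Version) Less(o Version) bool {
	if v.Major != o.Major {
		return v.Major < o.Major
	}
	return v.Minor < o.Minor
}

// replica is a toy replica that carries only its replica state version.
type replica struct {
	id        int
	version   Version
	inGCQueue bool // an earlier incarnation that never sees the Migrate command
}

// cluster is a toy model of the state the migration manager reasons about.
type cluster struct {
	active   Version
	replicas []*replica
}

// migrate models the ranged Migrate command: every live replica applies the
// migration and records the target version as its replica state version.
func (c *cluster) migrate(target Version) {
	for _, r := range c.replicas {
		if !r.inGCQueue {
			r.version = target
		}
	}
}

// purgeOutdated models PurgeOutdatedReplicas: any replica whose state version
// is below the target is considered outdated and removed. It returns the
// number of replicas purged.
func (c *cluster) purgeOutdated(target Version) int {
	kept := c.replicas[:0]
	purged := 0
	for _, r := range c.replicas {
		if r.version.Less(target) {
			purged++
			continue
		}
		kept = append(kept, r)
	}
	c.replicas = kept
	return purged
}

// upgrade runs the migration first and bumps the version gate last, so the
// gate being active implies the migration has run to completion.
func (c *cluster) upgrade(target Version) (int, error) {
	c.migrate(target)
	purged := c.purgeOutdated(target)
	for _, r := range c.replicas {
		if r.version.Less(target) {
			return purged, fmt.Errorf("replica %d still at %+v", r.id, r.version)
		}
	}
	c.active = target
	return purged, nil
}

func main() {
	old, next := Version{20, 2}, Version{21, 1}
	c := &cluster{active: old, replicas: []*replica{
		{id: 1, version: old},
		{id: 2, version: old},
		{id: 3, version: old, inGCQueue: true},
	}}
	purged, err := c.upgrade(next)
	if err != nil {
		panic(err)
	}
	fmt.Printf("purged %d replica(s), active version %+v\n", purged, c.active)
}
```

In this model the replica state version is what makes "outdated" a checkable property: the manager never has to inspect the truncated state itself, only compare versions.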
1 parent 1697431, commit 1247fed
Showing 57 changed files with 3,781 additions and 1,063 deletions.
@@ -0,0 +1,140 @@
// Copyright 2020 The Cockroach Authors.
//
// Use of this software is governed by the Business Source License
// included in the file licenses/BSL.txt.
//
// As of the Change Date specified in that file, in accordance with
// the Business Source License, use of this software will be governed
// by the Apache License, Version 2.0, included in the file
// licenses/APL.txt.

package batcheval

import (
	"context"

	"github.com/cockroachdb/cockroach/pkg/clusterversion"
	"github.com/cockroachdb/cockroach/pkg/keys"
	"github.com/cockroachdb/cockroach/pkg/kv/kvserver/batcheval/result"
	"github.com/cockroachdb/cockroach/pkg/kv/kvserver/kvserverpb"
	"github.com/cockroachdb/cockroach/pkg/kv/kvserver/spanset"
	"github.com/cockroachdb/cockroach/pkg/roachpb"
	"github.com/cockroachdb/cockroach/pkg/storage"
	"github.com/cockroachdb/cockroach/pkg/util/hlc"
	"github.com/cockroachdb/errors"
)

func init() {
	RegisterReadWriteCommand(roachpb.Migrate, declareKeysMigrate, Migrate)
}

func declareKeysMigrate(
	_ *roachpb.RangeDescriptor,
	header roachpb.Header,
	_ roachpb.Request,
	latchSpans, lockSpans *spanset.SpanSet,
) {
	// TODO(irfansharif): This will eventually grow to capture the super set of
	// all keys accessed by all migrations defined here. That could get
	// cumbersome. We could spruce up the migration type and allow authors to
	// define the specific set of keys each migration needs to grab latches and
	// locks over.
	latchSpans.AddNonMVCC(spanset.SpanReadWrite, roachpb.Span{Key: keys.RaftTruncatedStateLegacyKey(header.RangeID)})
	lockSpans.AddNonMVCC(spanset.SpanReadWrite, roachpb.Span{Key: keys.RaftTruncatedStateLegacyKey(header.RangeID)})
}

// migrationRegistry is a global registry of all KV-level migrations. See
// pkg/migration for details around how the migrations defined here are
// wired up.
var migrationRegistry = make(map[roachpb.Version]migration)

type migration func(context.Context, storage.ReadWriter, CommandArgs) (result.Result, error)

func init() {
	registerMigration(clusterversion.TruncatedAndRangeAppliedStateMigration, truncatedAndAppliedStateMigration)
}

func registerMigration(key clusterversion.Key, migration migration) {
	migrationRegistry[clusterversion.ByKey(key)] = migration
}

// Migrate executes the below-raft migration corresponding to the given version.
func Migrate(
	ctx context.Context, readWriter storage.ReadWriter, cArgs CommandArgs, _ roachpb.Response,
) (result.Result, error) {
	args := cArgs.Args.(*roachpb.MigrateRequest)
	migrationVersion := args.Version

	fn, ok := migrationRegistry[migrationVersion]
	if !ok {
		return result.Result{}, errors.Newf("migration for %s not found", migrationVersion)
	}
	pd, err := fn(ctx, readWriter, cArgs)
	if err != nil {
		return result.Result{}, err
	}

	// Since we're a below-raft migration, we'll need to update our replica
	// state version.
	if err := MakeStateLoader(cArgs.EvalCtx).SetVersion(
		ctx, readWriter, cArgs.Stats, &migrationVersion,
	); err != nil {
		return result.Result{}, err
	}
	if pd.Replicated.State == nil {
		pd.Replicated.State = &kvserverpb.ReplicaState{}
	}
	// NB: We don't check for clusterversion.ReplicaVersions being active here
	// as all below-raft migrations (the only users of Migrate) were introduced
	// after it.
	pd.Replicated.State.Version = &migrationVersion
	return pd, nil
}

// truncatedAndAppliedStateMigration lets us stop using the legacy
// replicated truncated state and start using the new RangeAppliedState for this
// specific range.
func truncatedAndAppliedStateMigration(
	ctx context.Context, readWriter storage.ReadWriter, cArgs CommandArgs,
) (result.Result, error) {
	var legacyTruncatedState roachpb.RaftTruncatedState
	legacyKeyFound, err := storage.MVCCGetProto(
		ctx, readWriter, keys.RaftTruncatedStateLegacyKey(cArgs.EvalCtx.GetRangeID()),
		hlc.Timestamp{}, &legacyTruncatedState, storage.MVCCGetOptions{},
	)
	if err != nil {
		return result.Result{}, err
	}

	var pd result.Result
	if legacyKeyFound {
		// Time to migrate by deleting the legacy key. The downstream-of-Raft
		// code will atomically rewrite the truncated state (supplied via the
		// side effect) into the new unreplicated key.
		if err := storage.MVCCDelete(
			ctx, readWriter, cArgs.Stats, keys.RaftTruncatedStateLegacyKey(cArgs.EvalCtx.GetRangeID()),
			hlc.Timestamp{}, nil, /* txn */
		); err != nil {
			return result.Result{}, err
		}
		pd.Replicated.State = &kvserverpb.ReplicaState{
			// We need to pass in a truncated state to enable the migration.
			// Passing the same one is the easiest thing to do.
			TruncatedState: &legacyTruncatedState,
		}
	}
	return pd, nil
}

// TestingRegisterMigrationInterceptor is used in tests to register an
// interceptor for a below-raft migration.
func TestingRegisterMigrationInterceptor(version roachpb.Version, fn func()) (unregister func()) {
	if _, ok := migrationRegistry[version]; ok {
		panic("doubly registering migration")
	}
	migrationRegistry[version] = func(context.Context, storage.ReadWriter, CommandArgs) (result.Result, error) {
		fn()
		return result.Result{}, nil
	}
	return func() { delete(migrationRegistry, version) }
}