
kvserver: introduce RangeAppliedState.RaftAppliedIndexTerm #75675

Merged
merged 1 commit into cockroachdb:master from sm_term on Feb 4, 2022

sumeerbhola (Collaborator)

The same field is also introduced in ReplicaState, since ReplicaState
is used in internal data-structures and when sending a state machine
snapshot.

The migration code uses a special unused term value in a
ReplicatedEvalResult to signal to the state machine application
machinery to start populating the term field.
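Roughly, the check below raft looks like this (a sketch only; the condition mirrors the apply-loop code quoted in the review below, while cmd.replicatedResult() and entryTerm are simplified/illustrative names):

rs := cmd.replicatedResult().State
if b.state.RaftAppliedIndexTerm > 0 || (rs != nil &&
	rs.RaftAppliedIndexTerm == stateloader.RaftLogTermSignalForAddRaftAppliedIndexTermMigration) {
	// We are post-migration, or this command carries the migration signal:
	// start populating the term of the applied index. entryTerm stands for
	// the term of the raft entry being applied (illustrative name).
	b.state.RaftAppliedIndexTerm = entryTerm
}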

Fixes #75671

Release note: None

@sumeerbhola sumeerbhola requested a review from a team as a code owner January 28, 2022 21:20
@sumeerbhola sumeerbhola requested a review from a team January 28, 2022 21:20
@cockroach-teamcity (Member)

This change is Reviewable

@sumeerbhola (Collaborator Author)

This lacks tests (and will probably fail some tests).

  • I am looking for early feedback.
  • There are some TODOs in the code with questions.

@tbg tbg left a comment (Member)

This looks really good already, mostly just needs some testing.

Reviewed 15 of 15 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @sumeerbhola)


pkg/kv/kvserver/replica_application_state_machine.go, line 853 at r1 (raw file):

	ctx context.Context, cmd *replicatedCmd,
) {
	// TODO(sumeer): when will raftAppliedIndex be 0?

I don't think it ever can (even the raft-proposed entries on leadership change are proper entries, i.e. they have a proper index). Does anything fail if you make this unconditional? git blame shows @nvanbenschoten as the last to touch this; perhaps you remember, Nathan?


pkg/kv/kvserver/replica_application_state_machine.go, line 859 at r1 (raw file):

		// We are post migration or this replicatedCmd is doing the migration.
		if b.state.RaftAppliedIndexTerm > 0 || (rs != nil &&
			rs.RaftAppliedIndexTerm == stateloader.RaftLogTermSignalForAddRaftAppliedIndexTermMigration) {

👍


pkg/kv/kvserver/replica_application_state_machine.go, line 931 at r1 (raw file):

	r.mu.Lock()
	r.mu.state.RaftAppliedIndex = b.state.RaftAppliedIndex
	// This will be non-zero only when the migration has happened.

Name the migration so we adjust this comment when we remove it. Or just remove the comment.


pkg/kv/kvserver/replica_raftstorage.go, line 389 at r1 (raw file):

// does not contain any of the replica data. The snapshot is actually generated
// (and sent) by the Raft snapshot queue.
// TODO(sumeer): is this method really used?

Yes. It has one caller in our code, a test (TestSyncSnapshot), which hardly counts, but raft will call this method; see (*raftLog).snapshot. However, the way we do snapshots is somewhat different from how raft wants them to happen. raft expects that it generates the snapshot (by calling Snapshot) and that "sending" the result actually sends the snapshot.

In CRDB, that message is intercepted at the sender: instead, we add the replica (the raft leader) to the raft snapshot queue, and when its turn comes we look at the raft state for followers that want a snapshot and send one, but this won't call (*replicaRaftStorage).Snapshot:

https://github.com/cockroachdb/cockroach/blob/59ef14de9551031db777c8c812577ff43f7a71fe/pkg/kv/kvserver/raft_transport.go#L676-L678

Since you're here, consider distilling this into a comment.
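For example, something along these lines (a suggested wording, not the comment that landed):

// Snapshot implements the raft.Storage interface. Raft calls it when it
// decides a follower needs a snapshot. CRDB intercepts the resulting
// message at the sender: rather than sending the returned snapshot, we
// add the replica (the raft leader) to the raft snapshot queue, which,
// when the replica's turn comes, inspects the raft state for followers
// that want a snapshot and sends one itself, without calling this
// method again.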


pkg/kv/kvserver/replica_raftstorage.go, line 431 at r1 (raw file):

	// Or do we not care whether these two values are consistent given that
	// appliedIndex is just serving as an upper bound on what can be truncated
	// in the log, so it's completely best-effort?

It's not terrible to go a little wrong here, but this is the first time in a long while (since I wrote it) that I've looked at this code, and yeah, this isn't good. We should load the applied index from the snapshot for cleanliness. If nothing else, at least appliedIndex should be pushed down to a smaller scope so that there's no danger of someone using it later.
Let me know if you'd prefer not cleaning this up in this PR, then I will send a separate one.


pkg/kv/kvserver/replica_raftstorage.go, line 585 at r1 (raw file):

	if state.RaftAppliedIndexTerm != 0 && term != state.RaftAppliedIndexTerm {
		return OutgoingSnapshot{},
			errors.Errorf("unequal terms %d != %d", term, state.RaftAppliedIndexTerm)

errors.AssertionFailedf? After all, this should never happen.
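I.e. (a sketch of the suggested change, reusing the quoted code above):

return OutgoingSnapshot{},
	errors.AssertionFailedf("unequal terms %d != %d", term, state.RaftAppliedIndexTerm)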


pkg/kv/kvserver/replica_raftstorage.go, line 1000 at r1 (raw file):

	// raftpb.SnapshotMetadata.
	r.mu.lastIndex = state.RaftAppliedIndex
	// TODO(sumeer): why can't this be set to nonemptySnap.Metadata.Term?

I think that would be correct, too. We could also populate it in the only other use of invalidLastTerm:

r.mu.lastIndex, err = r.mu.stateLoader.LoadLastIndex(ctx, r.Engine())
if err != nil {
return err
}
r.mu.lastTerm = invalidLastTerm

I think the idea must have been that it is "safest" to have only one place that populates this field with the real term, i.e. this one:

r.mu.lastTerm = lastTerm

That code above actually looks a little brittle. We make a local lastTerm

lastTerm := r.mu.lastTerm
and then massage it whenever something happens. For example, when we apply a snapshot (updating r.mu.lastTerm), there is this bit:

// r.mu.lastIndex, r.mu.lastTerm and r.mu.raftLogSize were updated in
// applySnapshot, but we also want to make sure we reflect these changes in
// the local variables we're tracking here.
r.mu.RLock()
lastIndex = r.mu.lastIndex
lastTerm = r.mu.lastTerm
raftLogSize = r.mu.raftLogSize
r.mu.RUnlock()

I don't think now is the time to make this better, but it's the kind of thing that would break, perhaps in a subtle way, when touched.

For now, perhaps leave a comment saying that setting it to the actual term here would also work, but that this is fine as is, too.
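Something like (a sketch of such a comment; exact wording is an assumption):

// NB: using nonemptySnap.Metadata.Term here instead of invalidLastTerm
// would also be correct, but this is fine as is: we deliberately keep a
// single place that populates r.mu.lastTerm with a real term.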


pkg/kv/kvserver/store_snapshot.go, line 907 at r1 (raw file):

	}

	// TODO: is SendEmptySnapshot only used for a new cluster, so

It's only used for (triggered by) the cockroach debug reset-quorum tool. So this seems correct.


pkg/kv/kvserver/batcheval/cmd_end_transaction.go, line 1014 at r1 (raw file):

			return enginepb.MVCCStats{}, result.Result{}, errors.Wrap(err, "unable to load replica version")
		}
		// The RHS should populate RaftAppliedIndexTerm if the LHS is doing so.

The other option is to do it when it's possible, i.e. gated behind the cluster setting. That way, there will be less to migrate later. Both are ok.


pkg/kv/kvserver/batcheval/cmd_migrate.go, line 103 at r1 (raw file):

		Replicated: kvserverpb.ReplicatedEvalResult{
			State: &kvserverpb.ReplicaState{
				// NB: This is not a real term value. We can not know the term at

This comment is sort of confusing, since you're intentionally sending a term here that can never be used as a real term (we have either 0 or >=10 in CRDB due to raftInitialLogTerm, as you know). How about

// Signal the migration by sending a term on the new field that we want to migrate into. This
// term is chosen as one that would never be used in practice (since raftInitialLogTerm is 10),
// so we can special-case it below raft and start writing the (real) term to the AppliedState.
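For reference, a sketch of what the sentinel declaration could look like (the constant name is the one used in this PR; the concrete value is an assumption, anything in 1..9 would do):

// RaftLogTermSignalForAddRaftAppliedIndexTermMigration is a term value
// that can never occur for a real raft entry: real terms are either 0 or
// >= 10, since raftInitialLogTerm is 10. Its presence in
// ReplicaState.RaftAppliedIndexTerm signals, below raft, that the state
// machine should start writing the real term to the RangeAppliedState.
const RaftLogTermSignalForAddRaftAppliedIndexTermMigration = 3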


pkg/kv/kvserver/stateloader/initial.go, line 118 at r1 (raw file):

	initialMS := enginepb.MVCCStats{}

	// TODO: is WriteInitialRangeState only used for a new cluster, so

In prod this code will only be used for >= v22.1 nodes (which will bootstrap at the v22.1 cluster setting, i.e. with the applied index term active).

In testing, this may not be true as the cluster version can be specified arbitrarily, i.e. the version here may not be the "default choice":

if err := kvserver.WriteInitialClusterData(
ctx, eng, initialValues,
bootstrapVersion.Version, len(engines), splits,
hlc.UnixNano(), storeKnobs,
); err != nil {
return nil, err
}

So I think you need to compare replicaVersion against the versions you've introduced and only use true when the first one is active.

This is probably also what you'll want to do if you want to end-to-end test the migration, perhaps similar to

// TestMigrateWaitsForApplication checks to see that migrate commands wait to be
// applied on all replicas before returning to the caller.
func TestMigrateWaitsForApplication(t *testing.T) {

(make a cluster at old version, make a scratch range, check that it doesn't track the term, invoke Migrate, check that it now tracks the term).
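A rough shape for such a test (a sketch under assumed harness knobs; rangeTracksTerm is a hypothetical helper that reads the range's RangeAppliedState and reports whether RaftAppliedIndexTerm > 0):

import (
	"context"
	"testing"

	"github.com/cockroachdb/cockroach/pkg/base"
	"github.com/cockroachdb/cockroach/pkg/clusterversion"
	"github.com/cockroachdb/cockroach/pkg/testutils/testcluster"
	"github.com/stretchr/testify/require"
)

func TestRaftAppliedIndexTermMigrationEndToEnd(t *testing.T) {
	ctx := context.Background()
	// Start the cluster at the version preceding the migration (assumption:
	// via the usual BinaryVersionOverride server testing knob).
	tc := testcluster.StartTestCluster(t, 3, base.TestClusterArgs{
		// ServerArgs: ... old-version knobs ...
	})
	defer tc.Stopper().Stop(ctx)

	key := tc.ScratchRange(t)
	require.False(t, rangeTracksTerm(t, tc, key)) // hypothetical helper

	// Bumping the cluster version runs the Migrate command on all ranges.
	_, err := tc.Conns[0].Exec(`SET CLUSTER SETTING version = $1`,
		clusterversion.ByKey(clusterversion.PostAddRaftAppliedIndexTermMigration).String())
	require.NoError(t, err)
	require.True(t, rangeTracksTerm(t, tc, key))
}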


pkg/migration/migrations/raft_applied_index_term.go, line 87 at r1 (raw file):

) error {
	// TODO(sumeer): this is copied from postTruncatedStateMigration. In
	// comparison, postSeparatedIntentsMigration iterated over ranges and issues

The code here removes replicas that are waiting for replicaGC. This wouldn't be achieved by anything that "talks to ranges", as by definition a replicaGC'able replica will not be contacted by a request sent to a range.


pkg/storage/enginepb/mvcc3.proto, line 232 at r1 (raw file):

  // setting it to a value > 0. This is desirable since we don't want a mixed
  // version cluster to have divergent replica state simply because we have
  // introduced this field. An explicit migration will cause this field to

Mention the version name so that we can clean up all comments when the migration is removed.
I would recommend doing this in all comments related to a migration.

@sumeerbhola sumeerbhola left a comment (Collaborator Author)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @sumeerbhola, and @tbg)


pkg/migration/migrations/raft_applied_index_term.go, line 87 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

The code here removes replicas that are waiting for replicaGC. This wouldn't be achieved by anything that "talks to ranges", as by definition a replicaGC'able replica will not be contacted by a request sent to a range.

The separated intents migration has the following comment

	// Issue no-op Migrate commands to all ranges. This has the only
	// purpose of clearing out any orphaned replicas, preventing interleaved
	// intents in them from resurfacing.

https://github.com/itsbilal/cockroach/blob/e780937d18b052d20f384930bb5555d4a7be967a/pkg/migration/migrations/separated_intents.go#L428-L430

How do these "orphaned replicas" compare to replicas "waiting for replicaGC"? Sounds the same to me.
And your comment that such a replica won't be contacted by a request that needs to be raft replicated makes a lot of sense. So did we make a mistake in the separated intents migration?

@tbg tbg left a comment (Member)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @sumeerbhola)


pkg/migration/migrations/raft_applied_index_term.go, line 87 at r1 (raw file):

Previously, sumeerbhola wrote…

The separated intents migration has the following comment

	// Issue no-op Migrate commands to all ranges. This has the only
	// purpose of clearing out any orphaned replicas, preventing interleaved
	// intents in them from resurfacing.

https://github.com/itsbilal/cockroach/blob/e780937d18b052d20f384930bb5555d4a7be967a/pkg/migration/migrations/separated_intents.go#L428-L430

How do these "orphaned replicas" compare to replicas "waiting for replicaGC"? Sounds the same to me.
And your comment that such a replica won't be contacted by a request that needs to be raft replicated makes a lot of sense. So did we make a mistake in the separated intents migration?

That would be my reading of this. I'm not sure what the rationale was for this step. The original PR (#66445) has a lot of activity, and looking through I wasn't able to find a discussion of it either.

@sumeerbhola (Collaborator Author)

@irfansharif @itsbilal do you have any insights into the two different approaches, discussed above, that preceding migrations took to get rid of orphaned replicas? I want to make sure I have a clear understanding of this, to avoid making a mistake in this PR.

@sumeerbhola sumeerbhola changed the title [WIP] kvserver: introduce RangeAppliedState.RaftAppliedIndexTerm kvserver: introduce RangeAppliedState.RaftAppliedIndexTerm Feb 2, 2022
@sumeerbhola sumeerbhola left a comment (Collaborator Author)

TFTR!
I will ping when the tests are ready.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @nvanbenschoten, @sumeerbhola, and @tbg)


pkg/kv/kvserver/replica_application_state_machine.go, line 853 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I don't think it ever can (even the raft-proposed entries on leadership change are proper entries, i.e. have a proper index). Does anything fail if you make this unconditional? git blame shows @nvanbenschoten as last touching this, perhaps you remember Nathan?

Added a log.Fatal if it is 0.
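I.e., roughly (a sketch; the message wording is an assumption):

if raftAppliedIndex == 0 {
	// Raft entries applied to the state machine always carry a valid index.
	log.Fatalf(ctx, "raftAppliedIndex is 0 when applying %v", cmd)
}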


pkg/kv/kvserver/replica_application_state_machine.go, line 931 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Name the migration so we adjust this comment when we remove it. Or just remove the comment.

Added a comment


pkg/kv/kvserver/replica_raftstorage.go, line 389 at r1 (raw file):

Yes. It has one caller in our code which is a test (TestSyncSnapshot) which hardly counts, but raft will call this method, see (*raftLog).snapshot.

That's the real usage I was searching for. Thanks.

Added a comment.


pkg/kv/kvserver/replica_raftstorage.go, line 431 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

It's not terrible to go a little wrong here, but this is my first time (after a long time, when I wrote it) of looking at this code and yeah, this isn't good. We should load the applied index from the snapshot for cleanliness. If nothing else, at least appliedIndex should be pushed down to a smaller scope so that there's no danger of someone using it later.
Let me know if you'd prefer not cleaning this up in this PR, then I will send a separate one.

I've pushed appliedIndex to a narrow scope as you suggested and added a comment with TODO.


pkg/kv/kvserver/replica_raftstorage.go, line 585 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

errors.AssertionFailedf? After all, this should never happen.

Done


pkg/kv/kvserver/replica_raftstorage.go, line 1000 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I think that would be correct, too. We could also populate it in the only other use of invalidLastTerm:

r.mu.lastIndex, err = r.mu.stateLoader.LoadLastIndex(ctx, r.Engine())
if err != nil {
return err
}
r.mu.lastTerm = invalidLastTerm

I think the idea must have been that it is "safest" to have only one place that populates this field with the real term, i.e. this one:

r.mu.lastTerm = lastTerm

That code above actually looks a little brittle. We make a local lastTerm

lastTerm := r.mu.lastTerm
and then massage it whenever something happens. For example, when we apply a snapshot (updating r.mu.lastTerm there is this bit:

// r.mu.lastIndex, r.mu.lastTerm and r.mu.raftLogSize were updated in
// applySnapshot, but we also want to make sure we reflect these changes in
// the local variables we're tracking here.
r.mu.RLock()
lastIndex = r.mu.lastIndex
lastTerm = r.mu.lastTerm
raftLogSize = r.mu.raftLogSize
r.mu.RUnlock()

I don't think now is the time to make this better, but it's the kind of thing that would break, perhaps in a subtle way, when touched.

For now, perhaps leave a comment saying that setting it to the actual term here would also work, but that this is fine as is, too.

Added comment


pkg/kv/kvserver/store_snapshot.go, line 907 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

It's only used for (triggered by) the cockroach debug reset-quorum tool. So this seems correct.

Hardcoded this to true and added a comment.


pkg/kv/kvserver/batcheval/cmd_end_transaction.go, line 1014 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

The other option is to do it when it's possible, i.e. gated behind the cluster setting. That way, there will be less to migrate later. Both are ok.

I've added a comment specifying this option, but left the code the same.
Do you think we need this fallback anyway? Say the unsplit range has been migrated and then moved to this node, but this node's view of the cluster setting is lagging; it would be bad if the RHS did not populate the term.


pkg/kv/kvserver/batcheval/cmd_migrate.go, line 103 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

This comment is sort of confusing, since you're intentionally sending a term here that can never be used as a real term (we have either 0 or >=10 in CRDB due to raftInitialLogTerm, as you know). How about

// Signal the migration by sending a term on the new field that we want to migrate into. This
// term is chosen as one that would never be used in practice (since raftInitialLogTerm is 10),
// so we can special-case it below raft and start writing the (real) term to the AppliedState.

Done


pkg/kv/kvserver/stateloader/initial.go, line 118 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

In prod this code will only be used for >= v22.1 nodes (which will bootstrap at the v22.1 cluster setting, i.e. with the applied index term active).

In testing, this may not be true as the cluster version can be specified arbitrarily, i.e. the version here may not be the "default choice":

if err := kvserver.WriteInitialClusterData(
ctx, eng, initialValues,
bootstrapVersion.Version, len(engines), splits,
hlc.UnixNano(), storeKnobs,
); err != nil {
return nil, err
}

So I think you need to compare replicaVersion against the versions you've introduced and only use true when the first one is active.

This is probably also what you'll want to do if you want to end-to-end test the migration, perhaps similar to

// TestMigrateWaitsForApplication checks to see that migrate commands wait to be
// applied on all replicas before returning to the caller.
func TestMigrateWaitsForApplication(t *testing.T) {

(make a cluster at old version, make a scratch range, check that it doesn't track the term, invoke Migrate, check that it now tracks the term).

Done


pkg/storage/enginepb/mvcc3.proto, line 232 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Mention the version name so that we can clean up all comments when the migration is removed.
I would recommend doing this in all comments related to a migration.

Done

@tbg tbg left a comment (Member)

Reviewed 1 of 1 files at r2, 8 of 8 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @sumeerbhola)


pkg/kv/kvserver/replica_raftstorage.go, line 1000 at r1 (raw file):

Previously, sumeerbhola wrote…

Added comment

There's something off-putting about referencing Reviewable comments; not sure it'll stand the test of time. Consider #75675 (review) as a more reliable permalink.


pkg/kv/kvserver/store_snapshot.go, line 907 at r1 (raw file):

Previously, sumeerbhola wrote…

Hardcoded this to true and added a comment.

Reset-quorum is not new in this release, so the hard-coding is technically incorrect and could cause issues if used against a mixed-version cluster. But I think this is still OK; reset-quorum is experimental, and it's just unlikely that it would be used here. The correct thing to do, which you may want to do, is to have SendEmptySnapshot return an error unless the version is active. Nobody will ever use it in the case in which it returns an error, but at least we know that if they do, it won't cause problems but will return an error.
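E.g., a sketch of that guard (assuming SendEmptySnapshot can reach the cluster settings as st):

if !st.Version.IsActive(ctx, clusterversion.AddRaftAppliedIndexTermMigration) {
	return errors.Errorf(
		"cannot send empty snapshot until cluster version %s is active",
		clusterversion.ByKey(clusterversion.AddRaftAppliedIndexTermMigration))
}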


pkg/kv/kvserver/batcheval/cmd_end_transaction.go, line 1014 at r1 (raw file):

Previously, sumeerbhola wrote…

I've added a comment specifying this option, but left the code the same.
Do you think we need this fallback anyway -- say the unsplit range has been migrated and then moved to this node, but the view of the cluster setting at this node is lagging, so it would be bad if the RHS did not populate the term.

No, fine as is. You could always "or" both conditions, but I really think this is fine, not sure why I even suggested the alternative.

@sumeerbhola sumeerbhola left a comment (Collaborator Author)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @sumeerbhola, and @tbg)


pkg/migration/migrations/raft_applied_index_term_external_test.go, line 173 at r4 (raw file):

			// [n1,bump-cluster-version] 3480  active cluster version setting is now 21.2-54 (up from 21.2-53(fence))
			if repl.Version().Less(clusterversion.ByKey(
				clusterversion.AddRaftAppliedIndexTermMigration)) {

I am not sure why I was not able to specify PostAddRaftAppliedIndexTermMigration here, as noted in the code comment.
With this change, the new migration test passes.

And I still need to investigate failures on TestTenantUpgrade, TestUpgradeHappensAfterMigrations, TestStartableJobMixedVersion.

@sumeerbhola sumeerbhola force-pushed the sm_term branch 2 times, most recently from 5d9d720 to 22f3f04 Compare February 2, 2022 23:14
@sumeerbhola sumeerbhola left a comment (Collaborator Author)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @sumeerbhola, and @tbg)


pkg/kv/kvserver/replica_raftstorage.go, line 1000 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Something off-putting about referencing reviewable comments; not sure it'll stand the test of time. Consider #75675 (review) as a more reliable permalink.

Done. (I was trying to point to the specific comment instead of the aggregated version that the mapping from Reviewable to GitHub produces.)


pkg/kv/kvserver/store_snapshot.go, line 907 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Reset-quorum is not new in this release, so the hard-coding is technically incorrect & could cause issues if used against a mixed-version cluster. But I think this is still ok; reset-quorum is experimental and it's just unlikely that it would be used here. But the correct thing to do, which you may want to do, is to have SendEmptySnapshot return an error unless the version is active. Nobody will ever use it in the case in which it returns an error, but at least we know that if they do it won't cause problems but return an error.

Done

@tbg tbg left a comment (Member)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @sumeerbhola, and @tbg)


pkg/migration/migrations/raft_applied_index_term_external_test.go, line 173 at r4 (raw file):

Previously, sumeerbhola wrote…

I am not sure why I was not able to specify PostAddRaftAppliedIndexTermMigration here, as noted in the code comment.
With this change this new migration test passes.

And I still need to investigate failures on TestTenantUpgrade, TestUpgradeHappensAfterMigrations, TestStartableJobMixedVersion.

repl.Version() is not the cluster version. It's the version of the highest Migrate() command that has been executed on the range. So your migration will set it to -52, but there is no range migration associated with -54, and so repl.Version() will never reach that value. (If we did this differently, we'd have to go through all of the ranges and bump the version once for each of the 27 versions in the release so far.)

@tbg tbg self-requested a review February 3, 2022 12:38
@tbg tbg left a comment (Member)

Reviewed 5 of 6 files at r4, 5 of 5 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @sumeerbhola)

@tbg tbg self-requested a review February 3, 2022 12:38
@sumeerbhola (Collaborator Author)

The failure of TestEnsureSpanConfigReconciliation is unrelated: #75849

@sumeerbhola sumeerbhola left a comment (Collaborator Author)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @sumeerbhola, and @tbg)


pkg/migration/migrations/raft_applied_index_term_external_test.go, line 67 at r6 (raw file):

		// The in-memory ReplicaState maintained in Replica must be consistent
		// with that we are placing in RangeAppliedState.
		require.Equal(t, appliedState.RaftAppliedIndexTerm, appliedIndexTermInReplicaStruct)

This gets an error after the split (when running under stress, after 100s of runs -- unfortunately I am unable to reproduce the race on my machine):

Error:  Not equal:
        expected: 0x5
        actual  : 0x6
Test:   TestRaftAppliedIndexTermMigration

It looks like the term in the Replica has been bumped up from raftInitialLogTerm.

I've added a Replica.LockRaftMuForTesting() method to prevent this race.
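Roughly (a sketch; the actual method in the PR may differ):

// LockRaftMuForTesting locks the replica's raftMu so that a test can
// briefly block raft application while it reads replica state, and
// returns the corresponding unlock function.
func (r *Replica) LockRaftMuForTesting() (unlockFunc func()) {
	r.raftMu.Lock()
	return func() { r.raftMu.Unlock() }
}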


pkg/migration/migrations/raft_applied_index_term_external_test.go, line 193 at r6 (raw file):

		require.NoError(t, err)
		t.Logf("split range into %s and %s", leftDesc.KeySpan(), rightDesc.KeySpan())
		total, withTerm := getReplicaCounts(ctx, t, tc)

This call runs into a race condition (see the other comment).

@sumeerbhola sumeerbhola left a comment (Collaborator Author)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @tbg)


pkg/migration/migrations/raft_applied_index_term_external_test.go, line 173 at r4 (raw file):

Previously, tbg (Tobias Grieger) wrote…

repl.Version() is not the cluster version. It's the version of the highest Migrate() command that has been executed on the range. So your migration will set it to -52, but there is no Range migration associated to -54 and so repl.Version() will never reach that value. (If we did this differently, for each of the 27 versions in the release so far, we'd have to go through all of the ranges and bump the version 27 times)

Thanks for clarifying!

@sumeerbhola (Collaborator Author)

bors r+

@craig (Contributor)

craig bot commented Feb 3, 2022

Build failed (retrying...):

@craig (Contributor)

craig bot commented Feb 4, 2022

Merge conflict.

@tbg tbg left a comment (Member)

Reviewed 1 of 2 files at r7, 7 of 7 files at r8, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @sumeerbhola)


pkg/kv/kvserver/replica.go, line 2006 at r8 (raw file):

}

// LockRaftMuForTesting is for use only by tests that are trying to avoid a

Move to helpers_test.go, but we already have (*Replica).RaftLock there, so just use that.

@tbg tbg left a comment (Member)

Dismissed @sumeerbhola from a discussion.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @sumeerbhola)

@sumeerbhola sumeerbhola left a comment (Collaborator Author)

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @sumeerbhola, and @tbg)


pkg/kv/kvserver/replica.go, line 2006 at r8 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Move to helpers_test.go but we already have (*Replica).RaftLock there so just use that.

I couldn't use that, since test code from another package can't be imported, and there doesn't seem to be a good solution based on reading https://stackoverflow.com/questions/56404355/how-do-i-package-golang-test-helper-code.
I didn't want to rename that file to test_helpers.go for only this one case, which will go away in the next release when the migration code is removed. I've added to the code comment here that this is only for testing code outside kvserver.

@sumeerbhola (Collaborator Author)

bors r+

@tbg tbg left a comment (Member)

Reviewed 7 of 7 files at r9, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker and @sumeerbhola)


pkg/kv/kvserver/replica.go, line 2006 at r8 (raw file):

Previously, sumeerbhola wrote…

I couldn't use that since test code from another package is not imported and there doesn't seem to be a good solution based on reading https://stackoverflow.com/questions/56404355/how-do-i-package-golang-test-helper-code
I didn't want to change that file to test_helpers.go for only this one case, which will go away in the next release when the migration code is removed. I've added to the code comment here that this is only for testing code outside kvserver.

Oh, didn't realize the caller wasn't in kvserver. It is a bit awkward; the migration test could live in kvserver, no? But I am fine merging as is, seeing how that code has a short shelf life anyway.

@craig (Contributor)

craig bot commented Feb 4, 2022

Build failed (retrying...):

@craig (Contributor)

craig bot commented Feb 4, 2022

Build failed (retrying...):

@craig (Contributor)

craig bot commented Feb 4, 2022

Build succeeded:

@craig craig bot merged commit 3eb782c into cockroachdb:master Feb 4, 2022