
kvserver: purge gc-able, unmigrated replicas during migrations #62838

Merged 1 commit into cockroachdb:master on Apr 1, 2021

Conversation

@irfansharif (Contributor) commented on Mar 31, 2021

Fixes #58378.
Fixes #62267.

Previously it was possible for us to have replicas in-memory, with
pre-migrated state, even after a migration was finalized. This led to
the kind of badness we were observing in #62267, where it appeared that
a replica was not using the applied state key despite us having migrated
into it (see TruncatedAndRangeAppliedState, introduced in #58088).


To see how, consider the following set of events:

  • Say r42 starts off on n1, n2, and n3
  • n3 flaps and so we place a replica for r42 on n4
  • n3's replica, r42/3, is now GC-able, but still un-GC-ed
  • We run the applied state migration, first migrating all ranges into it
    and then purging outdated replicas
  • Well, we should want to purge r42/3, because it's un-migrated and
    evaluating anything on it (say, a lease request) is unsound: we've
    bumped version gates that tell the kvserver to always expect
    post-migration state
  • What happens when we try to purge r42/3? Prior to this PR, if it
    didn't have a replica version, we'd skip over it (!)
  • Was it possible for r42/3 to not have a replica version? Shouldn't it
    have been accounted for when we migrated all ranges? No; that's precisely
    why the migration infrastructure purges outdated replicas. The migrate
    request only returns once it's applied on all followers; in our example
    that wouldn't include r42/3, since it was no longer a follower
  • The stop-gap in #60429 ("kv: (re-)introduce a stopgap for lack of
    ReplicaState synchronization") made it so that we didn't GC r42/3, when we
    should've been doing the opposite. When iterating over a store's
    replicas for purging purposes, an empty replica version is fine and
    expected; we should interpret it as a signal that we're dealing with a
    replica that was obviously never migrated (to even start using replica
    versions in the first place). Because it didn't have a valid replica
    version installed, we can infer that it's soon to be GC-ed (else we
    wouldn't have been able to finalize the applied state + replica
    version migration); see the sketch after this list
  • The conditions above made it possible for us to evaluate requests on
    replicas with migration state out-of-date relative to the store's
    version
  • Boom
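
To make the intended purge behavior concrete, here's a minimal sketch in Go. The names (`purgeOutdatedReplicas`, the `replica` struct, the version strings) are illustrative stand-ins rather than the actual CockroachDB code; the point is only the change in how an empty replica version is handled.

```go
package main

import "fmt"

// replica is an illustrative stand-in for a store's in-memory replica state.
type replica struct {
	id      string
	version string // empty: this replica was never migrated into replica versions at all
}

// purgeOutdatedReplicas sketches the fixed behavior: a replica with an empty
// version is treated as never-migrated (and hence soon-to-be-GC-ed) and is
// purged, rather than skipped as it was prior to this PR.
func purgeOutdatedReplicas(replicas []replica, targetVersion string) (purged []string) {
	for _, r := range replicas {
		if r.version == "" {
			// Pre-fix behavior: `continue` here, leaving an un-migrated,
			// GC-able replica (like r42/3) around to unsoundly evaluate
			// requests. Post-fix: purge it.
			purged = append(purged, r.id)
			continue
		}
		// Simplified comparison for illustration only; real version
		// comparisons aren't lexicographic.
		if r.version < targetVersion {
			purged = append(purged, r.id)
		}
	}
	return purged
}

func main() {
	replicas := []replica{
		{id: "r42/3", version: ""},        // GC-able, never migrated
		{id: "r42/4", version: "21.1-10"}, // migrated
	}
	fmt.Println(purgeOutdatedReplicas(replicas, "21.1-10")) // [r42/3]
}
```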

Release note: None

@irfansharif requested review from tbg and a team on March 31, 2021 01:16
@irfansharif force-pushed the 210330.repl-version branch from 8d4b905 to 1416a4e on March 31, 2021 01:22
@tbg (Member) left a comment

Nice work tracking this down! Really glad you took that shower. How confident are we that this fixes version-upgrade? What you're saying is plausible; I just remember looking at the logs and not being able to explain to myself why we were even replicaGC'ing that replica (there wasn't any prior rebalancing of it). Were you able to find out what we missed then? I am pretty sure we wouldn't replicaGC by accident, so there is likely an explanation; I just wonder why we weren't able to easily find the rebalances.

@irfansharif (Contributor, Author) commented

> How confident are we that this fixes version-upgrade?

Very confident. Looking back at those logs, it's clear that the panicky replica should've already been GC-ed on the new node. When checking the applied state assertion, we were comparing against the store's cluster version, which had been bumped past the migration that rolls all replicas over into using the applied state key.
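
To make that failure mode concrete, here's a hedged sketch of the shape of that assertion (illustrative names and version values only, not the actual CockroachDB code): once the store's cluster version is past the applied-state gate, every replica is expected to use the applied state key, so an un-purged, un-migrated replica trips the check.

```go
package main

import "fmt"

// Illustrative stand-ins; not CockroachDB's actual types or version values.
type store struct{ clusterVersion int }

type replica struct {
	id                   string
	usingAppliedStateKey bool
}

// appliedStateGate stands in for the version gate behind the
// TruncatedAndRangeAppliedState migration.
const appliedStateGate = 2

// checkAppliedState sketches the assertion: once the store's version has been
// bumped past the gate, every replica must be using the applied state key. An
// un-purged, un-migrated replica (like r42/3) fails this check.
func checkAppliedState(s store, r replica) error {
	if s.clusterVersion >= appliedStateGate && !r.usingAppliedStateKey {
		return fmt.Errorf("%s: not using applied state key despite finalized migration", r.id)
	}
	return nil
}

func main() {
	s := store{clusterVersion: appliedStateGate}
	fmt.Println(checkAppliedState(s, replica{id: "r42/3", usingAppliedStateKey: false})) // error
	fmt.Println(checkAppliedState(s, replica{id: "r42/4", usingAppliedStateKey: true}))  // <nil>
}
```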

> we wouldn't replicaGC by accident, so there is likely an explanation

Looking at the range log for those ranges, the other replicas of that range existed elsewhere with lease start times earlier than the node's start-up timestamp. Basically, as this PR suggests, the migration infrastructure depends on GC-able-but-not-yet-GC-ed replicas like this one having been purged by then. It wasn't by accident; this replica was no longer part of the raft group.

> I just wonder why we weren't able to easily find the rebalances.

I didn't dig much further, but it wasn't the only range that exhibited this behavior. If this comes up again, I'll look at it in closer detail. In either 21.1 or 20.2 we're probably missing a log statement in some codepath.

> Really glad you took that shower.

Reads funny out of context haha.


bors r+

@craig (bot) commented on Mar 31, 2021: Build failed (retrying...)

@craig (bot) commented on Mar 31, 2021: Build failed (retrying...)

@craig (bot) commented on Mar 31, 2021: Build failed (retrying...)

@craig (bot) commented on Apr 1, 2021: Build succeeded

@craig (bot) merged commit 41f921d into cockroachdb:master on Apr 1, 2021
@irfansharif deleted the 210330.repl-version branch on April 1, 2021 04:03