Commit
86883: kvserver: cancel consistency checks more reliably r=tbg a=pavelkalinnikov

This PR increases the chance of propagating the cancelation signal to replicas, to prevent them from running abandoned consistency check tasks. Specifically:

- The computation is aborted if the collection request is canceled.
- The computation is not started if the collection request gave up recently.
- The initiator runs all requests in parallel to reduce asynchrony, and to be able to cancel all the requests explicitly, instead of skipping some of them.

---

### Background

Consistency checks are initiated by the `ComputeChecksum` command in the Raft log, and run until completion under a background context. The result is collected by the initiator via the `CollectChecksum` long poll. The task is synchronized with the collection handler via the map of `replicaChecksum` structs.

Currently, the replica initiating the consistency check sends a collection request to itself first, and only then to other replicas in parallel. This results in substantial asynchrony on the receiving replica, between the request handler and the computation task. The current solution to that is keeping the checksum computation results in memory for `replicaChecksumGCInterval` to return them to late-arriving requests. However, there is **no symmetry** here: if the computation starts late instead, it doesn't learn about a previously failed request.

The reason why the initiator blocks on its local checksum first is that it computes the "master checksum", which is then added to all other requests. However, this field is only used by the receiving end to log an inconsistency error. The actual killing of this replica happens in the second phase of the protocol, after the initiating replica commits another Raft message with the `Terminate` field populated. So, there is **no strong reason to keep this blocking behaviour**.

When the `CollectChecksum` handler exits due to a canceled context (for example, the request timed out, or the remote caller crashed), the background task continues to run. If it was not running, it may start in the future. In both cases, the consistency checks pool (which has a limited size and processing rate) spends resources on running dangling checks, and rejects useful ones.

If the initiating replica fails to compute its local checksum, it does not send requests (or any indication to cancel) to other replicas. This is problematic because the checksum tasks will be run on all replicas, which opens the possibility of accumulating many such dangling checks.

---

Part of #77432

Release justification: performance and stability improvement

Release note(bug fix): A consistency check is now skipped/stopped when its remote initiator gives up on it. Previously, such checks would still be attempted and, due to the limited size of the worker pool, would prevent useful checks from running. In addition, consistency check requests are now sent in parallel, and the cancelation signal propagates more reliably.

Co-authored-by: Pavel Kalinnikov <[email protected]>