kv/kvnemesis: rare resolved timestamp violation #60929
Comments
I think I see the problem here. We consider a range's lease start time to be closed (see explanation in 7037b54), but this can be violated by a write to the subsumed keyspace after a range merge causes a range to grow. In such cases, we bump the leaseholder's timestamp cache over the subsumed keyspace to the freeze time, but this may still lag behind the LHS range's lease start time. This allows a write to land on the subsumed side of the range below the lease start time.

Unless I'm missing something, we'll either want to bump the subsumed keyspace to the lease start time during a merge or stop considering the lease start time as a contributor to the closed timestamp (which we may want to do anyway, see #61989).

There might be more to this story, as I'm also seeing evidence that a leaseholder has a clock below its lease start time. I wouldn't expect this to be possible, but I'm also now second-guessing that we properly prevent it. We do set the

cockroach/pkg/kv/kvserver/batcheval/cmd_lease_transfer.go Lines 71 to 77 in e09b93f

It's unclear whether this is contributing to the issue, merely making it more likely, or unrelated. cc. @andreimatei you might have thoughts here.
Can you spell this out for me? Isn't a write to the subsumed keyspace after a merge bumped above the merged lease? And what does it mean for it to "cause a range to grow"?
That's the problem - what enforces this? In the usual case, a new lease bumps the timestamp cache across a range's keyspace to the lease start time. But if the range then grows because it merges in its RHS neighbor, the timestamp cache for this newly added keyspace isn't guaranteed to be above the LHS's lease start time. So the LHS can serve writes below its lease start time, violating its reported closed timestamp.
Yeah, this seems like a bug. Like you're saying, it doesn't seem like the lease start time should be particularly tied to the closed timestamp. Particularly in 21.1, where a lease change doesn't cause a regression in any replica's info about what closed timestamp applies.
Fixes cockroachdb#60929.
Relates to cockroachdb#61986.
Relates to cockroachdb#61989.

This commit fixes a closed timestamp violation that could allow a value/intent write at a timestamp below a range's closed timestamp. This could allow for serializability violations if it allowed a follower read to miss a write and could lead to a panic in the rangefeed processor if a rangefeed was watching at the right time, as we saw in cockroachdb#60929.

In cockroachdb#60929, we found that this bug was caused by a range merge and a lease transfer racing in such a way that the closed timestamp could later be violated by a write to the subsumed portion of the joint range. The root cause of this was an opportunistic optimization made in 7037b54 to consider a range's lease start time as an input to its closed timestamp computation. This optimization did not account for the possibility of serving writes to a newly subsumed keyspace below a range's lease start time if that keyspace was merged into a range under its current lease and with a freeze time below the current lease start time. This bug is fixed by removing the optimization, which was on its way out to allow for cockroachdb#61986 anyway.

Note that removing this optimization does not break `TestClosedTimestampCanServeThroughoutLeaseTransfer`, because the v2 closed timestamp system does not allow for closed timestamp regressions, even across leaseholders. This was one of the many benefits of the new system.
62570: kv: don't consider lease start time as closed timestamp r=nvanbenschoten a=nvanbenschoten

Fixes #60929.
Relates to #61986.
Relates to #61989.

This commit fixes a closed timestamp violation that could allow a value/intent write at a timestamp below a range's closed timestamp. This could allow for serializability violations if it allowed a follower read to miss a write and could lead to a panic in the rangefeed processor if a rangefeed was watching at the right time, as we saw in #60929.

In #60929, we found that this bug was caused by a range merge and a lease transfer racing in such a way that the closed timestamp could later be violated by a write to the subsumed portion of the joint range. The root cause of this was an opportunistic optimization made in 7037b54 to consider a range's lease start time as an input to its closed timestamp computation. This optimization did not account for the possibility of serving writes to a newly subsumed keyspace below a range's lease start time if that keyspace was merged into a range under its current lease and with a freeze time below the current lease start time. This bug is fixed by removing the optimization, which was on its way out to allow for #61986 anyway.

Note that removing this optimization does not break `TestClosedTimestampCanServeThroughoutLeaseTransfer`, because the v2 closed timestamp system does not allow for closed timestamp regressions, even across leaseholders. This was one of the many benefits of the new system.

Co-authored-by: Nathan VanBenschoten
After about 45 minutes of running kvnemesis, I often see an error like:
At first, I figured that this was fallout from #59566, but I've managed to reproduce with that change disabled. This is an `MVCCWriteIntentOp` below a range's resolved timestamp, which should never be allowed.