Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: remove v1 closed timestamp system #61989

Closed
nvanbenschoten opened this issue Mar 15, 2021 · 0 comments
Closed

kv: remove v1 closed timestamp system #61989

nvanbenschoten opened this issue Mar 15, 2021 · 0 comments
Assignees
Labels
A-kv-closed-timestamps Relating to closed timestamps C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. T-kv KV Team

Comments

@nvanbenschoten
Copy link
Member

nvanbenschoten commented Mar 15, 2021

Now that the v2 closed timestamp system has landed, we can remove the old closed timestamp system and clean up lots of code. This should be rather satisfying.

@andreimatei I'm primarily bringing this up now because I think we should prep this change early (during the v21.1 stability period) to make sure we think through the migration while we still have time to get small tweaks into v21.1. Specifically, I'm concerned about how we'll migrate away the following two closed timestamp considerations:

maxClosed.Forward(lease.Start.ToTimestamp())
maxClosed.Forward(initialMaxClosed)

If we leave v21.1 nodes in a place where they continue to consult this auxiliary state, then we may be limiting how much we can do in v21.2. Remember that for each of these closed timestamp considerations, we need to enforce the invariant that if a v21.1 node thinks that a timestamp is closed, a v21.2 node acting as a leaseholder won't allow writes below it.

The first consideration here is that a v21.1 node will consider the start timestamp of the current lease closed. I'm worried that this will run into issues with #61986 and our ability to remove this code. Can we remove this consideration safely in v21.1? I think so, and I think we probably should do so now. v21.1 leaseholder will continue to respect the lease start time as an effective closed timestamp because of this code, so they won't violate v20.2 nodes' expectations, but v21.1 nodes themselves don't need to consider the lease start time closed.

The second consideration here is that a v21.1 node will carry the v1 closed timestamp from the LHS of a split to the RHS of a split. I don't think there's anything to do here now and I think this should be safe to remove in v21.2.

Epic: CRDB-2558

@nvanbenschoten nvanbenschoten added C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. A-kv-closed-timestamps Relating to closed timestamps labels Mar 15, 2021
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Mar 26, 2021
Fixes cockroachdb#60929.
Relates to cockroachdb#61986.
Relates to cockroachdb#61989.

This commit fixes a closed timestamp violation that could allow a
value/intent write at a timestamp below a range's closed timestamp. This
could allow for serializability violations if it allowed a follower read
to miss a write and could lead to a panic in the rangefeed processor if
a rangefeed was watching at the right time, as we saw in cockroachdb#60929.

In cockroachdb#60929, we found that this bug was caused by a range merge and a
lease transfer racing in such a way that the closed timestamp could
later be violated by a write to the subsumed portion of the joint range.
The root cause of this was an opportunistic optimization made in 7037b54
to consider a range's lease start time as an input to its closed
timestamp computation. This optimization did not account for the
possibility of serving writes to a newly subsumed keyspace below a
range's lease start time if that keyspace was merged into a range under
its current lease and with a freeze time below the current lease start
time. This bug is fixed by removing the optimization, which was on its
way out to allow for cockroachdb#61986 anyway.

Note that removing this optimization does not break
`TestClosedTimestampCanServeThroughoutLeaseTransfer`, because the v2
closed timestamp system does not allow for closed timestamp regressions,
even across leaseholders. This was one of the many benefits of the new
system.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Mar 30, 2021
Fixes cockroachdb#60929.
Relates to cockroachdb#61986.
Relates to cockroachdb#61989.

This commit fixes a closed timestamp violation that could allow a
value/intent write at a timestamp below a range's closed timestamp. This
could allow for serializability violations if it allowed a follower read
to miss a write and could lead to a panic in the rangefeed processor if
a rangefeed was watching at the right time, as we saw in cockroachdb#60929.

In cockroachdb#60929, we found that this bug was caused by a range merge and a
lease transfer racing in such a way that the closed timestamp could
later be violated by a write to the subsumed portion of the joint range.
The root cause of this was an opportunistic optimization made in 7037b54
to consider a range's lease start time as an input to its closed
timestamp computation. This optimization did not account for the
possibility of serving writes to a newly subsumed keyspace below a
range's lease start time if that keyspace was merged into a range under
its current lease and with a freeze time below the current lease start
time. This bug is fixed by removing the optimization, which was on its
way out to allow for cockroachdb#61986 anyway.

Note that removing this optimization does not break
`TestClosedTimestampCanServeThroughoutLeaseTransfer`, because the v2
closed timestamp system does not allow for closed timestamp regressions,
even across leaseholders. This was one of the many benefits of the new
system.
craig bot pushed a commit that referenced this issue Mar 30, 2021
62570: kv: don't consider lease start time as closed timestamp r=nvanbenschoten a=nvanbenschoten

Fixes #60929.
Relates to #61986.
Relates to #61989.

This commit fixes a closed timestamp violation that could allow a
value/intent write at a timestamp below a range's closed timestamp. This
could allow for serializability violations if it allowed a follower read
to miss a write and could lead to a panic in the rangefeed processor if
a rangefeed was watching at the right time, as we saw in #60929.

In #60929, we found that this bug was caused by a range merge and a
lease transfer racing in such a way that the closed timestamp could
later be violated by a write to the subsumed portion of the joint range.
The root cause of this was an opportunistic optimization made in 7037b54
to consider a range's lease start time as an input to its closed
timestamp computation. This optimization did not account for the
possibility of serving writes to a newly subsumed keyspace below a
range's lease start time if that keyspace was merged into a range under
its current lease and with a freeze time below the current lease start
time. This bug is fixed by removing the optimization, which was on its
way out to allow for #61986 anyway.

Note that removing this optimization does not break
`TestClosedTimestampCanServeThroughoutLeaseTransfer`, because the v2
closed timestamp system does not allow for closed timestamp regressions,
even across leaseholders. This was one of the many benefits of the new
system.

Co-authored-by: Nathan VanBenschoten <[email protected]>
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Mar 30, 2021
Fixes cockroachdb#60929.
Relates to cockroachdb#61986.
Relates to cockroachdb#61989.

This commit fixes a closed timestamp violation that could allow a
value/intent write at a timestamp below a range's closed timestamp. This
could allow for serializability violations if it allowed a follower read
to miss a write and could lead to a panic in the rangefeed processor if
a rangefeed was watching at the right time, as we saw in cockroachdb#60929.

In cockroachdb#60929, we found that this bug was caused by a range merge and a
lease transfer racing in such a way that the closed timestamp could
later be violated by a write to the subsumed portion of the joint range.
The root cause of this was an opportunistic optimization made in 7037b54
to consider a range's lease start time as an input to its closed
timestamp computation. This optimization did not account for the
possibility of serving writes to a newly subsumed keyspace below a
range's lease start time if that keyspace was merged into a range under
its current lease and with a freeze time below the current lease start
time. This bug is fixed by removing the optimization, which was on its
way out to allow for cockroachdb#61986 anyway.

Note that removing this optimization does not break
`TestClosedTimestampCanServeThroughoutLeaseTransfer`, because the v2
closed timestamp system does not allow for closed timestamp regressions,
even across leaseholders. This was one of the many benefits of the new
system.
@jlinder jlinder added the T-kv KV Team label Jun 16, 2021
andreimatei added a commit to andreimatei/cockroach that referenced this issue Aug 19, 2021
andreimatei added a commit to andreimatei/cockroach that referenced this issue Aug 26, 2021
andreimatei added a commit to andreimatei/cockroach that referenced this issue Aug 30, 2021
@craig craig bot closed this as completed in 793b4c8 Aug 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-closed-timestamps Relating to closed timestamps C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. T-kv KV Team
Projects
None yet
Development

No branches or pull requests

3 participants