Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: unify client and server-side refresh code paths #73557

Merged

Conversation

nvanbenschoten
Copy link
Member

This commit is a pure refactor that unifies various code paths that perform
transaction refreshing. It consolidates the conditions used to check whether a
refresh is permitted for a Transaction, the mechanism through which a
Transaction protobuf is updated during a Refresh (Transaction.Refresh), and,
most importantly, it consolidates the logic that determines the necessary
refresh timestamp from a roachpb.Error (TransactionRefreshTimestamp).

This lays the groundwork to eventually be able to perform server-side refreshes
for ReadWithinUncertaintyIntervalErrors. However, there is a bit of complexity
with doing so because of read latches, which is mentioned in a comment.

@nvanbenschoten nvanbenschoten requested a review from a team as a code owner December 7, 2021 17:40
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:LGTM: nice

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 462 at r1 (raw file):

	refreshTxn := ba.Txn.Clone()
	refreshTxn.Refresh(ba.Txn.WriteTimestamp)

consider renaming this Refresh method. It's very unsuggestive, I think. Something with the word Bump in it would be good


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 462 at r1 (raw file):

	refreshTxn := ba.Txn.Clone()
	refreshTxn.Refresh(ba.Txn.WriteTimestamp)

consider putting a comment on tryUpdatingTxnSpans saying that the txn passed to it should be a clone.


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 590 at r1 (raw file):

// auto-retries.
func (sr *txnSpanRefresher) canForwardReadTimestamp(txn *roachpb.Transaction) bool {
	return !txn.CommitTimestampFixed && sr.canAutoRetry

nit: consider inverting the order as the second condition is more likely to be false.


pkg/kv/kvserver/replica_evaluate.go, line 554 at r1 (raw file):

			// refreshes of ReadWithinUncertaintyIntervalErrors for now, even though
			// that is the eventual goal here. I'd like to lift this limitation in a
			// dedicated commit. The commit will likely need to be accompanied by an

nit: talking about a "dedicated commit" seems like a weird thing to merge


pkg/kv/kvserver/replica_evaluate.go, line 569 at r1 (raw file):

		} else {
			if !br.Txn.WriteTooOld {
				log.Fatalf(ctx, "expected the WriteTooOld flag to be set")

maybe take the opportunity to improve the function comment, which seems to imply that an error is always passed

@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/unifyServerSideRefresh branch from 0acc506 to d0303a8 Compare December 11, 2021 19:12
This commit is a pure refactor that unifies various code paths that perform
transaction refreshing. It consolidates the conditions used to check whether a
refresh is permitted for a Transaction, the mechanism through which a
Transaction protobuf is updated during a Refresh (`Transaction.Refresh`), and,
most importantly, it consolidates the logic that determines the necessary
refresh timestamp from a `roachpb.Error` (`TransactionRefreshTimestamp`).

This lays the groundwork to eventually be able to perform server-side refreshes
for `ReadWithinUncertaintyIntervalError`s. However, there is a bit of complexity
with doing so because of read latches, which is mentioned in a comment.
@nvanbenschoten nvanbenschoten force-pushed the nvanbenschoten/unifyServerSideRefresh branch from d0303a8 to 9c09473 Compare December 11, 2021 19:14
Copy link
Member Author

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

bors r=andreimatei

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei)


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 462 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

consider putting a comment on tryUpdatingTxnSpans saying that the txn passed to it should be a clone.

Done.


pkg/kv/kvclient/kvcoord/txn_interceptor_span_refresher.go, line 590 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: consider inverting the order as the second condition is more likely to be false.

Done.


pkg/kv/kvserver/replica_evaluate.go, line 554 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: talking about a "dedicated commit" seems like a weird thing to merge

Done.


pkg/kv/kvserver/replica_evaluate.go, line 569 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

maybe take the opportunity to improve the function comment, which seems to imply that an error is always passed

It mentions or the WriteTooOldFlag in br.Txn if there's no error in the first sentence, so I think it's good.

@craig
Copy link
Contributor

craig bot commented Dec 11, 2021

Build succeeded:

@craig craig bot merged commit 22df0a6 into cockroachdb:master Dec 11, 2021
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Dec 13, 2021
Fixes cockroachdb#58459.
Informs cockroachdb#36431.

This commit fixes a long-standing correctness issue where non-transactional
requests did not ensure single-key linearizability even if they deferred their
timestamp allocation to the leaseholder of their (single) range. They still
don't entirely, because of cockroachdb#36431, but this change brings us one step closer to
the fix we plan to land for cockroachdb#36431 also applying to non-transactional requests.

The change addresses this by giving non-transactional requests uncertainty
intervals. This ensures that they also guarantee single-key linearizability even
with only loose (but bounded) clock synchronization. Non-transactional requests
that use a client-provided timestamp do not have uncertainty intervals and do
not make real-time ordering guarantees.

Unlike transactions, which establish an uncertainty interval on their
coordinator node during initialization, non-transactional requests receive
uncertainty intervals from their target leaseholder, using a clock reading
from the leaseholder's local HLC as the local limit and this clock reading +
the cluster's maximum clock skew as the global limit.

It is somewhat non-intuitive that non-transactional requests need uncertainty
intervals — after all, they receive their timestamp to the leaseholder of the
only range that they talk to, so isn't every value with a commit timestamp
above their read timestamp certainly concurrent? The answer is surprisingly
"no" for the following reasons, so they cannot forgo the use of uncertainty
interval:

1. the request timestamp is allocated before consulting the replica's lease.
   This means that there are times when the replica is not the leaseholder at
   the point of timestamp allocation, and only becomes the leaseholder later.
   In such cases, the timestamp assigned to the request is not guaranteed to
   be greater than the written_timestamp of all writes served by the range at
   the time of allocation. This is true despite invariants 1 & 2 from above,
   because the replica allocating the timestamp is not yet the leaseholder.

   In cases where the replica that assigned the non-transactional request's
   timestamp takes over as the leaseholder after the timestamp allocation, we
   expect minimumLocalLimitForLeaseholder to forward the local uncertainty
   limit above TimestampFromServerClock, to the lease start time.

2. ignoring reason #1, the request timestamp assigned by the leaseholder is
   equal to its local uncertainty limit and is guaranteed to lead the
   written_timestamp of all writes served by the range at the time the
   request arrived at the leaseholder node. However, this timestamp is still
   not guaranteed to lead the commit timestamp of all of these writes. As a
   result, the non-transactional request needs an uncertainty interval with a
   global uncertainty limit far enough in advance of the local HLC clock to
   ensure that it considers any value that was part of a transaction which
   could have committed before the request was received by the leaseholder to
   be uncertain. Concretely, the non-transactional request needs to consider
   values of the following form to be uncertain:

     written_timestamp < local_limit && commit_timestamp < global_limit

   The value that the non-transactional request is observing may have been
   written on the local leaseholder at time 10, its transaction may have been
   committed remotely at time 20, acknowledged, then the non-transactional
   request may have begun and received a timestamp of 15 from the local
   leaseholder, then finally the value may have been resolved asynchronously
   and moved to timestamp 20 (written_timestamp: 10, commit_timestamp: 20).
   The failure of the non-transactional request to observe this value would
   be a stale read.

----

Conveniently, because non-transactional requests are always scoped to a
single-range, those that hit uncertainty errors can always retry on the server,
so these errors never bubble up to the client that initiated the request.

However, before this commit, this wasn't actually possible because we did not
support server-side retries of `ReadWithinUncertaintyIntervalError`. The second
part of this commit adds support for this. This new support allows us to make
the guarantee we want about non-transactional requests never returning these
errors to the client, even though they use uncertainty intervals. It also serves
as a performance optimization for transactional requests, which now benefit from
this new capability. There's some complexity around supporting this form of
server-side retry, because it must be done above latching instead of below, but
recent refactors in cockroachdb#73557 and cockroachdb#73717 have made this possible to support
cleanly.

----

This change is related to cockroachdb#36431 in two ways. First, it allows non-transactional
requests to benefit from our incoming fix for that issue. Second, it unblocks
some of the clock refactors proposed in cockroachdb#72121 (comment),
and by extension cockroachdb#72663. Even though there are clear bugs today, I still don't
feel comfortable removing the `hlc.Clock.Update` in `Store.Send` until we make
this change. We know from cockroachdb#36431 that invariant 1 from
[`uncertainty.D6`](https://github.com/cockroachdb/cockroach/blob/22df0a6658a927e7b65dafbdfc9d790500791993/pkg/kv/kvserver/uncertainty/doc.go#L131)
doesn't hold, and yet I still do think the `hlc.Clock.Update` in `Store.Send`
masks the bugs in this area in many cases. Removing that clock update (I don't
actually plan to remove it, but I plan to disconnect it entirely from operation
timestamps) without first giving non-transactional requests uncertainty intervals
seems like it may reveal these bugs in ways we haven't seen in the past. So I'd
like to land this fix before making that change.

----

Release note (performance improvement): Certain forms of automatically retried
"read uncertainty" errors are now retried more efficiently, avoiding a network
round trip.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Feb 3, 2022
Extracted from cockroachdb#73732, with relevant comments addressed.

This commit adds support for server-side refreshes of
`ReadWithinUncertaintyIntervalError`s. This serves as a performance optimization
for transactional requests, which now benefit from this new capability to
refresh away `ReadWithinUncertaintyIntervalErrors` early in their transaction,
before they've accumulated any refresh spans. There's some complexity around
supporting this form of server-side retry, because it must be done above
latching, instead of below. However, the recent refactoring in cockroachdb#73557 has made
this possible to support cleanly.

This is also a prerequisite to giving non-transactional requests uncertainty
intervals (cockroachdb#73732), because we don't want ReadWithinUncertaintyIntervalErrors to
reach the client for non-transactional requests. Conveniently, because
non-transactional requests are always scoped to a single-range, those that hit
uncertainty errors will always be able to retry on the server, so these errors
will never bubble up to the client that initiated the request.

Release note (performance improvement): Certain forms of automatically retried
"read uncertainty" errors are now retried more efficiently, avoiding a network
round trip.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Feb 5, 2022
Extracted from cockroachdb#73732, with relevant comments addressed.

This commit adds support for server-side refreshes of
`ReadWithinUncertaintyIntervalError`s. This serves as a performance optimization
for transactional requests, which now benefit from this new capability to
refresh away `ReadWithinUncertaintyIntervalErrors` early in their transaction,
before they've accumulated any refresh spans. There's some complexity around
supporting this form of server-side retry, because it must be done above
latching, instead of below. However, the recent refactoring in cockroachdb#73557 has made
this possible to support cleanly.

This is also a prerequisite to giving non-transactional requests uncertainty
intervals (cockroachdb#73732), because we don't want ReadWithinUncertaintyIntervalErrors to
reach the client for non-transactional requests. Conveniently, because
non-transactional requests are always scoped to a single-range, those that hit
uncertainty errors will always be able to retry on the server, so these errors
will never bubble up to the client that initiated the request.

Release note (performance improvement): Certain forms of automatically retried
"read uncertainty" errors are now retried more efficiently, avoiding a network
round trip.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Feb 5, 2022
Extracted from cockroachdb#73732, with relevant comments addressed.

This commit adds support for server-side refreshes of
`ReadWithinUncertaintyIntervalError`s. This serves as a performance optimization
for transactional requests, which now benefit from this new capability to
refresh away `ReadWithinUncertaintyIntervalErrors` early in their transaction,
before they've accumulated any refresh spans. There's some complexity around
supporting this form of server-side retry, because it must be done above
latching, instead of below. However, the recent refactoring in cockroachdb#73557 has made
this possible to support cleanly.

This is also a prerequisite to giving non-transactional requests uncertainty
intervals (cockroachdb#73732), because we don't want ReadWithinUncertaintyIntervalErrors to
reach the client for non-transactional requests. Conveniently, because
non-transactional requests are always scoped to a single-range, those that hit
uncertainty errors will always be able to retry on the server, so these errors
will never bubble up to the client that initiated the request.

Release note (performance improvement): Certain forms of automatically retried
"read uncertainty" errors are now retried more efficiently, avoiding a network
round trip.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this pull request Feb 8, 2022
Extracted from cockroachdb#73732, with relevant comments addressed.

This commit adds support for server-side refreshes of
`ReadWithinUncertaintyIntervalError`s. This serves as a performance optimization
for transactional requests, which now benefit from this new capability to
refresh away `ReadWithinUncertaintyIntervalErrors` early in their transaction,
before they've accumulated any refresh spans. There's some complexity around
supporting this form of server-side retry, because it must be done above
latching, instead of below. However, the recent refactoring in cockroachdb#73557 has made
this possible to support cleanly.

Specifically, we now handle `ReadWithinUncertaintyIntervalError` as a concurrency
error in the `executeWithConcurrencyRetries` retry loop. This is different from
other server-side retries, which are hit during writes and can be handled without
releasing latches. This difference stems from the difference in how read and write
latches behave. Write latches protect their MVCC timestamp and any later time.
Meanwhile, read latches protect their MVCC timestamp and any earlier time. This
means that a request holding read latches that hits an uncertainty error can't
refresh without dropping those latches and acquiring new ones.

This is also a prerequisite to giving non-transactional requests uncertainty
intervals (cockroachdb#73732), because we don't want ReadWithinUncertaintyIntervalErrors to
reach the client for non-transactional requests. Conveniently, because
non-transactional requests are always scoped to a single-range, those that hit
uncertainty errors will always be able to retry on the server, so these errors
will never bubble up to the client that initiated the request.

Release note (performance improvement): Certain forms of automatically retried
"read uncertainty" errors are now retried more efficiently, avoiding a network
round trip.
craig bot pushed a commit that referenced this pull request Feb 8, 2022
75905: kv: support server-side refreshes of uncertainty errors r=nvanbenschoten a=nvanbenschoten

Extracted from #73732, with relevant comments addressed.

This commit adds support for server-side refreshes of
`ReadWithinUncertaintyIntervalError`s. This serves as a performance optimization
for transactional requests, which now benefit from this new capability to
refresh away `ReadWithinUncertaintyIntervalErrors` early in their transaction,
before they've accumulated any refresh spans. There's some complexity around
supporting this form of server-side retry, because it must be done above
latching, instead of below. However, the recent refactoring in #73557 has made
this possible to support cleanly.

Specifically, we now handle `ReadWithinUncertaintyIntervalError` as a concurrency
error in the `executeWithConcurrencyRetries` retry loop. This is different from
other server-side retries, which are hit during writes and can be handled without
releasing latches. This difference stems from the difference in how read and write
latches behave. Write latches protect their MVCC timestamp and any later time.
Meanwhile, read latches protect their MVCC timestamp and any earlier time. This
means that a request holding read latches that hits an uncertainty error can't
refresh without dropping those latches and acquiring new ones.

This is also a prerequisite to giving non-transactional requests uncertainty
intervals (#73732), because we don't want ReadWithinUncertaintyIntervalErrors to
reach the client for non-transactional requests. Conveniently, because
non-transactional requests are always scoped to a single-range, those that hit
uncertainty errors will always be able to retry on the server, so these errors
will never bubble up to the client that initiated the request.

Release note (performance improvement): Certain forms of automatically retried
"read uncertainty" errors are now retried more efficiently, avoiding a network
round trip.

Co-authored-by: Nathan VanBenschoten <[email protected]>
RajivTS pushed a commit to RajivTS/cockroach that referenced this pull request Mar 6, 2022
Extracted from cockroachdb#73732, with relevant comments addressed.

This commit adds support for server-side refreshes of
`ReadWithinUncertaintyIntervalError`s. This serves as a performance optimization
for transactional requests, which now benefit from this new capability to
refresh away `ReadWithinUncertaintyIntervalErrors` early in their transaction,
before they've accumulated any refresh spans. There's some complexity around
supporting this form of server-side retry, because it must be done above
latching, instead of below. However, the recent refactoring in cockroachdb#73557 has made
this possible to support cleanly.

Specifically, we now handle `ReadWithinUncertaintyIntervalError` as a concurrency
error in the `executeWithConcurrencyRetries` retry loop. This is different from
other server-side retries, which are hit during writes and can be handled without
releasing latches. This difference stems from the difference in how read and write
latches behave. Write latches protect their MVCC timestamp and any later time.
Meanwhile, read latches protect their MVCC timestamp and any earlier time. This
means that a request holding read latches that hits an uncertainty error can't
refresh without dropping those latches and acquiring new ones.

This is also a prerequisite to giving non-transactional requests uncertainty
intervals (cockroachdb#73732), because we don't want ReadWithinUncertaintyIntervalErrors to
reach the client for non-transactional requests. Conveniently, because
non-transactional requests are always scoped to a single-range, those that hit
uncertainty errors will always be able to retry on the server, so these errors
will never bubble up to the client that initiated the request.

Release note (performance improvement): Certain forms of automatically retried
"read uncertainty" errors are now retried more efficiently, avoiding a network
round trip.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants