Skip to content

Commit

Permalink
storage: tolerate missing transaction records when pushing
Browse files Browse the repository at this point in the history
Informs #25437.
Informs #32971.

This is the second part of addressing #32971. The final part will be updating
the txn client to stop sending BeginTxn requests.

The change closely resembles what is laid out in the corresponding RFC sections
(`PushTxn` and `QueryTxn / txnwait.Queue`). Both `PushTxn` requests and `QueryTxn`
requests are adjusted to tolerate missing transaction records and to synthesize
them based on the information pulled from intents if necessary. By hiding these
details behind the request API abstraction, we don't actually need to modify
the `txnwait.Queue` at all!

The change does diverge from the RFC in two distinct ways. The first is that it
introduces a new invariant about transaction record creation. Transaction can
now only ever be created by requests originating from their own coordinator
(`BeginTxn`, `HeartbeatTxn`, and `EndTxn`). They can never be created in any
state (not even ABORTED) by concurrent actors. This is actually a much stronger
invariant than what previously existed (and even what was proposed in the RFC),
and it simplifies a lot of logic by dramatically refining the transaction state
machine. We now know that for a transaction record to exist, it must have been
created by its own coordinator and it must have checked with `CanCreateTxnRecord`.

Making this new invariant work required the second major divergence from the
RFC. The RFC suggested that we use the read timestamp cache for successful
transaction timestamp pushes before a transaction record was written. This PR
takes this a step further and uses the write timestamp cache for successful
transaction aborts before a transaction record was written. In doing this, we
emerge with a very sane timestamp cache policy - the read timestamp cache is
used to push transaction timestamps and the write timestamp cache is used to
abort them entirely (which it was already essentially being used for). The txnID
marker on these timestamp cache entries then becomes the transaction that
initiated the push/abort. Transactions then consult both of these sources before
creating their transaction record in `CanCreateTxnRecord`. In doing this, we
avoid an entire class of situations where concurrent transactions created and
abandoned transaction records for another transaction, requiring eventual GC.

Release note: None
  • Loading branch information
nvanbenschoten committed Jan 10, 2019
1 parent 73397b4 commit 3ac97e6
Show file tree
Hide file tree
Showing 19 changed files with 828 additions and 518 deletions.
2 changes: 1 addition & 1 deletion docs/RFCS/20181209_lazy_txn_record_creation.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
- Feature Name: Lazy Transaction Record Creation (a.k.a Deprecate BeginTransaction)
- Status: draft
- Status: in-progress
- Start Date: 2018-12-09
- Authors: Nathan VanBenschoten
- RFC PR: #32971
Expand Down
9 changes: 7 additions & 2 deletions pkg/roachpb/api.go
Original file line number Diff line number Diff line change
Expand Up @@ -1028,8 +1028,13 @@ func (*AdminChangeReplicasRequest) flags() int { return isAdmin | isAlone }
func (*AdminRelocateRangeRequest) flags() int { return isAdmin | isAlone }
func (*HeartbeatTxnRequest) flags() int { return isWrite | isTxn }
func (*GCRequest) flags() int { return isWrite | isRange }
func (*PushTxnRequest) flags() int { return isWrite | isAlone }
func (*QueryTxnRequest) flags() int { return isRead | isAlone }

// PushTxnRequest updates the read timestamp cache when pushing a transaction's
// timestamp and updates the write timestamp cache when aborting a transaction.
func (*PushTxnRequest) flags() int {
return isWrite | isAlone | updatesReadTSCache | updatesWriteTSCache
}
func (*QueryTxnRequest) flags() int { return isRead | isAlone }

// QueryIntent only updates the read timestamp cache when attempting
// to prevent an intent that is found missing from ever being written
Expand Down
240 changes: 120 additions & 120 deletions pkg/roachpb/api.pb.go

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions pkg/roachpb/api.proto
Original file line number Diff line number Diff line change
Expand Up @@ -742,8 +742,8 @@ message PushTxnRequest {
// this to PUSH_TOUCH to determine whether the pushee can be aborted
// due to inactivity (based on the now field).
PushTxnType push_type = 6;
// Forces the push by overriding the normal checks in PushTxn to
// either abort or push the timestamp.
// Forces the push by overriding the normal expiration and priority checks
// in PushTxn to either abort or push the timestamp.
bool force = 7;

reserved 8;
Expand Down
102 changes: 51 additions & 51 deletions pkg/roachpb/errors.pb.go

Large diffs are not rendered by default.

22 changes: 11 additions & 11 deletions pkg/roachpb/errors.proto
Original file line number Diff line number Diff line change
Expand Up @@ -123,12 +123,12 @@ enum TransactionAbortedReason {
// For backwards compatibility.
ABORT_REASON_UNKNOWN = 0;

// A BeginTransaction or EndTransaction(commit=true) request found an aborted
// transaction record. Another txn must have written this record - that other
// txn probably ran into one of our intents written before BeginTransaction
// (on a different range) and pushed it successfully. Or, a high-priority
// transaction simply pushed us, or we failed to heartbeat for a while and
// another txn (of any priority) considered us abandoned and pushed us.
// A BeginTransaction, HeartbeatTxn, or EndTransaction(commit=true) request
// found an aborted transaction record. Another txn must have written this
// record - that other txn probably ran into one of our intents and pushed our
// transaction record successfully. Either a high-priority transaction simply
// pushed us or we failed to heartbeat for a while and another txn (of any
// priority) considered us abandoned and pushed us.
ABORT_REASON_ABORTED_RECORD_FOUND = 1;

// The request attempting to create a transaction record has a timestamp below
Expand All @@ -152,11 +152,11 @@ enum TransactionAbortedReason {
// meantime because the transaction was aborted.
ABORT_REASON_ABORT_SPAN = 5;

// An EndTransaction encountered a timestamp cache entry for the txn key, and
// the entry identifies this transaction. This means that the transaction
// definitely committed or rolled back before.
// So, this EndTransaction is either a delayed replay of some sort, or it
// raced with an async abort and lost. If a client gets this
// A request attempting to create a transaction record encountered a write
// timestamp cache entry for the txn key, and the entry identifies this
// transaction. This means that the transaction definitely committed or rolled
// back before. So, this request is either a delayed replay of some sort, or
// it raced with an async abort and lost. If a client gets this
// TransactionAbortedError (without it being wrapped in an ambiguous error),
// it must be the latter case, and the transaction can be retried.
ABORT_REASON_ALREADY_COMMITTED_OR_ROLLED_BACK_POSSIBLE_REPLAY = 6;
Expand Down
4 changes: 2 additions & 2 deletions pkg/storage/batcheval/cmd_begin_transaction.go
Original file line number Diff line number Diff line change
Expand Up @@ -118,8 +118,8 @@ func BeginTransaction(
}

// Verify that it is safe to create the transaction record.
if ok, reason := cArgs.EvalCtx.CanCreateTxnRecord(reply.Txn); !ok {
return result.Result{}, roachpb.NewTransactionAbortedError(reason)
if err := CanCreateTxnRecord(cArgs.EvalCtx, reply.Txn); err != nil {
return result.Result{}, err
}

// Write the txn record.
Expand Down
16 changes: 8 additions & 8 deletions pkg/storage/batcheval/cmd_end_transaction.go
Original file line number Diff line number Diff line change
Expand Up @@ -186,8 +186,8 @@ func evalEndTransaction(
// to perform this verification for commits. Rollbacks can always write
// an aborted txn record.
if args.Commit {
if ok, reason := cArgs.EvalCtx.CanCreateTxnRecord(reply.Txn); !ok {
return result.Result{}, roachpb.NewTransactionAbortedError(reason)
if err := CanCreateTxnRecord(cArgs.EvalCtx, reply.Txn); err != nil {
return result.Result{}, err
}
}
} else {
Expand Down Expand Up @@ -350,12 +350,12 @@ func evalEndTransaction(
// subsequent replay of this EndTransaction call because the txn timestamp
// will be too old. Replays of requests which attempt to create a new txn
// record (BeginTransaction, HeartbeatTxn, or EndTransaction) never succeed
// because EndTransaction inserts in the write timestamp cache, forcing the
// call to CanCreateTxnRecord to return false, resulting in a transaction
// retry error. If the replay didn't attempt to create a txn record, any
// push will immediately succeed as a missing txn record on push where
// CanCreateTxnRecord returns false succeeds. In both cases, the txn will
// be GC'd on the slow path.
// because EndTransaction inserts in the write timestamp cache in Replica's
// updateTimestampCache method, forcing the call to CanCreateTxnRecord to
// return false, resulting in a transaction retry error. If the replay
// didn't attempt to create a txn record, any push will immediately succeed
// as a missing txn record on push where CanCreateTxnRecord returns false
// succeeds. In both cases, the txn will be GC'd on the slow path.
//
// We specify alwaysReturn==false because if the commit fails below Raft, we
// don't want the intents to be up for resolution. That should happen only
Expand Down
4 changes: 2 additions & 2 deletions pkg/storage/batcheval/cmd_heartbeat_txn.go
Original file line number Diff line number Diff line change
Expand Up @@ -72,8 +72,8 @@ func HeartbeatTxn(
}

// Verify that it is safe to create the transaction record.
if ok, reason := cArgs.EvalCtx.CanCreateTxnRecord(&txn); !ok {
return result.Result{}, roachpb.NewTransactionAbortedError(reason)
if err := CanCreateTxnRecord(cArgs.EvalCtx, &txn); err != nil {
return result.Result{}, err
}
}

Expand Down
Loading

0 comments on commit 3ac97e6

Please sign in to comment.