-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
storage: allow transactions to run at a lower sequence #33001
storage: allow transactions to run at a lower sequence #33001
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 6 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/storage/replica.go, line 3429 at r1 (raw file):
For other reviewers - Ridwan and I discussed this and decided it was fine to remove the assertion.
I added that assertion in when trying to smoke out any operations that would violate the invariants necessary for transactional pipelining to work properly. Now that everything’s in, I think I’d be ok with removing the assertion entirely
It already isn’t foolproof
pkg/storage/replica_test.go, line 2894 at r1 (raw file):
} // TestReplicaAbortSpanStoredTxnRetryError verifies that if a cached
Does this test need a name and a new description?
pkg/storage/engine/mvcc.go, line 1238 at r1 (raw file):
(txn.Sequence < meta.Txn.Sequence || (txn.Sequence == meta.Txn.Sequence && txn.DeprecatedBatchIndex <= meta.Txn.DeprecatedBatchIndex)) {
If you feel up to it, we can remove all references to this DeprecatedBatchIndex
now.
Then we can make this condition if txn.Epoch == meta.Txn.Epoch && txn.Sequence <= meta.Txn.Sequence
.
pkg/storage/engine/mvcc.go, line 1241 at r1 (raw file):
// Find the previous value at this key so the current value can be computed. // If a previous intent value doesn't exist, get the last versioned value // and apply the valueFn where necessary.
Feel free to expand this comment past what we're doing. Explain why we're doing this too! It's a little non-intuitive if you stumble into it.
pkg/storage/engine/mvcc.go, line 1243 at r1 (raw file):
// and apply the valueFn where necessary. prevSeq, prevValueWritten := meta.GetPrevIntentSeq(txn.Sequence) if prevValueWritten {
Add comments to these cases that explain what they logically mean. Comments like:
// There was an intent on this key at a lower sequence number that this write.
and
// This was the first write in the transaction to this key.
As a more macro level nit, I this would be easier to narrate in comments if you did writtenValue, found := meta.GetIntentValue(txn.Sequence)
first. The flow would then be:
- find the result of this write, erroring if it isn't found
- find the value that preceded it
- verify that a re-evaluation would end up with the same result
- conclude that nothing needs to be done because we're idempotent!
This reveals that looking at the previous value is only necessary if valueFn != nil
.
pkg/storage/engine/mvcc.go, line 1263 at r1 (raw file):
var exVal *roachpb.Value var err error if exVal, _, _, err = mvccGetInternal(
Explain why this is inconsistent. You mentioned on slack that you thought that was incorrect. I'm curious why. It seems right to me.
pkg/storage/engine/mvcc.go, line 1264 at r1 (raw file):
var err error if exVal, _, _, err = mvccGetInternal( ctx, iter, metaKey, timestamp, false /* consistent */, safeValue, txn, getBuf); err != nil {
Can we pass unsafeValue
here?
pkg/storage/engine/mvcc.go, line 1274 at r1 (raw file):
} // Assert that the value computed is the same as the value written previously. writtenValue, found := meta.GetIntentValue(txn.Sequence)
Does this handle the case where we replay the last intent written, before it makes it into the intent history?
pkg/storage/engine/mvcc_test.go, line 3303 at r1 (raw file):
sequence int32 batchIndex int32 expRetry bool
Looks like we never retry anymore, which is exactly what we want!
There are two new variants of tests that we need, for the "different value after recomputing" error and for the "missing an intent with lower sequence" error.
pkg/storage/engine/enginepb/mvcc.go, line 140 at r1 (raw file):
// GetPrevIntentSeq goes through the intent history and finds the previous // intents sequence number given the current sequence
"intent's"
Also, period at the end.
pkg/storage/engine/enginepb/mvcc.go, line 142 at r1 (raw file):
// intents sequence number given the current sequence func (meta *MVCCMetadata) GetPrevIntentSeq(seq int32) (prevSeq int32, found bool) { for i := len(meta.IntentHistory) - 1; i > 0; i-- {
Think it's worth performing a binary search here?
pkg/storage/engine/enginepb/mvcc.go, line 145 at r1 (raw file):
intent := meta.IntentHistory[i] if seq == intent.Sequence { prevSeq = meta.IntentHistory[i-1].Sequence
nit: why not just return here?
pkg/storage/engine/enginepb/mvcc.go, line 158 at r1 (raw file):
for _, intent := range meta.IntentHistory { if seq == intent.Sequence { found = true
Same nit.
d2c0019
to
79a357a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/storage/replica.go, line 3429 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
For other reviewers - Ridwan and I discussed this and decided it was fine to remove the assertion.
I added that assertion in when trying to smoke out any operations that would violate the invariants necessary for transactional pipelining to work properly. Now that everything’s in, I think I’d be ok with removing the assertion entirely
It already isn’t foolproof
Ack.
pkg/storage/replica_test.go, line 2894 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Does this test need a name and a new description?
Done.
pkg/storage/engine/mvcc.go, line 1238 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
If you feel up to it, we can remove all references to this
DeprecatedBatchIndex
now.Then we can make this condition
if txn.Epoch == meta.Txn.Epoch && txn.Sequence <= meta.Txn.Sequence
.
Done. See 2nd commit.
pkg/storage/engine/mvcc.go, line 1241 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Feel free to expand this comment past what we're doing. Explain why we're doing this too! It's a little non-intuitive if you stumble into it.
Done.
pkg/storage/engine/mvcc.go, line 1243 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Add comments to these cases that explain what they logically mean. Comments like:
// There was an intent on this key at a lower sequence number that this write. and // This was the first write in the transaction to this key.
As a more macro level nit, I this would be easier to narrate in comments if you did
writtenValue, found := meta.GetIntentValue(txn.Sequence)
first. The flow would then be:
- find the result of this write, erroring if it isn't found
- find the value that preceded it
- verify that a re-evaluation would end up with the same result
- conclude that nothing needs to be done because we're idempotent!
This reveals that looking at the previous value is only necessary if
valueFn != nil
.
Done.
pkg/storage/engine/mvcc.go, line 1263 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Explain why this is inconsistent. You mentioned on slack that you thought that was incorrect. I'm curious why. It seems right to me.
Elaborated here. using nil
for the txn
param tho. Could you confirm if this is correct though? Seems alright to me.
pkg/storage/engine/mvcc.go, line 1264 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Can we pass
unsafeValue
here?
Done.
pkg/storage/engine/mvcc.go, line 1274 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Does this handle the case where we replay the last intent written, before it makes it into the intent history?
Done.
pkg/storage/engine/mvcc_test.go, line 3303 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Looks like we never retry anymore, which is exactly what we want!
There are two new variants of tests that we need, for the "different value after recomputing" error and for the "missing an intent with lower sequence" error.
Done. See the test I added.
pkg/storage/engine/enginepb/mvcc.go, line 140 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
"intent's"
Also, period at the end.
Done.
pkg/storage/engine/enginepb/mvcc.go, line 142 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Think it's worth performing a binary search here?
Well its always tiny, may not be worth it but lets leave no stone unturned. Done.
pkg/storage/engine/enginepb/mvcc.go, line 158 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Same nit.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just nits now, the logic all looks good.
@tbg might be interested in reviewing this too.
Reviewed 4 of 5 files at r2, 6 of 8 files at r3.
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/storage/replica.go, line 6380 at r3 (raw file):
// the TxnMeta, which is stored as part of an intent. ba.Txn.Sequence = seqNum } else {
I think we can remove this entire else
case and DisallowUnsequencedTransactionalWrites
.
pkg/storage/engine/mvcc.go, line 1238 at r1 (raw file):
Previously, ridwanmsharif (Ridwan Sharif) wrote…
Done. See 2nd commit.
👍
pkg/storage/engine/mvcc.go, line 1241 at r1 (raw file):
Previously, ridwanmsharif (Ridwan Sharif) wrote…
Done.
Nice comment!
pkg/storage/engine/mvcc.go, line 1263 at r1 (raw file):
Previously, ridwanmsharif (Ridwan Sharif) wrote…
Elaborated here. using
nil
for thetxn
param tho. Could you confirm if this is correct though? Seems alright to me.
Yep, it's good.
pkg/storage/engine/mvcc.go, line 1239 at r2 (raw file):
(txn.Sequence == meta.Txn.Sequence && txn.DeprecatedBatchIndex <= meta.Txn.DeprecatedBatchIndex)) { // Since transactions should be idempotent, we must be particularly careful
Do you think all of this logic should be pulled into a separate function? It would allow you to test the error cases more closely.
pkg/storage/engine/mvcc.go, line 1252 at r2 (raw file):
// Ensure all intents are found and that the values are always accurate for // transactional idempotency. writtenValue, found := meta.GetIntentValue(txn.Sequence)
Let's either not call this method at all when txn.Sequence == meta.Txn.Sequence
or assert that found == false
when txn.Sequence == meta.Txn.Sequence
. FWIW, I don't really care about the cost of calling GetIntentValue
, I'm just concerned about how this reads and how understandable it is.
pkg/storage/engine/mvcc.go, line 1283 at r2 (raw file):
prevVal, _ := meta.GetIntentValue(prevSeq) // TODO(ridwanmsharif): Confirm the timestamp is correct. value, err = valueFn(&roachpb.Value{RawBytes: prevVal, Timestamp: txn.Timestamp})
This is interesting. Do we ever use the timestamp? If not, I'd leave it out entirely.
pkg/storage/engine/mvcc.go, line 1316 at r2 (raw file):
// written. if !bytes.Equal(value, writtenValue) { return errors.Errorf("transaction %s has a different value %+v after recomputing from what was written: %+v",
Include the sequence number in this error.
pkg/storage/engine/mvcc.go, line 1235 at r3 (raw file):
return errors.Errorf("put with epoch %d came after put with epoch %d in txn %s", txn.Epoch, meta.Txn.Epoch, txn.ID) } else if txn.Epoch == meta.Txn.Epoch &&
minor nit: can we remove the line wrap now?
pkg/storage/engine/mvcc_test.go, line 4435 at r2 (raw file):
// Lay down an intent without increasing the sequence but with a different value. if err := MVCCPut(ctx, engine, nil, key, ts1, newValue, txn); err != nil { if !strings.Contains(err.Error(), "has a different value") {
Replace these type of assertions with testutils.IsError
/testutils.IsPError
.
pkg/storage/engine/enginepb/mvcc.go, line 148 at r2 (raw file):
}) if index > 0 && index < len(meta.IntentHistory) && meta.IntentHistory[index].Sequence == seq {
Should we require that a sequence equal to seq
exist? Seems a little overly restrictive.
6d78871
to
9a6c74d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR! Fixed all the nits!
Reviewable status: complete! 0 of 0 LGTMs obtained
pkg/storage/replica.go, line 6380 at r3 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I think we can remove this entire
else
case andDisallowUnsequencedTransactionalWrites
.
Done.
pkg/storage/engine/mvcc.go, line 1263 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Yep, it's good.
Done.
pkg/storage/engine/mvcc.go, line 1239 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Do you think all of this logic should be pulled into a separate function? It would allow you to test the error cases more closely.
Done.
pkg/storage/engine/mvcc.go, line 1252 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Let's either not call this method at all when
txn.Sequence == meta.Txn.Sequence
or assert thatfound == false
whentxn.Sequence == meta.Txn.Sequence
. FWIW, I don't really care about the cost of callingGetIntentValue
, I'm just concerned about how this reads and how understandable it is.
Done.
pkg/storage/engine/mvcc.go, line 1283 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This is interesting. Do we ever use the timestamp? If not, I'd leave it out entirely.
Done. Just used for MVCCIncrement
so we can get rid of it.
pkg/storage/engine/mvcc.go, line 1316 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Include the sequence number in this error.
Done.
pkg/storage/engine/mvcc_test.go, line 4435 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Replace these type of assertions with
testutils.IsError
/testutils.IsPError
.
Done.
pkg/storage/engine/enginepb/mvcc.go, line 148 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Should we require that a sequence equal to
seq
exist? Seems a little overly restrictive.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained
Adds support to have transactions run at a lower sequence at a given key. It asserts the value it computes with the value written for the sequence in the intent history instead or returning a retry-able error. Release note: None
Removes all references to the DeprecatedBatchIndex field. Release note: None
9a6c74d
to
e2459a2
Compare
bors r+ |
33001: storage: allow transactions to run at a lower sequence r=ridwanmsharif a=ridwanmsharif Continuing off #32688, final part of #5861 (comment). Adds support to have transactions run at a lower sequence at a given key. It asserts the value it computes with the value written for the sequence in the intent history instead or returning a retry-able error. Release note: None Co-authored-by: Ridwan Sharif <[email protected]>
Build succeeded |
…tches This change builds on the introduction of idempotency properties introduced into the MVCC layer in cockroachdb#33001. It exploits this property to remove a previous restriction that could prevent transactions from refreshing and result in transaction retries. The restriction was that batches which had experienced some success while writing could not refresh even if another part of the batch required a refresh. This was because it was unclear which parts of the batch had succeeded in performing writes and which parts of the batch had not. Without this knowledge, it was unsafe to re-issue the batch because that could result in duplicating writes (e.g. performing an increment twice). A lot has changed since then. We now have proper sequence numbers on requests, we no longer send `BeginTransaction` requests (which threw errors on replays), and we now have an MVCC layer that is idempotent for writes within a transaction. With this foundation in place, we can now safely re-issue any write within a transaction that we're unsure about. As long as writes remain well sequenced (writes to the same key aren't reordered), everything should behave as expected. The best way to see that the change works is through the test cases in `TestTxnCoordSenderRetries` that now succeed. Without cockroachdb#33001, those all fail. This change will lead to the removal of `MixedSuccessError`s in 19.2! Release note: None
…tches This change builds on the introduction of idempotency properties introduced into the MVCC layer in cockroachdb#33001. It exploits this property to remove a previous restriction that could prevent transactions from refreshing and result in transaction retries. The restriction was that batches which had experienced some success while writing could not refresh even if another part of the batch required a refresh. This was because it was unclear which parts of the batch had succeeded in performing writes and which parts of the batch had not. Without this knowledge, it was unsafe to re-issue the batch because that could result in duplicating writes (e.g. performing an increment twice). A lot has changed since then. We now have proper sequence numbers on requests, we no longer send `BeginTransaction` requests (which threw errors on replays), and we now have an MVCC layer that is idempotent for writes within a transaction. With this foundation in place, we can now safely re-issue any write within a transaction that we're unsure about. As long as writes remain well sequenced (writes to the same key aren't reordered), everything should behave as expected. The best way to see that the change works is through the test cases in `TestTxnCoordSenderRetries` that now succeed. Without cockroachdb#33001, those all fail. This change will lead to the removal of `MixedSuccessError`s in 19.2! Release note: None
…tches This change builds on the introduction of idempotency properties introduced into the MVCC layer in cockroachdb#33001. It exploits this property to remove a previous restriction that could prevent transactions from refreshing and result in transaction retries. The restriction was that batches which had experienced some success while writing could not refresh even if another part of the batch required a refresh. This was because it was unclear which parts of the batch had succeeded in performing writes and which parts of the batch had not. Without this knowledge, it was unsafe to re-issue the batch because that could result in duplicating writes (e.g. performing an increment twice). A lot has changed since then. We now have proper sequence numbers on requests, we no longer send `BeginTransaction` requests (which threw errors on replays), and we now have an MVCC layer that is idempotent for writes within a transaction. With this foundation in place, we can now safely re-issue any write within a transaction that we're unsure about. As long as writes remain well sequenced (writes to the same key aren't reordered), everything should behave as expected. The best way to see that the change works is through the test cases in `TestTxnCoordSenderRetries` that now succeed. Without cockroachdb#33001, those all fail. Because we can now refresh mixed-success batches, this commit allows us to remove the `MixedSuccessError` type entirely. No longer will it haunt our sender interface with its ambiguity and awkward error wrapping. No longer will it stand in the way of read refreshes with its ignorant stubbornness. The days of its tyranny are over. The war is won. Release note (performance improvement): Transactions are able to refresh their read timestamp even after the partial success of a batch.
35140: storage: employ transactional idempotency to refresh mixed-success batches r=nvanbenschoten a=nvanbenschoten This change builds on the introduction of idempotency properties introduced into the MVCC layer in #33001. It exploits this property to remove a previous restriction that could prevent transactions from refreshing and result in transaction retries. The restriction was that batches which had experienced some success while writing could not refresh even if another part of the batch required a refresh. This was because it was unclear which parts of the batch had succeeded in performing writes and which parts of the batch had not. Without this knowledge, it was unsafe to re-issue the batch because that could result in duplicating writes (e.g. performing an increment twice). A lot has changed since then. We now have proper sequence numbers on requests, we no longer send `BeginTransaction` requests (which threw errors on replays), and we now have an MVCC layer that is idempotent for writes within a transaction. With this foundation in place, we can now safely re-issue any write within a transaction that we're unsure about. As long as writes remain well-sequenced (writes to the same key aren't reordered), everything should behave as expected. The best way to see that the change works is through the test cases in `TestTxnCoordSenderRetries` that now succeed. Without #33001, those all fail. Co-authored-by: Nathan VanBenschoten <[email protected]>
Continuing off #32688, final part of #5861 (comment).
Adds support to have transactions run at a lower sequence
at a given key. It asserts the value it computes with the
value written for the sequence in the intent history instead
or returning a retry-able error.
Release note: None