storage: refactor mvcc write parameters #107680

AlexTalks · 2023-07-27T04:50:09Z

This change introduces `MVCCWriteOptions`, a structure that bundles the parameters of `MVCCPut`, `MVCCDelete`, and their many variants, and refactors callers of these functions across the code base to move the existing function arguments into this structure. Besides letting many callers stop specifying default values, this makes it possible to pass new flags to write operations, such as the replay protection needed to address #103817.

Part of: #103817

Release note: None
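To illustrate the shape of the refactor, here is a minimal sketch using simplified stand-in types. `Timestamp`, `Txn`, `Stats`, `WriteOptions`, `store`, `putOld`, and `put` are hypothetical names for this example only and do not reproduce the real `storage` signatures; the only details taken from this PR are that the options struct carries a `Stats` field and replaces the `txn` parameter.

```go
package main

import (
	"errors"
	"fmt"
)

// Timestamp, Txn, and Stats are simplified stand-ins, not the real
// hlc.Timestamp, transaction, or enginepb.MVCCStats types.
type Timestamp int64
type Txn struct{ ID string }
type Stats struct{ KeyCount int64 }

// WriteOptions is a hypothetical analogue of MVCCWriteOptions. Its zero
// value means "non-transactional write, no stats tracking".
type WriteOptions struct {
	Txn   *Txn
	Stats *Stats
}

// store is a toy in-memory map standing in for the storage engine.
type store map[string]string

// putOld mirrors the pre-refactor shape: optional arguments are positional,
// so every caller must spell out nil for the ones it does not use.
func putOld(s store, stats *Stats, key string, ts Timestamp, value string, txn *Txn) error {
	return put(s, key, ts, value, WriteOptions{Txn: txn, Stats: stats})
}

// put mirrors the post-refactor shape: optional arguments live in opts, and
// new flags can be added to WriteOptions without touching existing callers.
func put(s store, key string, ts Timestamp, value string, opts WriteOptions) error {
	if key == "" {
		return errors.New("empty key")
	}
	_, existed := s[key]
	s[key] = value
	if opts.Stats != nil && !existed {
		opts.Stats.KeyCount++
	}
	return nil
}

func main() {
	s := store{}
	// Old style: defaults must be passed explicitly.
	_ = putOld(s, nil, "a", 1, "x", nil)
	// New style: the zero-value options struct covers the defaults.
	_ = put(s, "b", 2, "y", WriteOptions{})
	// Callers that need an option set only that field.
	stats := &Stats{}
	_ = put(s, "c", 3, "z", WriteOptions{Stats: stats})
	fmt.Println(len(s), stats.KeyCount)
}
```

The design choice this sketch is meant to show is that adding a field to the options struct is a non-breaking change for callers, whereas adding a positional parameter would require editing every call site again.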

nvanbenschoten

Reviewed 81 of 81 files at r1, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks and @itsbilal)

pkg/kv/kvserver/batcheval/cmd_conditional_put.go line 54 at r1 (raw file):

	h := cArgs.Header

	opts := storage.MVCCWriteOptions{

nit: move below the assignments of ts and handleMissing to mirror the order of the args.

pkg/kv/kvserver/batcheval/cmd_init_put.go line 35 at r1 (raw file):

	h := cArgs.Header

	opts := storage.MVCCWriteOptions{

Same point about ordering here and in a few other files. Just to stay consistent.

pkg/storage/mvcc.go line 3018 at r1 (raw file):

	max int64,
	timestamp hlc.Timestamp,
	opts MVCCWriteOptions,

nit: did you want to order this below returnKeys?

pkg/storage/bench_test.go line 1187 at r1 (raw file):

				hlc.MaxTimestamp,
				MVCCWriteOptions{
					Stats: &enginepb.MVCCStats{},

I know the previous code was constructing a stats object, but I think we can just omit the stats here.

erikgrinaker

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @AlexTalks, @itsbilal, and @nvanbenschoten)

pkg/storage/bench_test.go line 1187 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I know the previous code was constructing a stats object, but I think we can just omit the stats here.

We should pass stats here, to include the cost of stats computations in benchmarks. Passing nil will often disable stats computations entirely. This cost isn't always negligible, particularly in the case of MVCC range tombstones where it involves additional seeks to find boundary conditions.

Consider adding a comment to this effect.
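As a rough illustration of the point above, the following sketch shows how a nil stats pointer can skip the accounting path entirely, so a benchmark that omits it would not measure that work. The `engine` type, its `statsSeeks` counter, and the `WriteOptions` struct here are hypothetical stand-ins, not the real storage code.

```go
package main

import "fmt"

// Stats is a simplified stand-in for an MVCC stats object.
type Stats struct{ LiveBytes int64 }

// WriteOptions is a hypothetical options struct with an optional Stats field.
type WriteOptions struct{ Stats *Stats }

// engine is a toy store that counts the extra lookups done for stats.
type engine struct {
	data       map[string]string
	statsSeeks int
}

// put writes a value. Stats accounting, including the extra lookup of the
// previous value, only happens when the caller supplies a Stats object.
func (e *engine) put(key, value string, opts WriteOptions) {
	if opts.Stats != nil {
		e.statsSeeks++ // stand-in for the additional seeks stats may need
		old := e.data[key]
		opts.Stats.LiveBytes += int64(len(value) - len(old))
	}
	e.data[key] = value
}

func main() {
	e := &engine{data: map[string]string{}}

	// Without stats: the accounting path is never exercised.
	e.put("a", "xx", WriteOptions{})
	fmt.Println("seeks without stats:", e.statsSeeks)

	// With stats: the write also pays for the accounting work.
	st := &Stats{}
	e.put("a", "xxxx", WriteOptions{Stats: st})
	fmt.Println("seeks with stats:", e.statsSeeks, "live bytes delta:", st.LiveBytes)
}
```

In a benchmark built on this toy engine, passing `WriteOptions{}` would time only the map write, which is the kind of under-measurement the comment warns about.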

AlexTalks

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @itsbilal and @nvanbenschoten)

pkg/storage/mvcc.go line 3018 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: did you want to order this below returnKeys?

For the most part, it made the most sense to me to have it replace the `txn` parameter, since I think that's the most common non-nil parameter that it encapsulates.

AlexTalks · 2023-07-29T00:02:21Z

bors r+

craig · 2023-07-29T01:19:36Z

Build succeeded:

Bazel Essential CI (Cockroach)

kv: enable replay protection for ambiguous writes on commits

Previously, RPC failures were assumed to be retriable, because write operations (with the notable exception of `EndTxn`) were assumed to be idempotent. However, as seen in cockroachdb#67765 and documented in cockroachdb#103817, when an RPC failure occurs on a write operation that runs in parallel with a commit (i.e. a partial batch where `withCommit==true`), it is not always possible to assume idempotency and retry the "ambiguous" writes. The retried write RPC could cause the transaction's `WriteTimestamp` to be bumped, changing the commit timestamp of a transaction that may in fact already be implicitly committed if the initial "ambiguous" write actually succeeded.

This change modifies the DistSender protocol to flag, in subsequent retries, that a batch with a commit has previously experienced ambiguity. It also modifies the handling of the retried write in the MVCC layer to detect this previous ambiguity and reject retries that change the write timestamp as non-idempotent replays. The flag allows subsequent retries to "remember" the earlier ambiguous write and evaluate accordingly.

The flag allows us to properly handle RPC failures (i.e. ambiguous writes) that occur on commit. A transaction that is implicitly committed is eligible to be marked as explicitly committed by contending transactions via the `RecoverTxn` operation. This creates a race between retries by the transaction coordinator and recovery by contending transactions. The race could result either in incorrectly reporting a transaction as having failed with a `RETRY_SERIALIZABLE` error (despite the possibility that it already was or could be recovered and successfully committed), or in attempting to explicitly commit an already-recovered and committed transaction, which triggers an assertion failure due to `transaction unexpectedly committed`.

The replay protection introduced here allows us to avoid both of these situations by detecting a replay that should be considered non-idempotent and returning an error, causing the original RPC error remembered by the DistSender to be propagated as an `AmbiguousResultError`. Application code can handle this error by validating whether the transaction succeeded or failed.

Depends on cockroachdb#107680, cockroachdb#107323, cockroachdb#108154, cockroachdb#108001

Fixes: cockroachdb#103817

Release note (bug fix): Properly handles RPC failures on writes using the parallel commit protocol that execute in parallel to the commit operation, avoiding incorrect retriable failures and `transaction unexpectedly committed` assertions by detecting when writes cannot be retried idempotently, instead returning an `AmbiguousResultError`.
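The replay check described above can be loosely sketched as follows. This is a toy model under stated assumptions, not the MVCC implementation: the `intent` record, the `ReplayProtection` field name, the `errNonIdempotentReplay` error, and the rule for matching a replay (same transaction ID and sequence number) are all hypothetical choices made for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

type Timestamp int64

// intent is a toy record of a write already laid down by a transaction.
type intent struct {
	txnID string
	seq   int
	ts    Timestamp
}

// WriteOptions is a hypothetical options struct. ReplayProtection stands in
// for the flag that marks a retry of a previously ambiguous write.
type WriteOptions struct {
	TxnID            string
	Seq              int
	ReplayProtection bool
}

var errNonIdempotentReplay = errors.New("replay would change write timestamp")

// put applies a write at ts. When the same transaction and sequence number
// already wrote the key, the call is treated as a replay. With protection
// enabled, a replay at a different timestamp is rejected instead of applied.
func put(intents map[string]intent, key string, ts Timestamp, opts WriteOptions) error {
	prev, ok := intents[key]
	if ok && prev.txnID == opts.TxnID && prev.seq == opts.Seq {
		if opts.ReplayProtection && prev.ts != ts {
			return errNonIdempotentReplay
		}
		// Unprotected replay: the toy store silently moves the write.
		prev.ts = ts
		intents[key] = prev
		return nil
	}
	intents[key] = intent{txnID: opts.TxnID, seq: opts.Seq, ts: ts}
	return nil
}

func main() {
	intents := map[string]intent{}

	// Initial write whose RPC result is assumed to have been lost.
	_ = put(intents, "k", 10, WriteOptions{TxnID: "t1", Seq: 1})

	// Retry at the same timestamp is idempotent and succeeds.
	fmt.Println(put(intents, "k", 10, WriteOptions{TxnID: "t1", Seq: 1, ReplayProtection: true}))

	// Retry at a bumped timestamp is rejected, leaving the write untouched.
	fmt.Println(put(intents, "k", 20, WriteOptions{TxnID: "t1", Seq: 1, ReplayProtection: true}))
}
```

In this model the caller that remembered the original RPC failure would surface it as an ambiguous result when it sees the rejection, rather than retrying again at a new timestamp.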

AlexTalks requested review from a team as code owners July 27, 2023 04:50

AlexTalks requested a review from a team July 27, 2023 04:50

AlexTalks requested a review from itsbilal July 27, 2023 04:50

AlexTalks requested a review from nvanbenschoten July 27, 2023 04:50

AlexTalks mentioned this pull request Jul 27, 2023

kv: enable replay protection for ambiguous writes on commits #107658

Merged

nvanbenschoten approved these changes Jul 28, 2023

erikgrinaker reviewed Jul 28, 2023

AlexTalks force-pushed the refactor_mvcc_put_options branch from a583955 to 2b18f4f Compare July 28, 2023 21:30

AlexTalks commented Jul 28, 2023

craig bot merged commit f295bd8 into cockroachdb:master Jul 29, 2023

AlexTalks mentioned this pull request Sep 16, 2023

release-23.1: kv, storage: fix handling of ambiguous failures on commit #110757

Closed

AlexTalks mentioned this pull request Sep 21, 2023

release-23.1: kv, storage: fix handling of ambiguous failures on commit #111017

Closed

AlexTalks mentioned this pull request Oct 5, 2023

release-23.1: storage: refactor mvcc write parameters #111870

Merged

AlexTalks mentioned this pull request Oct 5, 2023

release-23.1: kv: enable replay protection for ambiguous writes on commits #111876

Merged

pav-kv mentioned this pull request Nov 2, 2023

kvserver: 23.2 replication microbenchmark regressions #111561

Closed
