Commit
storage: implement intent resolution using LockTableIterator
Informs #109648.

This commit implements intent resolution (point and ranged) using a `LockTableIterator`, configured to return all locks for the transaction being resolved and no locks from other transactions. This is the first step towards releasing replicated locks during intent resolution.

While switching to a `LockTableIterator`, the commit is also able to remove `separatedIntentAndVersionIter`, `iterForKeyVersions` and `mvccGetIntent`, which were all used to avoid handing an `MVCCMetadata` directly to `mvccResolveWriteIntent`. Instead of continuing to treat intents as interleaved, we switch to handling intents entirely separately from their provisional value during intent resolution, which avoids jumping through these hoops and makes the code simpler.

The change to `TestMVCCResolveTxnRangeResumeWithManyVersions` is immaterial and has to do with the transaction ID filter being applied before the key limit (inside `LockTableIterator`), instead of after. The new behavior is actually better.

----

One concern I have about this change is that it removes the call to `SeekIntentGE` in `MVCCResolveWriteIntent`, which was added in d1c91e0 to guard against the case where many Pebble tombstones from prior intents on a key surround the intent being resolved.

Conceptually, we'd like to push optimizations that avoid scanning over these tombstones into the `LockTableIterator`, like we plan to do for skipping over non-conflicting locks. Doing so would benefit all lock strengths. It would also benefit the case where an intent is not found and the seek hits tombstones from prior intents on later versions.

However, it's not clear how to do this with the current Pebble API. Pebble exposes a `SeekGEWithLimit` method, but this "limit" value is expressed as a key and not as a number of steps. How would we construct a limit key to bound the number of tombstones a seek observes before seeking directly to a specific (txn_id, lock_strength) version?
One option would be to seek to specific versions in the `LockTableIterator` when advancing the iterator in cases where the iterator is configured to match a specific txn ID. For example, performing the following translations:

```
SeekGE({Key: k}) -> SeekGE({Key: k, Strength: Intent, TxnID: <txn_id>})
Next()           -> SeekGE({Key: k, Strength: Exclusive, TxnID: <txn_id>})
Next()           -> SeekGE({Key: k, Strength: Shared, TxnID: <txn_id>})
```

Of course, this gets more complicated when some of these locks are not found and the iterator advances past them while seeking. In such cases, we're back to paying the cost of scanning over the tombstones. If we knew which lock strengths we had acquired on a key, we could avoid some of this cost, but that would require API changes and client buy-in to track lock spans on a per-strength basis.

I'll capture the impact of this change on the following benchmarks and evaluate:

* BenchmarkIntentResolution
* BenchmarkIntentRangeResolution
* BenchmarkIntentScan

Release note: None