storage: new error interface in `MVCCIterator` #82589

erikgrinaker · 2022-06-08T08:49:19Z

Currently, key decoding errors are silently discarded in e.g. pebbleIterator.UnsafeKey(), because the interface is infallible. We also need to call Valid() repeatedly, both during iteration and on every call to HasPointAndRange() (which must return false when Valid() returns false), and this has additional cost. See e.g.

cockroach/pkg/storage/pebble_iterator.go

Lines 322 to 339 in 28e600d

    
    // The MVCCIterator interface is broken in that it silently discards
    // the error when UnsafeKey(), Key() are unable to parse the key as
    // an MVCCKey. This is especially problematic if the caller is
    // accidentally iterating into the lock table key space, since that
    // parsing will fail. We do a cheap check here to make sure we are
    // not in the lock table key space.
    //
    // TODO(sumeer): fix this properly by changing those method signatures.
    k := p.iter.Key()
    if len(k) == 0 {
    	return false, errors.Errorf("iterator encountered 0 length key")
    }
    // Last byte is the version length + 1 or 0.
    versionLen := int(k[len(k)-1])
    if versionLen == engineKeyVersionLockTableLen+1 {
    	p.mvccDone = true
    	return false, nil
    }

We should improve this by making the interface fallible. For example, this could mean getting rid of Valid(), making all positioning operations return (bool, error), and returning an error from decoding functions like Key() and RangeKeys(). We should also consider how to minimize the number of checks during typical iteration, and measure the effect this has on performance.
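To make the proposal concrete, here is a minimal sketch of what a fallible iterator could look like. It is not the actual MVCCIterator: the `FallibleIterator` interface, the `sliceIter` type, the `user@timestamp` key encoding, and the `collect` helper are all hypothetical names invented for illustration. The point is only that positioning calls return `(bool, error)` and that key decoding can report a failure instead of discarding it.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Key is a toy decoded key: a user key plus an integer timestamp.
type Key struct {
	User string
	TS   int
}

// FallibleIterator is a hypothetical interface shape, not CockroachDB's.
// Positioning methods report validity and error directly, so no separate
// Valid() call is needed, and Key() can surface decoding errors.
type FallibleIterator interface {
	First() (bool, error)
	Next() (bool, error)
	Key() (Key, error)
}

// sliceIter iterates over raw keys encoded as "user@timestamp".
type sliceIter struct {
	raw []string
	pos int
}

func (it *sliceIter) First() (bool, error) {
	it.pos = 0
	return it.pos < len(it.raw), nil
}

func (it *sliceIter) Next() (bool, error) {
	it.pos++
	return it.pos < len(it.raw), nil
}

// Key decodes the current raw key. It must only be called while the
// iterator is positioned at a valid entry.
func (it *sliceIter) Key() (Key, error) {
	raw := it.raw[it.pos]
	user, ts, ok := strings.Cut(raw, "@")
	if !ok {
		return Key{}, fmt.Errorf("malformed key %q", raw)
	}
	n, err := strconv.Atoi(ts)
	if err != nil {
		return Key{}, fmt.Errorf("malformed timestamp in key %q: %w", raw, err)
	}
	return Key{User: user, TS: n}, nil
}

// collect drains the iterator, propagating positioning and decoding errors.
func collect(it FallibleIterator) ([]Key, error) {
	var out []Key
	for ok, err := it.First(); ; ok, err = it.Next() {
		if err != nil {
			return nil, err
		}
		if !ok {
			return out, nil
		}
		k, err := it.Key()
		if err != nil {
			return nil, err
		}
		out = append(out, k)
	}
}
```

With this shape, iterating into a key that cannot be decoded (the analogue of wandering into the lock table key space) makes `collect` return an error rather than a silently empty key. Whether the extra error return on `Key()` costs anything measurable in the real iterator is exactly the kind of thing that would need benchmarking.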

The EngineIterator interface already does this, see it for examples. The interface should be discussed with storage before implementation.

Jira issue: CRDB-16523


blathers-crl · 2022-06-08T08:49:21Z

cc @cockroachdb/replication

erikgrinaker · 2022-06-16T12:45:55Z

As discussed in #82691 (review), while we're overhauling the interface here anyway, let's consider getting rid of Key() and Value(). It's trivial for callers to clone as needed, e.g. iter.UnsafeKey().Clone() or append(buf[:0], iter.UnsafeKey()...) depending.
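A rough illustration of the two copying options, using a toy iterator rather than the real one. Everything here is an assumption made for the example: `reusingIter`, `cloneKey`, and `demo` are hypothetical, and the key is simplified to a plain `[]byte` (the real UnsafeKey() returns an MVCCKey). The sketch shows why an unsafe key must be copied before the iterator moves, and how a fresh clone differs from copying into a reused buffer.

```go
package main

// reusingIter is a toy iterator that, like many storage iterators, reuses
// one internal buffer for the current key. The slice returned by UnsafeKey
// is only valid until the next positioning call.
type reusingIter struct {
	keys []string
	pos  int
	buf  []byte
}

func newReusingIter(keys ...string) *reusingIter {
	// Preallocate so that short keys are always written in place.
	return &reusingIter{keys: keys, pos: -1, buf: make([]byte, 0, 16)}
}

func (it *reusingIter) Next() bool {
	it.pos++
	if it.pos >= len(it.keys) {
		return false
	}
	it.buf = append(it.buf[:0], it.keys[it.pos]...)
	return true
}

// UnsafeKey returns the internal buffer without copying.
func (it *reusingIter) UnsafeKey() []byte { return it.buf }

// cloneKey allocates a fresh copy, analogous to calling Clone() on a key.
func cloneKey(k []byte) []byte { return append([]byte(nil), k...) }

// demo reads the first key three ways, advances the iterator, and reports
// what each of the three values looks like afterwards.
func demo() (aliased, cloned, reused string) {
	it := newReusingIter("apple", "berry")
	var scratch []byte // caller-owned buffer, reusable across iterations

	it.Next()
	unsafeKey := it.UnsafeKey()                 // aliases the iterator's buffer
	clonedKey := cloneKey(unsafeKey)            // new allocation per key
	scratch = append(scratch[:0], unsafeKey...) // copy into reused buffer

	it.Next() // overwrites the iterator's internal buffer
	return string(unsafeKey), string(clonedKey), string(scratch)
}
```

After the second `Next()`, the aliased slice reads as the second key, while both copies still hold the first. The clone is simplest when the key must outlive the loop; the scratch buffer avoids an allocation per key when the copy is only needed until the next iteration.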

jbowens · 2023-03-13T20:23:43Z

@itsbilal: this might be worth tackling as part of simplifying CheckSSTConflicts during the 23.2 development cycle. There's been at least one bug from accidentally omitting the Valid() call, and all the boilerplate of those Valid() calls really hampers readability.

The MVCCIterator interface previously exposed two methods for accessing the current iterator position as an MVCC key: UnsafeKey and Key. Key() was equivalent to calling UnsafeKey().Clone(). This commit removes the Key() variant, pushing the onus of key copying onto the caller. This reduces the interface surface area, avoids accidental key copying (some of which is addressed within this commit), and does not impose any unreasonable burden on callers. Epic: None Informs cockroachdb#82589. Release note: None

95789: pkg/util/log: don't falsify tenant ID tag in logs if none in ctx r=andreimatei a=abarganier

Previously, I made the decision to always tag a log entry with a tenant ID, even if no tenant ID was found in the context associated with the log entry. In this case, the system tenant ID was used in the tag, instead of omitting a tenant ID tag altogether.

I received some feedback that this is confusing. For example, imagine testing a feature, expecting log entries to come from a secondary tenant, and the context being used in that feature is not annotated with a tenant ID. With the previous behavior, the log entry would default to being tagged with the system tenant ID instead of having empty tags (or at least, no tenant ID tag). In this scenario, how do I tell the actual state of the log entry? Did the log entry indeed come from a goroutine belonging to the system tenant? Or was the context just missing the tenant ID annotation, but otherwise came from the correct tenant?

This ambiguity is not helpful. By falsifying a tenant ID tag we confuse the log reader about the actual state of the system. Furthermore, our eventual goal should be that almost no context objects in the system exist without a tenant ID (except for perhaps at startup before tenant initialization). Tagging with the system tenant ID in the case of a missing tenant ID annotation in the context makes it difficult to track down offending context objects.

This patch removes this default behavior from the logging package. Now, if no tenant ID is found in the context, we do not tag the entry with a tenant ID. Note however that on the *decode* side, we will maintain this default tenant ID tagging behavior. If a log entry does not have a tenant ID tag, then we must assume that only the system tenant has privilege to view said log entry, since the owner is ambiguous.

Release note: none

Epic CRDB-14486

98175: cdc: show all changefeed jobs in `SHOW CHANGEFEED JOBS` r=HonoreDB a=jayshrivastava

### cdc: show all changefeed jobs in SHOW CHANGEFEED JOBS

Release note (general change): Previously, the output of `SHOW CHANGEFEED JOBS` was limited to show unfinished jobs and finished jobs from the last 14 days. This change makes the command show all changefeed jobs, regardless of if they finished and when they finished. Note that jobs still obey the cluster setting `jobs.retention_time`. Completed jobs older than that time are deleted.

Fixes: #97883

### jobs: add virtual index for job_type in crdb_internal.jobs

This change adds a virtual index on the `job_type` column of `crdb_internal.jobs`. This change should make queries on that table which filter on job type (such as `SHOW CHANGEFEED JOBS`) more efficient.

Release note: None

Epic: None

98515: kvserver: deflake test store capacity after split r=andrewbaptist a=kvoli

This commit deflakes `TestStoreCapacityAfterSplit`. Previously it was possible for the replica load stats which underpins Capacity to be reset. The reset caused the recording duration to fall short of min stats duration, which led to a 0 value being reported for writes in store capacity.

This commit bumps the manual clock twice and removes redundant leaseholder checks within a retry loop. The combination of these two changes makes the test much less likely to flake.

The test is now unskipped.

```
dev test pkg/kv/kvserver -f TestStoreCapacityAfterSplit -v --stress
...
4410 runs so far, 0 failures, over 6m10s
```

Resolves: #92677

Release note: None

98521: ui: don't continue polling endpoints that return 403 errors r=dhartunian a=abarganier

It was brought to our attention that endpoints such as `v1/settings` would continue to be polled by DB Console even if they returned 403 errors. If an endpoint returns 403 errors, we should not continue to poll it since the required access is not present for the current user.

This patch updates the polling mechanism to short-circuit the `refresh` process if a 403 error is encountered throughout the lifecycle of the poller.

Release note: none

Fixes: #98356

98536: kvserver: deflake learner joint cfg relocate range r=andrewbaptist a=kvoli

Previously, in `TestLearnerOrJointConfigAdminRelocateRange` it was possible for there to be an in-flight snapshot towards a learner, prior to sending `AdminRelocateRange` command. When this occurred, the test would fail as `AdminRelocateRange` returns an error when finding any in-flight snapshots to learners. This situation occurred infrequently, causing the test to flake.

This commit updates the `TestLearnerOrJointConfigAdminRelocateRange` test to first assert that there are the expected number of learners, then assert that there are no in-flight snapshots towards learners before beginning the main testing logic.

The test is now unskipped.

```
dev test pkg/kv/kvserver \
  -f TestLearnerOrJointConfigAdminRelocateRange \
  -v --stress
...
5652 runs so far, 0 failures, over 12m30s
```

Resolves: #95500

Release note: None

98542: storage: remove MVCCIterator.Key method r=jbowens a=jbowens

The MVCCIterator interface previously exposed two methods for accessing the current iterator position as an MVCC key: UnsafeKey and Key. Key() was equivalent to UnsafeKey().Clone(). This commit removes the Key() variant, pushing the onus of key copying onto the caller. This reduces the interface surface area, avoids accidental key copying (some of which is addressed within this commit), and does not impose any unreasonable burden on callers.

Epic: None

Informs #82589.

Release note: None

98543: allocator: fix lease io enforcement setting typo r=andrewbaptist a=kvoli

This commit updates the "do nothing" lease IO overload enforcement (`kv.allocator.lease_io_overload_threshold_enforcement`) setting to be correctly spelled "ignore" rather than "ingore".

Part of: #96508

Release note (ops change): The `kv.allocator.lease_io_overload_threshold_enforcement` setting value which disables enforcement is updated to be spelled correctly as "ignore" rather than "ingore".

98600: server: change conn close error to warning r=knz,abarganier a=dhartunian

Resolves: #98523

Epic: None

Release note: None

Co-authored-by: Alex Barganier
Co-authored-by: Jayant Shrivastava
Co-authored-by: Austen McClernon
Co-authored-by: Jackson Owens
Co-authored-by: David Hartunian

Adapt the MVCCIterator and its sub-interface SimpleMVCCIterator, changing the signature of all positioning methods. Positioning methods now return a boolean indicating whether the iterator is now positioned at a valid key, and any error encountered during positioning. This commit only updates the interface and implementations. Subsequent work will update clients to use the new interface and remove Valid. Epic: None Informs: cockroachdb#82589 Release note: None
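As a sketch of what this transitional state could look like to a client, the toy iterator below offers both the old Valid() method and positioning methods that return validity and error. It is not the code from the PR: `transitionalIter`, its `failAt` error injection, `errCorrupt`, and the two counting helpers are hypothetical and exist only to compare the two calling styles.

```go
package main

import "errors"

// errCorrupt is an injected error used to simulate a positioning failure.
var errCorrupt = errors.New("corrupt block")

// transitionalIter is a toy iterator with both calling conventions: the old
// Valid() method, and positioning methods that return (bool, error).
type transitionalIter struct {
	keys   []string
	failAt int // index at which positioning fails; -1 means never
	pos    int
	err    error
}

// Valid reports whether the iterator is positioned at a key, and any error.
func (it *transitionalIter) Valid() (bool, error) {
	if it.err != nil {
		return false, it.err
	}
	return it.pos < len(it.keys), nil
}

// settle records an injected failure, then reports the resulting state.
func (it *transitionalIter) settle() (bool, error) {
	if it.pos == it.failAt {
		it.err = errCorrupt
	}
	return it.Valid()
}

func (it *transitionalIter) First() (bool, error) {
	it.pos, it.err = 0, nil
	return it.settle()
}

func (it *transitionalIter) Next() (bool, error) {
	it.pos++
	return it.settle()
}

// countOld uses the old style: ignore the positioning results and call
// Valid() before every use. Forgetting that call is easy to miss in review.
func countOld(it *transitionalIter) (int, error) {
	n := 0
	it.First()
	for {
		if ok, err := it.Valid(); err != nil {
			return n, err
		} else if !ok {
			return n, nil
		}
		n++
		it.Next()
	}
}

// countNew uses the new style: validity and error come straight from the
// positioning call, so the loop header carries the whole protocol.
func countNew(it *transitionalIter) (int, error) {
	n := 0
	for ok, err := it.First(); ok || err != nil; ok, err = it.Next() {
		if err != nil {
			return n, err
		}
		n++
	}
	return n, nil
}
```

Both helpers return the same count and error for the same input, which is what lets clients migrate one at a time while Valid() still exists. Once no client uses the old style, Valid() can be removed.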

erikgrinaker added C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. A-storage Relating to our storage engine (Pebble) on-disk storage. T-kv-replication labels Jun 8, 2022

blathers-crl bot added the A-kv-replication Relating to Raft, consensus, and coordination. label Jun 8, 2022

erikgrinaker added the C-performance Perf of queries or internals. Solution not expected to change functional behavior. label Jun 8, 2022

This was referenced Jun 16, 2022

storage: add range key support for MVCCIncrementalIterator #82691

Merged

storage: optimize range key iteration #83049

Closed

nicktrav added the T-storage Storage Team label Jul 13, 2022

exalate-issue-sync bot removed the T-kv-replication label Aug 3, 2022

erikgrinaker mentioned this issue Nov 21, 2022

kvserver,raftlog: dedup Raft log iteration #92143

Merged

jbowens mentioned this issue Mar 13, 2023

storage: remove MVCCIterator.Key method #98542

Merged

erikgrinaker removed the A-kv-replication Relating to Raft, consensus, and coordination. label Mar 14, 2023

jbowens mentioned this issue Apr 4, 2023

storage: adapt MVCCIterator interface to return validity and error #100635

Draft

jbowens added this to [Deprecated] Storage Jun 4, 2024

jbowens moved this to 24.2 candidates in [Deprecated] Storage Jun 4, 2024
