storage: fix (the semantics of) MVCCStats.GCBytesAge
The semantics for computing GCBytesAge were incorrect and are fixed in this commit. Prior to this commit, a non-live write would accrue GCBytesAge from its own timestamp on. That is, if you wrote two versions of a key at 1s and 2s, then when the older version is replaced (at 2s) it would start out with one second of age (from 1s to 2s). However, the key *really* became non-live at 2s, and should have had an age of zero.

By extension, imagine a table with lots of writes all dating back to early 2017, and assume that today (early 2018) all these writes are deleted (i.e. a tombstone is placed on top of them). Prior to this commit, each key would immediately get assigned an age of `(early 2018) - (early 2017)`, i.e. a very large number. Yet, the GC queue could only garbage collect them after `(early 2018) + TTL`, so by default 25 hours after the deletion. We use GCBytesAge to trigger the GC queue, so the GC queue would run repeatedly for the duration of the TTL without being able to remove anything. This was a big problem bound to be noticed by users.

This commit changes the semantics to what the GCQueue (and the layman) expects:

1. when a version is shadowed, it becomes non-live at that point and also starts accruing GCBytesAge from that point on.
2. deletion tombstones are an exception: they accrue age from their own timestamp on. This makes sense because a tombstone can be deleted whenever it's older than the TTL (as opposed to a value, which can only be deleted once it's been *shadowed* for longer than the TTL).

This work started out by updating `ComputeStatsGo` to have the desired semantics, fixing up existing tests, and then stress testing `TestMVCCStatsRandomized` with short history lengths to discover failure modes, which were then transcribed into small unit tests. When no more such failures were discoverable, the resulting logic in the various incremental MVCCStats update helpers was simplified and documented, and `ComputeStats` was updated accordingly.
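As a rough illustration of the old and new semantics described above, here is a minimal sketch in Go. It is not the code from this commit: the `version` type and the functions `gcBytesAgeOld` and `gcBytesAgeNew` are hypothetical names invented for this example, timestamps are whole seconds, and each version is reduced to a single byte count (the real stats account for key and value bytes separately).

```go
package main

import "fmt"

// version is a hypothetical, simplified MVCC version of a single key.
// It is not a CockroachDB type.
type version struct {
	ts        int64 // write timestamp, in seconds
	bytes     int64 // size attributed to this version
	tombstone bool  // true if this version is a deletion tombstone
}

// gcBytesAgeOld sketches the semantics before this commit: every non-live
// version accrues age from its own timestamp. history is ordered from
// oldest to newest.
func gcBytesAgeOld(history []version, now int64) int64 {
	var age int64
	for i, v := range history {
		live := i == len(history)-1 && !v.tombstone
		if !live {
			age += v.bytes * (now - v.ts)
		}
	}
	return age
}

// gcBytesAgeNew sketches the semantics after this commit: a value accrues
// age from the timestamp of the version that shadows it, while a tombstone
// accrues age from its own timestamp.
func gcBytesAgeNew(history []version, now int64) int64 {
	var age int64
	for i, v := range history {
		switch {
		case v.tombstone:
			age += v.bytes * (now - v.ts)
		case i < len(history)-1:
			// Shadowed value: non-live since the next version was written.
			age += v.bytes * (now - history[i+1].ts)
		}
	}
	return age
}

func main() {
	// Two versions of a key written at 1s and 2s, evaluated at 2s.
	overwrite := []version{{ts: 1, bytes: 10}, {ts: 2, bytes: 10}}
	fmt.Println(gcBytesAgeOld(overwrite, 2)) // 10: one second of age
	fmt.Println(gcBytesAgeNew(overwrite, 2)) // 0: only just shadowed

	// An old value deleted by a tombstone at 100s, evaluated at 100s.
	deleted := []version{{ts: 0, bytes: 10}, {ts: 100, bytes: 2, tombstone: true}}
	fmt.Println(gcBytesAgeOld(deleted, 100)) // 1000: large age right away
	fmt.Println(gcBytesAgeNew(deleted, 100)) // 0: nothing is collectible yet
}
```

In the second case the old computation reports a large age at the moment of deletion, which is the situation that caused the GC queue to run without being able to remove anything.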
`TestMVCCStatsBasic` was removed as part of this work: it was notoriously hard to read and maintain, and no longer adds any coverage.

The recomputation of the stats in existing clusters is addressed in cockroachdb#21345.

Fixes cockroachdb#20554.

Release note (bug fix): fix a problem that could cause spurious GC activity, in particular after dropping a table.