Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

batcheval: AddSSTable does not adjust GCBytesAge #82920

Closed
erikgrinaker opened this issue Jun 15, 2022 · 3 comments · Fixed by #87303 or #88578
Closed

batcheval: AddSSTable does not adjust GCBytesAge #82920

erikgrinaker opened this issue Jun 15, 2022 · 3 comments · Fixed by #87303 or #88578
Assignees
Labels
A-kv-replication Relating to Raft, consensus, and coordination. A-storage Relating to our storage engine (Pebble) on-disk storage. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-storage Storage Team

Comments

@erikgrinaker
Copy link
Contributor

erikgrinaker commented Jun 15, 2022

When AddSSTable checks for conflicts or shadowing via CheckSSTConflicts, it is supposed to accurately adjust the SST's MVCCStats to reflect the effect it has on the existing data. However, it makes no attempt at all to adjust GCBytesAge, leading to incorrect stats.

To see this, change e.g. TestEvalAddSSTable to use timestamps that are larger than 1 second, i.e. 1 billion nanoseconds -- so use 1e9 instead of 1 in timestamps -- and pass through the request timestamp to all stats calculations. The stats assertions will then immediately fail. This is because GCBytesAge computations only care about whole seconds, rounded down, so timestamps less than 1e9 will always result in 0 GCBytesAge.

There is some draft work in this branch: https://github.com/erikgrinaker/cockroach/tree/addsstable-stats

There are several scenarios that need handling:

  • The SST shadows an existing value.
  • An existing value shadows the SST value.
  • A value in the SST shadows another value in the SST.
  • All of the above, but with tombstones rather than values.

A fix for the simplest case is e.g.:

https://github.com/erikgrinaker/cockroach/blob/5c20fd3ea286fee34df577c2d92ef62c60e2b4fa/pkg/storage/sst.go#L197-L215

Jira issue: CRDB-16730

Epic CRDB-2624

@erikgrinaker erikgrinaker added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-kv-replication labels Jun 15, 2022
@blathers-crl
Copy link

blathers-crl bot commented Jun 15, 2022

cc @cockroachdb/replication

@blathers-crl blathers-crl bot added the A-kv-replication Relating to Raft, consensus, and coordination. label Jun 15, 2022
craig bot pushed a commit that referenced this issue Jun 16, 2022
82925: batcheval: allow writing tombstones in `AddSSTable` r=aliher1911 a=erikgrinaker

This patch allows writing MVCC point tombstones in `AddSSTable`, which
will be needed by cluster-to-cluster replication. Previously, SSTs
containing tombstones could error or be silently accepted. The version
gate `AddSSTableTombstones` has been added to allow clients to rely on
this behavior.

Note that `AddSSTable` has never made any attempt at correctly
calculating `MVCCStats.GCBytesAge` when shadowing existing data, and
this patch does not attempt to either. This will be addressed later.

Resolves #82922.
Touches #82920.

Release note: None

82952: builtins: add pg_backend_pid() r=otan a=rafiss

fixes #82564

Release note (sql change): Updated the pg_backend_pid() builtin function
so it matches with the data in the query cancellation key created during
session initialization. The function is just for compatibility, and it
does not return a real process ID.

Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
lunevalex added a commit to lunevalex/cockroach that referenced this issue Jun 28, 2022
Touched cockroachdb#82920

There is at least one known issue in MVCC stats calculation and
there maybe more. This could lead to the MVCC GC Queue spinning on
ranges with bad stats. To prevent the queue from spinning it should
recompute the stats if it detects that they are wrong. The easiest
mechanism to do that is to check if the GC score wants to queue this
range again after finishing GC, if it does it likely indicates something
fishy with the stats.

Release note: None
@exalate-issue-sync exalate-issue-sync bot added T-storage Storage Team and removed T-kv-replication labels Jul 6, 2022
@blathers-crl blathers-crl bot added the A-storage Relating to our storage engine (Pebble) on-disk storage. label Jul 6, 2022
lunevalex added a commit to lunevalex/cockroach that referenced this issue Jul 10, 2022
Touched cockroachdb#82920

There is at least one known issue in MVCC stats calculation and
there maybe more. This could lead to the MVCC GC Queue spinning on
ranges with bad stats. To prevent the queue from spinning it should
recompute the stats if it detects that they are wrong. The easiest
mechanism to do that is to check if the GC score wants to queue this
range again after finishing GC, if it does it likely indicates something
fishy with the stats.

Release note: None
lunevalex added a commit to lunevalex/cockroach that referenced this issue Jul 21, 2022
Touched cockroachdb#82920

There is at least one known issue in MVCC stats calculation and
there maybe more. This could lead to the MVCC GC Queue spinning on
ranges with bad stats. To prevent the queue from spinning it should
recompute the stats if it detects that they are wrong. The easiest
mechanism to do that is to check if the GC score wants to queue this
range again after finishing GC, if it does it likely indicates something
fishy with the stats.

Release note: Change the MVCC GC queue to recompute MVCC stats on a
range, if after doing a GC run it still thinks there is garbage in
the range.
lunevalex added a commit to lunevalex/cockroach that referenced this issue Jul 21, 2022
Touched cockroachdb#82920

There is at least one known issue in MVCC stats calculation and
there maybe more. This could lead to the MVCC GC Queue spinning on
ranges with bad stats. To prevent the queue from spinning it should
recompute the stats if it detects that they are wrong. The easiest
mechanism to do that is to check if the GC score wants to queue this
range again after finishing GC, if it does it likely indicates something
fishy with the stats.

Release note: Change the MVCC GC queue to recompute MVCC stats on a
range, if after doing a GC run it still thinks there is garbage in
the range.
craig bot pushed a commit that referenced this issue Jul 22, 2022
83194: kvserver: recompute stats after mvcc gc r=lunevalex a=lunevalex

Touched #82920

There is at least one known issue in MVCC stats calculation and
there maybe more. This could lead to the MVCC GC Queue spinning on
ranges with bad stats. To prevent the queue from spinning it should
recompute the stats if it detects that they are wrong. The easiest
mechanism to do that is to check if the GC score wants to queue this
range again after finishing GC, if it does it likely indicates something
fishy with the stats.

Release note: Change the MVCC GC queue to recompute MVCC stats on a
range, if after doing a GC run it still thinks there is garbage in
the range.

84922: bazel: bump size of `gc` test r=jlinder a=rickystewart

This has timed out in CI.

Release note: None

Co-authored-by: Alex Lunev <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
itsbilal added a commit to itsbilal/cockroach that referenced this issue Sep 1, 2022
Currently, the stats calculations in `AddSSTable` and `CheckSSTConflicts`
does not adjust for any GCBytesAge differences that arise from sst keys
deleting or shadowing engine keys. This results in mismatching GCBytesAge
values if we were to run the existing TestEvalAddSSTable test with
sufficiently-large "now" timestamps to accrue GCBytesAge.

This change updates TestEvalAddSSTable to multiply each timestamp
with 1e9 so that it operates in full-second increments to see more
interesting GCBytesAge behaviour. It also updates the code path
where we mass-update SST timestamps to make the corresponding
GCBytesAge adjustment. Finally, it ensures that GCBytesAge
is accrued from the right timestamps in cases of sst keys
/ tombstones shadowing engine keys / tombstones.

Note that this change only fixes GCBytesAge for point keys;
a fix for range keys will come in a follow-up.

Fixes cockroachdb#82920.

Release note: None.
craig bot pushed a commit that referenced this issue Sep 8, 2022
87303: kvserver,storage: calculate GCBytesAge in AddSSTable for points r=erikginaker a=itsbilal

Currently, the stats calculations in `AddSSTable` and `CheckSSTConflicts`
does not adjust for any GCBytesAge differences that arise from sst keys
deleting or shadowing engine keys. This results in mismatching GCBytesAge
values if we were to run the existing TestEvalAddSSTable test with
sufficiently-large "now" timestamps to accrue GCBytesAge.

This change updates TestEvalAddSSTable to multiply each timestamp
with 1e9 so that it operates in full-second increments to see more
interesting GCBytesAge behaviour. It also updates the code path
where we mass-update SST timestamps to make the corresponding
GCBytesAge adjustment. Finally, it ensures that GCBytesAge
is accrued from the right timestamps in cases of sst keys
/ tombstones shadowing engine keys / tombstones.

Note that this change only fixes GCBytesAge for point keys;
a fix for range keys will come in a follow-up.

Fixes #82920.

Release note: None.

Co-authored-by: Bilal Akhtar <[email protected]>
@craig craig bot closed this as completed in 4f3d318 Sep 8, 2022
@erikgrinaker
Copy link
Contributor Author

@itsbilal We should address GCBytesAge for range keys as well. Should we reopen this issue for that, or alternatively create a new one?

jbowens pushed a commit to jbowens/cockroach that referenced this issue Sep 12, 2022
Currently, the stats calculations in `AddSSTable` and `CheckSSTConflicts`
does not adjust for any GCBytesAge differences that arise from sst keys
deleting or shadowing engine keys. This results in mismatching GCBytesAge
values if we were to run the existing TestEvalAddSSTable test with
sufficiently-large "now" timestamps to accrue GCBytesAge.

This change updates TestEvalAddSSTable to multiply each timestamp
with 1e9 so that it operates in full-second increments to see more
interesting GCBytesAge behaviour. It also updates the code path
where we mass-update SST timestamps to make the corresponding
GCBytesAge adjustment. Finally, it ensures that GCBytesAge
is accrued from the right timestamps in cases of sst keys
/ tombstones shadowing engine keys / tombstones.

Note that this change only fixes GCBytesAge for point keys;
a fix for range keys will come in a follow-up.

Fixes cockroachdb#82920.

Release note: None.

Release justification:
@itsbilal
Copy link
Member

@erikgrinaker I'll just re-reference the same issue for now, so let's reopen it for range keys.

@itsbilal itsbilal reopened this Sep 23, 2022
itsbilal added a commit to itsbilal/cockroach that referenced this issue Sep 23, 2022
Change cockroachdb#87303 fixed GCBytesAge in `CheckSSTConflicts` for
points, but not for range keys. This change fixes it for
range keys too, by ensuring the changes to statsDiff happen
at the right timestamp using statsDiff.AgeTo.

Fixes cockroachdb#82920.

Release note: None.
itsbilal added a commit to itsbilal/cockroach that referenced this issue Sep 26, 2022
Change cockroachdb#87303 fixed GCBytesAge in `CheckSSTConflicts` for
points, but not for range keys. This change fixes it for
range keys too, by ensuring the changes to statsDiff happen
at the right timestamp using statsDiff.AgeTo.

Fixes cockroachdb#82920.

Release note: None.
itsbilal added a commit to itsbilal/cockroach that referenced this issue Sep 26, 2022
Change cockroachdb#87303 fixed GCBytesAge in `CheckSSTConflicts` for
points, but not for range keys. This change fixes it for
range keys too, by ensuring the changes to statsDiff happen
at the right timestamp using statsDiff.AgeTo.

Fixes cockroachdb#82920.

Release note: None.
craig bot pushed a commit that referenced this issue Sep 27, 2022
88578: storage: Fix GCBytesAge in CheckSSTConflicts with range keys r=erikgrinaker a=itsbilal

Change #87303 fixed GCBytesAge in `CheckSSTConflicts` for points, but not for range keys. This change fixes it for range keys too, by ensuring the changes to statsDiff happen at the right timestamp using statsDiff.AgeTo.

Fixes #82920.

Release note: None.

88736: Unlink func_as and replace with SCONST r=chengxiong-ruan a=stbof

Replacement needed to prevent missing `func_as` target in SQL grammar.

Release note: None
Release justification: low-risk bugfix. Required to build UDF docs.

Co-authored-by: Bilal Akhtar <[email protected]>
Co-authored-by: Stephanie Bodoff <[email protected]>
blathers-crl bot pushed a commit that referenced this issue Sep 27, 2022
Change #87303 fixed GCBytesAge in `CheckSSTConflicts` for
points, but not for range keys. This change fixes it for
range keys too, by ensuring the changes to statsDiff happen
at the right timestamp using statsDiff.AgeTo.

Fixes #82920.

Release note: None.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-kv-replication Relating to Raft, consensus, and coordination. A-storage Relating to our storage engine (Pebble) on-disk storage. C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. T-storage Storage Team
Projects
None yet
2 participants