storage: improve visibility into low-level Pebble internal keys #94659
Labels
A-kv-observability
A-observability-inf
A-storage
Relating to our storage engine (Pebble) on-disk storage.
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-storage
Storage Team
Comments
jbowens added the C-enhancement, A-storage, and T-storage labels on Jan 3, 2023.
We could also make it MVCC and gc-threshold aware and additionally count the bytes in older versions (redundant with |
jbowens added commits to jbowens/cockroach that referenced this issue on Jan 4, Jan 5, and Jan 6, 2023, each with the same message:
Document the [Internal]{Step,Seek}Count fields within ExecutionStats. These statistics are low-level statistics, that while very useful are full of nuance. They require some care to interpret appropriately. Future work (cockroachdb#94659) will improve the observability here. Epic: None Release note: None Informs cockroachdb#94659.
craig bot pushed a commit that referenced this issue on Jan 6, 2023:
94708: sql: document storage ExecutionStats fields r=jbowens a=jbowens Document the [Internal]{Step,Seek}Count fields within ExecutionStats. These statistics are low-level statistics, that while very useful are full of nuance. They require some care to interpret appropriately. Future work (#94659) will improve the observability here. Epic: None Release note: None Close #94665. Informs #94659. Co-authored-by: Jackson Owens <[email protected]>
raggar added commits to raggar/cockroach that referenced this issue on Jul 10 and Jul 11, 2023, each with the same message:
…_metrics()` Added a column to indicate how many bytes the given keyspan makes up. Informs: cockroachdb#94659 Release note: None
craig bot pushed a commit that referenced this issue on Jul 11, 2023:

106527: builtins: Added `ApproximateSpanBytes` column r=RahulAggarwal1016 a=RahulAggarwal1016

Added a column (`ApproximateSpanBytes`) to the `crdb_internal.sstable_metrics()` builtin generator function to display the number of bytes that the given user key span overlaps with.

Informs: #94659
Release note: None

106600: dev: use `bazel test` to run acceptance tests, not `bazel run` r=rail a=rickystewart

This is probably some oversight or historical artifact, but `bazel test` is conceptually correct and what we should be using in CI.

Epic: CRDB-17171
Release note: None

Co-authored-by: Rahul Aggarwal <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
raggar pushed commits to raggar/cockroach that referenced this issue on Jul 28, Jul 31 (three times), Aug 7, and Aug 10 (three times), 2023, each with the same message:
This new builtin is used to gather specific pebble metrics for a node and store id (within a given keyspan). The builtin returns information about the different types of keys (including snapshot pinned keys) as well as bytes. Informs: cockroachdb#94659 Release note: None
craig bot pushed a commit that referenced this issue on Aug 10, 2023:

106525: sql: add trie tree based workload index recommendations r=qiyanghe1998 a=qiyanghe1998

#### sql: add trie tree based workload index recommendations

This commit adds the trie and the logic for getting the workload index recommendations. In addition, it fills the gap between built-in functions and backend implementation for workload index recommendations.

The whole process consists of collecting candidates and finding representative indexes. All the index recommendations in the table `system.statement_statistics` (satisfying some time requirement) will be collected as the candidates and then inserted into the trie.

The trie is designed for all the indexes of one table. The indexed columns will be regarded as the key to insert into the tree in their original orders. The storing part will be attached to the node after the insertion of indexed columns.

The general idea of finding representative indexes is to use all the indexes represented by the leaf nodes. One optimization is to remove the storings that are covered by some leaf nodes. Next, we will push down all the storings attached to the internal nodes to the shallowest leaf nodes (You can find the reasons in RFC). Finally, all the indexes represented by the leaf nodes will be returned.

As for the `DROP INDEX`, since we collect all the indexes represented by the leaf nodes (a superset of dropped indexes), we can directly drop all of them.

Release note (sql change): new builtin functions `workload_index_recs()` and `workload_index_recs(timestamptz)` return workload level index recommendations (columns of string, each string represents an index recommendation) from statement level index recommendations (as candidates) in `system.statement_statistics`. If the timestamptz is given, it will only consider candidates generated after that timestamptz.

Epic: None

107743: sql: Add new builtin generator function `crdb_internal.scan_storage_internal_keys()` r=RahulAggarwal1016 a=RahulAggarwal1016

This new builtin is used to gather specific pebble metrics for a node and store id (within a given keyspan). The builtin returns information about the different types of keys (including snapshot pinned keys) and bytes.

Informs: #94659
Release-note: None

108516: backupccl: fix error message for descriptor version mismatch r=adityamaru a=renatolabs

Expected and actual versions were swapped.

Epic: none
Release note: None

108520: backupccl: remove spurious print line in test r=msbutler a=msbutler

Print statements should not be in committed code

Epic: none
Release note: none

108531: roachtest: Ensure tpcc workloads runs for a bit r=miretskiy a=miretskiy

An issue in roachtest #108530 prevents clean test termination when calling Wait() on a test monitor that did not have at least 1 task started. This causes the `cdc/kafka-oauth` test to hang. Add a '30s' duration to the tpcc task to go around this problem.

Fixes #108507
Release note: None

Co-authored-by: qiyanghe1998 <[email protected]>
Co-authored-by: craig[bot] <[email protected]>
Co-authored-by: Renato Costa <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: Yevgeniy Miretskiy <[email protected]>
We surface Pebble Iterator stats in traces to help diagnose slow scans. A large difference between external and internal steps indicates that the iterator observes many more internal keys than visible keys. However, the stats do not show which internal keys exist or why. There can be many reasons for duplicate internal keys (e.g., necessary compactions not being scheduled, snapshots pinning shadowed keys, visibility filtering). During these incidents, we may compact a key range on the suspicion that it contains uncompacted point tombstones, but this is slow and doesn't provide much low-level visibility into the exact cause.
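To make the gap between external and internal steps concrete, here is a minimal sketch of a scan over versioned keys. It is a toy model, not Pebble's iterator: the `internalKey` and `scanStats` types, the two key kinds, and the `scan` function are hypothetical, and real step and seek counts carry more nuance than this simple counter.

```go
package main

import "fmt"

// kind is a simplified, hypothetical stand-in for an internal key kind.
type kind int

const (
	kindSet kind = iota
	kindDelete
)

// internalKey is a hypothetical internal key: a user key plus a sequence
// number and a kind.
type internalKey struct {
	userKey string
	seqNum  uint64
	kind    kind
}

// scanStats mirrors the idea of external vs. internal step counts.
type scanStats struct {
	externalSteps int // keys surfaced to the caller
	internalSteps int // internal keys the scan had to examine
}

// scan walks keys, which must be sorted by userKey ascending and then by
// seqNum descending (newest first). For each user key it surfaces only the
// newest version, and only if that version is not a tombstone.
func scan(keys []internalKey) (visible []string, stats scanStats) {
	for i := 0; i < len(keys); {
		newest := keys[i]
		// Step over every version of this user key, counting each one.
		j := i
		for j < len(keys) && keys[j].userKey == newest.userKey {
			stats.internalSteps++
			j++
		}
		if newest.kind == kindSet {
			visible = append(visible, newest.userKey)
			stats.externalSteps++
		}
		i = j
	}
	return visible, stats
}

func main() {
	keys := []internalKey{
		{"a", 9, kindDelete}, {"a", 4, kindSet}, // deleted: tombstone shadows the set
		{"b", 7, kindSet}, {"b", 5, kindSet}, {"b", 2, kindSet}, // two shadowed versions
		{"c", 8, kindDelete}, // lone tombstone
		{"d", 3, kindSet},
	}
	visible, stats := scan(keys)
	fmt.Printf("visible=%v external=%d internal=%d\n",
		visible, stats.externalSteps, stats.internalSteps)
	// prints: visible=[b d] external=2 internal=7
}
```

The two counters alone say that seven internal keys were examined to return two, but not whether the extra five were tombstones or shadowed versions, which is the visibility gap this issue describes.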
@sumeerbhola suggested a facility to scan a KV range on a store and report back statistics. A scan of a single 512MB KV range should not be prohibitively expensive, and even if throttled to a few MB/s, should complete within a few minutes. See cockroachdb/pebble#1996, a Pebble issue filed over a similar lack of observability.
A purpose-built scan operation could:
Jira issue: CRDB-23069
Epic: CRDB-26603