-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kv: return information about number of keys/bytes processed #57322
Comments
+1 cc @tbg I think there was some talk that this might be added this release |
We didn't commit to it, but this is something that would be reasonably easy with always-on tracing in place and that is how I thought we would implement it. All we need to do is to make a proto, populate the proto (when there's a threshold of tombstones skipped, or some other condition is met that avoids this happening in the hot path), and add it to the trace. The SQL layer can fish it out. Btw, we had this (using a more ad-hoc mechanism) in the RocksDB days using their perf counters and then printing something into the trace. It was dirt slow for some reason, and probably that reason has gone away now. |
From @tbg: We can do this now.
|
hey @jordanlewis doing some Friday triage and noticed you have an active PR for this so moved it over to SQL Queries. Let us know if you need more input from us and we can move it back / collab |
Hey @mwang1026 I'm moving it back to KV - I do have a PR open about it but it's the observability side for steps/seeks, which Sumeer provided on behalf of storage. We still would need information from KV about the number of keys/bytes processed to display that information. |
64503: sql: add MVCC steps and seeks to EXPLAIN ANALYZE r=jordanlewis a=jordanlewis Updates #57322. It doesn't track number of bytes scanned over, not sure how to efficiently do that. This commit adds 4 new fields to EXPLAIN ANALYZE for each operator that reads data from disk: MVCC steps and seeks, as well as MVCC "internal" steps and seeks when verbose mode is toggled. MVCC steps is the number of times that the underlying storage iterator stepped forward during the work to serve the operator's reads, including stepping over MVCC keys that were too old for user in a scan. MVCC seeks is the number of times that the underlying storage iterator jumped (seeked) to a different data location. Seeks are more expensive than steps. Comparing MVCC steps to KVs read helps indicate to a user when a scan might be slower than expected due to MVCC garbage in the middle of the keyspace being scanned. A high number of MVCC seeks might also indicate a lot of skipped MVCC garbage, especially when doing a sequential scan. Release note (sql change): EXPLAIN ANALYZE now contains more information about the MVCC behavior of operators that scan data from disk. See commit message for more details. 70039: ui: update clear sql stats link r=maryliag a=maryliag Update "clear SQL stats" on Statements and Transactions page to match remaining action links (all lower case and right blue tone" Before <img width="329" alt="Screen Shot 2021-09-10 at 12 43 54 PM" src="https://user-images.githubusercontent.com/1017486/132888427-9b9cba26-0c9c-4895-a8e8-9da2fd126ea7.png"> After <img width="358" alt="Screen Shot 2021-09-10 at 12 42 55 PM" src="https://user-images.githubusercontent.com/1017486/132888444-aa3872e4-b626-4225-823e-6b986cda010c.png"> hover <img width="243" alt="Screen Shot 2021-09-10 at 12 45 06 PM" src="https://user-images.githubusercontent.com/1017486/132888587-9b8cba3f-8c29-45e1-bbec-ba61ddc39115.png"> Release justification: Category 4 Release note (ui change): Update 'clear SQL stats' color and change to lower case on Transaction and Statements page Co-authored-by: Jordan Lewis <[email protected]> Co-authored-by: Marylia Gutierrez <[email protected]>
We have marked this issue as stale because it has been inactive for |
We have had various situations in which a scan did not return a lot of rows but was very slow, and the reason was because there were a lot of "invisible" keys with older MVCC values. Currently, there is no way for the SQL layer to obtain information about this (and show it with EXPLAIN ANALYZE).
I propose that KV returns (in
ScanResponse
) information about the number of underlying keys (and total bytes) that had to be processed to satisfy aScanRequest
. If collecting this information comes with performance overhead, we can add a flag inScanRequest
to only request it when necessary. This may apply toGet
as well.CC @asubiotto
Jira issue: CRDB-2835
The text was updated successfully, but these errors were encountered: