-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql/opt: add telemetry for statistics forecast usage #86356
Labels
A-sql-logging-and-telemetry
Issues related to slow query log, SQL audit log, SQL internal logging telemetry, etc.
A-sql-optimizer
SQL logical planning and optimizations.
A-sql-table-stats
Table statistics (and their automatic refresh).
A-telemetry
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-sql-queries
SQL Queries Team
Comments
michae2
added
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
A-sql-optimizer
SQL logical planning and optimizations.
A-telemetry
A-sql-table-stats
Table statistics (and their automatic refresh).
T-sql-queries
SQL Queries Team
A-sql-logging-and-telemetry
Issues related to slow query log, SQL audit log, SQL internal logging telemetry, etc.
labels
Aug 17, 2022
michae2
added a commit
to michae2/cockroach
that referenced
this issue
Sep 23, 2022
Fixes: cockroachdb#86356 Release note: None
michae2
added a commit
to michae2/cockroach
that referenced
this issue
Oct 3, 2022
Add a few fields to the sampled_query telemetry events that will help us measure how useful table statistics forecasting is in practice. Fixes: cockroachdb#86356 Release note (ops change): Add five new fields to the sampled_query telemetry events: - `ScanCount`: Number of scans in the query plan. - `ScanWithStatsCount`: Number of scans using statistics (including forecasted statistics) in the query plan. - `ScanWithStatsForecastCount`: Number of scans using forecasted statistics in the query plan. - `TotalScanRowsWithoutForecastsEstimate`: Total number of rows read by all scans in the query, as estimated by the optimizer without using forecasts. - `NanosSinceStatsForecasted`: The maximum number of nanoseconds that have passed since the forecast time (or until the forecast time, if it is in the future) for any table with forecasted stats scanned by this query.
michae2
added a commit
to michae2/cockroach
that referenced
this issue
Oct 7, 2022
Add a few fields to the sampled_query telemetry events that will help us measure how useful table statistics forecasting is in practice. Fixes: cockroachdb#86356 Release note (ops change): Add five new fields to the sampled_query telemetry events: - `ScanCount`: Number of scans in the query plan. - `ScanWithStatsCount`: Number of scans using statistics (including forecasted statistics) in the query plan. - `ScanWithStatsForecastCount`: Number of scans using forecasted statistics in the query plan. - `TotalScanRowsWithoutForecastsEstimate`: Total number of rows read by all scans in the query, as estimated by the optimizer without using forecasts. - `NanosSinceStatsForecasted`: The greatest quantity of nanoseconds that have passed since the forecast time (or until the forecast time, if it is in the future, in which case it will be negative) for any table with forecasted stats scanned by this query.
craig bot
pushed a commit
that referenced
this issue
Oct 10, 2022
88539: sql: add telemetry for statistics forecast usage r=rytaft a=michae2 Add a few fields to the sampled_query telemetry events that will help us measure how useful table statistics forecasting is in practice. Fixes: #86356 Release note (ops change): Add five new fields to the sampled_query telemetry events: - `ScanCount`: Number of scans in the query plan. - `ScanWithStatsCount`: Number of scans using statistics (including forecasted statistics) in the query plan. - `ScanWithStatsForecastCount`: Number of scans using forecasted statistics in the query plan. - `TotalScanRowsWithoutForecastsEstimate`: Total number of rows read by all scans in the query, as estimated by the optimizer without using forecasts. - `NanosSinceStatsForecasted`: The greatest quantity of nanoseconds that have passed since the forecast time (or until the forecast time, if it is in the future, in which case it will be negative) for any table with forecasted stats scanned by this query. 89418: sqlstats: always enable tracing first time fingerprint is seen r=j82w a=j82w Fixes: #89185 The first time a fingerprint is seen tracing should be enabled. This currently is broken if sql.metrics.statement_details.plan_collection.enabled is set to false. This can cause crdb_internal.transaction_contention_events to be empty because tracing was never enabled to the contention event was never recorded. To properly fix this a new value needs to be returned on ShouldSample to tell if it is the first time a fingerprint is seen. This will remove the dependency on plan_collection feature switch. Release justification: Bug fixes and low-risk updates to new functionality. Release note (bug fix): Always enable tracing the frist time a fingerprint is seen. 89502: kvserver: do not report diff in consistency checks r=erikgrinaker,tbg a=pavelkalinnikov This commit removes reporting of the diff between replicas in case of consistency check failures. With the increased range sizes the previous approach has become infeasible. One possible way to inspect an inconsistency after this change is: 1. Run `cockroach debug range-data` tool to extract the range data from each replica's checkpoint. 2. Use standard OS tools like `diff` to analyse them. In the meantime, we are researching the UX of this alternative approach, and seeing if there can be a better tooling support. Part of #21128 Epic: none Release note (sql change): The `crdb_internal.check_consistency` function now does not include the diff between inconsistent replicas, should they occur. If an inconsistency occurs, the storage engine checkpoints should be inspected. This change is made because previously the range size limit has been increased from 64 MiB to O(GiB), so inlining diffs in consistency checks does not scale. 89529: changefeedccl: update parallel consumer metrics r=jayshrivastava a=jayshrivastava Previously, both changefeed.nprocs_flush_nanos and changefeed.nprocs_consume_event_nanos were counters that monotonically increased. This was not that useful when determining the average time it takes to consume or flush an event. Changing them to a histogram fixes this issue and allows for percentile values like p90, p99. This change also updates changefeed.nprocs_in_flight_count to sample values when incrementing inFlight variable. Previously, it was showing up at 0 in the UI. This change makes it show the actual value. Fixes #89654 Release note: None Epic: none 89660: tree: use FastIntSet during typechecking r=jordanlewis a=jordanlewis Previously, the typecheck phase used several slices of ordinals into lists. This is a perfect use case for FastIntSet, because the ordinals tend to be quite small. This commit switches to use FastIntSet instead. ``` name old time/op new time/op delta TypeCheck-10 3.68µs ± 3% 3.52µs ± 2% -4.52% (p=0.000 n=9+10) name old alloc/op new alloc/op delta TypeCheck-10 744B ± 0% 576B ± 0% -22.58% (p=0.000 n=10+10) name old allocs/op new allocs/op delta TypeCheck-10 32.0 ± 0% 18.0 ± 0% -43.75% (p=0.000 n=10+10) ``` Issue: None Epic: None Release note: None 89662: scop, scdep: Rename `IndexValidator` to `Validator` r=Xiang-Gu a=Xiang-Gu I've recently done work to enable adding/dropping check constraints in the declarative schema changer. It is a big PR with many commits. I think it's nicer to separate them further into multiple PRs. This is the first PR in that effort, which is merely renaming and should be easy to review: commit 1: ValidateCheckConstraint now uses ConstraintID, instead of constraint name; commit 2: Rename `IndexValidator` to `Validator`. We previously had a file under scdep called `index_validator.go` where we implement logic for validating an index. Now that we are going to validate a check constraint, we renamed them so they will also validate check constraints. Informs #89665 Release note: None Co-authored-by: Michael Erickson <[email protected]> Co-authored-by: j82w <[email protected]> Co-authored-by: Pavel Kalinnikov <[email protected]> Co-authored-by: Jayant Shrivastava <[email protected]> Co-authored-by: Jordan Lewis <[email protected]> Co-authored-by: Xiang Gu <[email protected]>
michae2
added a commit
to michae2/cockroach
that referenced
this issue
Oct 12, 2022
Add a few fields to the sampled_query telemetry events that will help us measure how useful table statistics forecasting is in practice. Fixes: cockroachdb#86356 Release note (ops change): Add five new fields to the sampled_query telemetry events: - `ScanCount`: Number of scans in the query plan. - `ScanWithStatsCount`: Number of scans using statistics (including forecasted statistics) in the query plan. - `ScanWithStatsForecastCount`: Number of scans using forecasted statistics in the query plan. - `TotalScanRowsWithoutForecastsEstimate`: Total number of rows read by all scans in the query, as estimated by the optimizer without using forecasts. - `NanosSinceStatsForecasted`: The greatest quantity of nanoseconds that have passed since the forecast time (or until the forecast time, if it is in the future, in which case it will be negative) for any table with forecasted stats scanned by this query.
michae2
added a commit
to michae2/cockroach
that referenced
this issue
Oct 14, 2022
Add a few fields to the sampled_query telemetry events that will help us measure how useful table statistics forecasting is in practice. Fixes: cockroachdb#86356 Release note (ops change): Add five new fields to the sampled_query telemetry events: - `ScanCount`: Number of scans in the query plan. - `ScanWithStatsCount`: Number of scans using statistics (including forecasted statistics) in the query plan. - `ScanWithStatsForecastCount`: Number of scans using forecasted statistics in the query plan. - `TotalScanRowsWithoutForecastsEstimate`: Total number of rows read by all scans in the query, as estimated by the optimizer without using forecasts. - `NanosSinceStatsForecasted`: The greatest quantity of nanoseconds that have passed since the forecast time (or until the forecast time, if it is in the future, in which case it will be negative) for any table with forecasted stats scanned by this query.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-sql-logging-and-telemetry
Issues related to slow query log, SQL audit log, SQL internal logging telemetry, etc.
A-sql-optimizer
SQL logical planning and optimizations.
A-sql-table-stats
Table statistics (and their automatic refresh).
A-telemetry
C-enhancement
Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)
T-sql-queries
SQL Queries Team
Follow up from #79872: we should add telemetry for statistics forecast usage. Maybe something similar to
cockroach/pkg/sql/exec_log.go
Line 420 in 85ae31d
Jira issue: CRDB-18712
The text was updated successfully, but these errors were encountered: