Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/opt: add telemetry for statistics forecast usage #86356

Closed
michae2 opened this issue Aug 17, 2022 · 0 comments · Fixed by #88539
Closed

sql/opt: add telemetry for statistics forecast usage #86356

michae2 opened this issue Aug 17, 2022 · 0 comments · Fixed by #88539
Assignees
Labels
A-sql-logging-and-telemetry Issues related to slow query log, SQL audit log, SQL internal logging telemetry, etc. A-sql-optimizer SQL logical planning and optimizations. A-sql-table-stats Table statistics (and their automatic refresh). A-telemetry C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-queries SQL Queries Team

Comments

@michae2
Copy link
Collaborator

michae2 commented Aug 17, 2022

Follow up from #79872: we should add telemetry for statistics forecast usage. Maybe something similar to

NanosSinceStatsCollected: int64(p.curPlan.instrumentation.nanosSinceStatsCollected),

Jira issue: CRDB-18712

@michae2 michae2 added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-sql-optimizer SQL logical planning and optimizations. A-telemetry A-sql-table-stats Table statistics (and their automatic refresh). T-sql-queries SQL Queries Team A-sql-logging-and-telemetry Issues related to slow query log, SQL audit log, SQL internal logging telemetry, etc. labels Aug 17, 2022
@michae2 michae2 self-assigned this Aug 17, 2022
michae2 added a commit to michae2/cockroach that referenced this issue Sep 23, 2022
michae2 added a commit to michae2/cockroach that referenced this issue Oct 3, 2022
Add a few fields to the sampled_query telemetry events that will help us
measure how useful table statistics forecasting is in practice.

Fixes: cockroachdb#86356

Release note (ops change): Add five new fields to the sampled_query
telemetry events:
- `ScanCount`: Number of scans in the query plan.
- `ScanWithStatsCount`: Number of scans using statistics (including
  forecasted statistics) in the query plan.
- `ScanWithStatsForecastCount`: Number of scans using forecasted
  statistics in the query plan.
- `TotalScanRowsWithoutForecastsEstimate`: Total number of rows read by
  all scans in the query, as estimated by the optimizer without using
  forecasts.
- `NanosSinceStatsForecasted`: The maximum number of nanoseconds that
  have passed since the forecast time (or until the forecast time, if it
  is in the future) for any table with forecasted stats scanned by this
  query.
michae2 added a commit to michae2/cockroach that referenced this issue Oct 7, 2022
Add a few fields to the sampled_query telemetry events that will help us
measure how useful table statistics forecasting is in practice.

Fixes: cockroachdb#86356

Release note (ops change): Add five new fields to the sampled_query
telemetry events:
- `ScanCount`: Number of scans in the query plan.
- `ScanWithStatsCount`: Number of scans using statistics (including
  forecasted statistics) in the query plan.
- `ScanWithStatsForecastCount`: Number of scans using forecasted
  statistics in the query plan.
- `TotalScanRowsWithoutForecastsEstimate`: Total number of rows read by
  all scans in the query, as estimated by the optimizer without using
  forecasts.
- `NanosSinceStatsForecasted`: The greatest quantity of nanoseconds that
  have passed since the forecast time (or until the forecast time, if it
  is in the future, in which case it will be negative) for any table
  with forecasted stats scanned by this query.
craig bot pushed a commit that referenced this issue Oct 10, 2022
88539: sql: add telemetry for statistics forecast usage r=rytaft a=michae2

Add a few fields to the sampled_query telemetry events that will help us
measure how useful table statistics forecasting is in practice.

Fixes: #86356

Release note (ops change): Add five new fields to the sampled_query
telemetry events:
- `ScanCount`: Number of scans in the query plan.
- `ScanWithStatsCount`: Number of scans using statistics (including
  forecasted statistics) in the query plan.
- `ScanWithStatsForecastCount`: Number of scans using forecasted
  statistics in the query plan.
- `TotalScanRowsWithoutForecastsEstimate`: Total number of rows read by
  all scans in the query, as estimated by the optimizer without using
  forecasts.
- `NanosSinceStatsForecasted`: The greatest quantity of nanoseconds that
  have passed since the forecast time (or until the forecast time, if it
  is in the future, in which case it will be negative) for any table
  with forecasted stats scanned by this query.

89418: sqlstats: always enable tracing first time fingerprint is seen r=j82w a=j82w

Fixes: #89185

The first time a fingerprint is seen tracing should be enabled. This currently is broken if
sql.metrics.statement_details.plan_collection.enabled is set to false. This can cause crdb_internal.transaction_contention_events to be empty because tracing was never enabled to the contention event was never recorded.

To properly fix this a new value needs to be returned on ShouldSample to tell if it is the first time a fingerprint is seen. This will remove the dependency on plan_collection feature switch.

Release justification: Bug fixes and low-risk updates to new functionality.

Release note (bug fix): Always enable tracing the frist time a fingerprint is seen.

89502: kvserver: do not report diff in consistency checks r=erikgrinaker,tbg a=pavelkalinnikov

This commit removes reporting of the diff between replicas in case of consistency check failures. With the increased range sizes the previous approach has become infeasible.

One possible way to inspect an inconsistency after this change is:
1. Run `cockroach debug range-data` tool to extract the range data from each replica's checkpoint.
2. Use standard OS tools like `diff` to analyse them.

In the meantime, we are researching the UX of this alternative approach, and seeing if there can be a better tooling support.

Part of #21128
Epic: none

Release note (sql change): The `crdb_internal.check_consistency` function now does not include the diff between inconsistent replicas, should they occur. If an inconsistency occurs, the storage engine checkpoints should be inspected. This change is made because previously the range size limit has been increased from 64 MiB to O(GiB), so inlining diffs in consistency checks does not scale.

89529: changefeedccl: update parallel consumer metrics r=jayshrivastava a=jayshrivastava

Previously, both changefeed.nprocs_flush_nanos and changefeed.nprocs_consume_event_nanos were counters that monotonically increased. This was not that useful when determining the average time it takes to consume or flush an event. Changing them to a histogram fixes this issue and allows for percentile values like p90, p99.

This change also updates changefeed.nprocs_in_flight_count to sample values when incrementing inFlight variable. Previously, it was showing up at 0 in the UI. This change makes it show the actual value.

Fixes #89654

Release note: None
Epic: none

89660: tree: use FastIntSet during typechecking r=jordanlewis a=jordanlewis

Previously, the typecheck phase used several slices of ordinals into lists. This is a perfect use case for FastIntSet, because the ordinals tend to be quite small. This commit switches to use FastIntSet instead.

```
name          old time/op    new time/op    delta
TypeCheck-10    3.68µs ± 3%    3.52µs ± 2%   -4.52%  (p=0.000 n=9+10)

name          old alloc/op   new alloc/op   delta
TypeCheck-10      744B ± 0%      576B ± 0%  -22.58%  (p=0.000 n=10+10)

name          old allocs/op  new allocs/op  delta
TypeCheck-10      32.0 ± 0%      18.0 ± 0%  -43.75%  (p=0.000 n=10+10)
```
Issue: None
Epic: None
Release note: None

89662: scop, scdep: Rename `IndexValidator` to `Validator` r=Xiang-Gu a=Xiang-Gu

I've recently done work to enable adding/dropping check constraints in the declarative
schema changer. It is a big PR with many commits. I think it's nicer to separate them
further into multiple PRs. 

This is the first PR in that effort, which is merely renaming and should be easy to review:
commit 1: ValidateCheckConstraint now uses ConstraintID, instead of constraint name;

commit 2: Rename `IndexValidator` to `Validator`.
We previously had a file under scdep called `index_validator.go` where we implement
logic for validating an index. Now that we are going to validate a check constraint, we 
renamed them so they will also validate check constraints.

Informs #89665
Release note: None

Co-authored-by: Michael Erickson <[email protected]>
Co-authored-by: j82w <[email protected]>
Co-authored-by: Pavel Kalinnikov <[email protected]>
Co-authored-by: Jayant Shrivastava <[email protected]>
Co-authored-by: Jordan Lewis <[email protected]>
Co-authored-by: Xiang Gu <[email protected]>
@craig craig bot closed this as completed in ab38868 Oct 10, 2022
michae2 added a commit to michae2/cockroach that referenced this issue Oct 12, 2022
Add a few fields to the sampled_query telemetry events that will help us
measure how useful table statistics forecasting is in practice.

Fixes: cockroachdb#86356

Release note (ops change): Add five new fields to the sampled_query
telemetry events:
- `ScanCount`: Number of scans in the query plan.
- `ScanWithStatsCount`: Number of scans using statistics (including
  forecasted statistics) in the query plan.
- `ScanWithStatsForecastCount`: Number of scans using forecasted
  statistics in the query plan.
- `TotalScanRowsWithoutForecastsEstimate`: Total number of rows read by
  all scans in the query, as estimated by the optimizer without using
  forecasts.
- `NanosSinceStatsForecasted`: The greatest quantity of nanoseconds that
  have passed since the forecast time (or until the forecast time, if it
  is in the future, in which case it will be negative) for any table
  with forecasted stats scanned by this query.
michae2 added a commit to michae2/cockroach that referenced this issue Oct 14, 2022
Add a few fields to the sampled_query telemetry events that will help us
measure how useful table statistics forecasting is in practice.

Fixes: cockroachdb#86356

Release note (ops change): Add five new fields to the sampled_query
telemetry events:
- `ScanCount`: Number of scans in the query plan.
- `ScanWithStatsCount`: Number of scans using statistics (including
  forecasted statistics) in the query plan.
- `ScanWithStatsForecastCount`: Number of scans using forecasted
  statistics in the query plan.
- `TotalScanRowsWithoutForecastsEstimate`: Total number of rows read by
  all scans in the query, as estimated by the optimizer without using
  forecasts.
- `NanosSinceStatsForecasted`: The greatest quantity of nanoseconds that
  have passed since the forecast time (or until the forecast time, if it
  is in the future, in which case it will be negative) for any table
  with forecasted stats scanned by this query.
@mgartner mgartner moved this to Done in SQL Queries Jul 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-sql-logging-and-telemetry Issues related to slow query log, SQL audit log, SQL internal logging telemetry, etc. A-sql-optimizer SQL logical planning and optimizations. A-sql-table-stats Table statistics (and their automatic refresh). A-telemetry C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

1 participant