
Capture index usage metrics in telemetry logging channel #72486

Closed
kevin-v-ngo opened this issue Nov 5, 2021 · 0 comments · Fixed by #76886
Assignees
Labels
A-sql-logging-and-telemetry Issues related to slow query log, SQL audit log, SQL internal logging telemetry, etc. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

Comments


kevin-v-ngo commented Nov 5, 2021

This issue tracks capturing index usage metrics for all customer workloads. This will help us determine how many customers are suffering from unused indexes, and measure our progress in helping developers design and tune their indexing strategy. We should take periodic snapshots of this information at least every 8 hours.

We should capture the envelope including the following event payload:

CREATE TABLE telemetry_logs_index_usage_statistics_raw (
  database_name STRING NOT NULL,
  database_id INT8 NOT NULL,
  table_name STRING NOT NULL,
  table_id INT8 NOT NULL,
  index_name STRING NOT NULL,
  index_id INT8 NOT NULL,
  index_type STRING NOT NULL,
  total_reads INT8 NOT NULL,
  is_unique BOOL NOT NULL,
  is_inverted BOOL NOT NULL,
  last_read TIMESTAMPTZ NULL
)

crdb_internal tables that contain this information:

  • crdb_internal.index_usage_statistics
  • crdb_internal.table_indexes
  • crdb_internal.databases

Sample user query:

SELECT
  current_database() AS database_name,
  ti.descriptor_name AS table_name,
  ti.descriptor_id AS table_id,
  ti.index_name,
  ti.index_id,
  ti.index_type,
  ti.is_unique,
  ti.is_inverted,
  total_reads,
  last_read
FROM crdb_internal.index_usage_statistics AS us
JOIN crdb_internal.table_indexes AS ti
  ON us.index_id = ti.index_id
  AND us.table_id = ti.descriptor_id
ORDER BY total_reads ASC;
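Since the stated goal is to measure how many customers suffer from unused indexes, a narrower variant of the query above (illustrative, not part of the original issue; the `index_type = 'secondary'` filter assumes the value reported by `crdb_internal.table_indexes`) could surface secondary indexes that have never been read:

SELECT
  ti.descriptor_name AS table_name,
  ti.index_name
FROM crdb_internal.index_usage_statistics AS us
JOIN crdb_internal.table_indexes AS ti
  ON us.index_id = ti.index_id
  AND us.table_id = ti.descriptor_id
WHERE us.total_reads = 0
  AND ti.index_type = 'secondary';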

Epic: CRDB-10495

Jira issue: CRDB-11177

@kevin-v-ngo kevin-v-ngo added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-sql-logging-and-telemetry Issues related to slow query log, SQL audit log, SQL internal logging telemetry, etc. labels Nov 5, 2021
THardy98 pushed a commit to THardy98/cockroach that referenced this issue Feb 17, 2022
Resolves: cockroachdb#72486

Previously, we did not capture index usage metrics in our telemetry logging
channel. This change introduces a scheduler for jobs that emit logging events.

Currently, we want to capture index usage statistics to the telemetry logging
channel on a time interval, but being able to emit scheduled logs for different
statistics/events would be useful telemetry going forward, which motivates the
creation of a scheduler that supports different logging jobs.

This change introduces the `scheduledloggingjobs` package.
`CreateScheduledLoggingJob` in `scheduledloggingjobs` allows you to create a
scheduler with parameters for:
- label
- schedule interval
- executor type
- schedule details
- execution arguments

The Resumer and Executor logic for the scheduled jobs resides in
`sql/scheduled_logging_jobs.go`. The Resumer receives a callback function that
returns the logging events it wants to emit. The Executor receives the job
details/progress (used to create the job record when the job is executed) and
job type (to check that a job of the same type isn't already running when we
try to create the job).

The `scheduledloggingjobs` package also contains a `Controller` whose purpose
is to `Start` all the scheduled logging jobs in its interface. The
`Controller`'s `Start` method is hooked into the startup of the `sql.Server`.

Assuming its `jobs.proto` definitions and `wrap.go` cases are covered, a new
logging job can be added by:
- defining its resume callback function
- registering its log resumer and executor with
  `RegisterLogResumerAndScheduleExecutor`
- defining its arguments (label, schedule interval, executor type, schedule
  details, execution arguments) and calling `CreateScheduledLoggingJob`
- adding it to the `Controller` interface, and `Start` method (if you want to
  create the scheduled job on server startup)
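Once the scheduled logging job has been created on server startup, one way to check that its schedule exists is to query the schedules (a sketch assuming CockroachDB's `SHOW SCHEDULES` statement; the label filter is hypothetical and depends on the label passed to `CreateScheduledLoggingJob`):

SELECT id, label, schedule_status, recurrence
FROM [SHOW SCHEDULES]
WHERE label LIKE '%logging%';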

This change was made with internal use in mind; I didn't consider a SQL
interface for creating these schedules.

Release note (sql change): Initial implementation of a general logging job,
used to create a job that captures index usage statistics to the telemetry
logging channel.
@craig craig bot closed this as completed in 9bab39f Mar 1, 2022