sql: index usage stats doesn't capture pk scan during index join #76173

Azhng · 2022-02-07T17:20:28Z

Suppose we have a plan as follow:

root@:26257/defaultdb> explain SELECT c1,c2,c3
    FROM scandirection
    WHERE big_id BETWEEN 10 and 20;
                                         info
--------------------------------------------------------------------------------------
  distribution: local
  vectorized: true

  • index join
  │ estimated row count: 0
  │ table: scandirection@primary
  │
  └── • scan
        estimated row count: 0 (<0.01% of the table; stats collected 23 minutes ago)
        table: scandirection@idx_rev
        spans: [/20 - /10]

The index usage virtual table doesn't reflect the scan on the pk

root@:26257/defaultdb> SELECT ti.descriptor_name AS table_name, ti.index_name, total_reads, last_read
FROM crdb_internal.index_usage_statistics AS us
JOIN crdb_internal.table_indexes ti ON us.index_id = ti.index_id AND us.table_id = ti.descriptor_id
ORDER BY total_reads ASC;
   table_name   | index_name | total_reads |           last_read
----------------+------------+-------------+--------------------------------
  scandirection | primary    |           0 | NULL
  b2            | idx_big_id |           0 | NULL
  scandirection | idx_fw     |           0 | NULL
  contend       | primary    |           0 | NULL
  u             | primary    |           0 | NULL
  b1            | primary    |           0 | NULL
  b2            | primary    |           0 | NULL
  o             | primary    |           0 | NULL
  order         | primary    |           0 | NULL
  a             | primary    |           0 | NULL
  scandirection | idx_rev    |        2001 | 2022-02-07 17:01:01.901473+00

Jira issue: CRDB-13010

The text was updated successfully, but these errors were encountered:

Azhng · 2022-02-07T17:23:57Z

Seems like during ConstructIndexJoin, the Scan operation on the Pk is planned directly without going through the ConstructScan method, (where the index usage is captured).

This means that we are definitely missing the usage stats during index join.

glennfawcett · 2022-02-07T18:13:11Z

Additionally, the total_reads column is useful but what I need even more is the total_read_rows to show how many rows were retrieved by for each index and pk.

kevin-v-ngo · 2022-02-09T00:34:12Z

Additionally, the total_reads column is useful but what I need even more is the total_read_rows to show how many rows were retrieved by for each index and pk.

Thanks for the feedback @glennfawcett! This is an improvement we're tracking. To help understand the scenario a bit more, what would you do/action you'd take knowing the number of rows retrieved for a particular index?

Fixes cockroachdb#76173 Previously, the construction of index joins skipped the recording of the primary key for index usage stats. This commit records the use of the primary key if the statement is not of the EXPLAIN variety. Release note (bug fix): Index usage stats are now properly captured for index joins.

75126: roachprod: move default env. vars into config.go and remove the duplicates r=srosenberg a=srosenberg Both, COCKROACH_ENABLE_RPC_COMPRESSION=false and COCKROACH_UI_RELEASE_NOTES_SIGNUP_DISMISSED=true are passed into ENV_VARS of the systemd startup script (start.sh) by default. The same behavior is preserved _modulo_ duplication. Since roachprod can now be initialized either via CLI or API, default env. vars have been consolidated into config.go (used by both). Release note: None 78333: sql: record index usage stats in index joins r=maryliag,Azhng a=xinhaoz Fixes #76173 Previously, the construction of index joins skipped the recording of the primary key for index usage stats. This commit records the use of the primary key if the statement is not of the EXPLAIN variety. Release note (bug fix): Index usage stats are now properly captured for index joins. ---------------------------------- Example: We should expect to see a read from the pkey after an index join. ``` root@:26257/defaultdb> CREATE TABLE test (c1 INT, c2 INT, c3 INT, INDEX(c1, c2)); CREATE TABLE Time: 120ms total (execution 120ms / network 0ms) root@:26257/defaultdb> EXPLAIN select c3 from TEST WHERE c2 > 1 AND c1 = 3; info ------------------------------------------------------------------------------------------------ distribution: local vectorized: true • index join │ estimated row count: 1 │ table: test@test_pkey │ └── • scan estimated row count: 1 (100% of the table; stats collected 49 seconds ago) table: test@test_c1_c2_idx spans: [/3/2 - /3] index recommendations: 1 1. type: index replacement SQL commands: CREATE INDEX ON test (c1, c2) STORING (c3); DROP INDEX test@test_c1_c2_idx; (15 rows) Time: 1ms total (execution 1ms / network 0ms) root@:26257/defaultdb> select c3 from TEST WHERE c2 > 1 AND c1 = 3; c3 ------ (0 rows) Time: 1ms total (execution 1ms / network 0ms) root@:26257/defaultdb> select index_name, index_type, total_reads from crdb_internal.index_usage_statistics AS us JOIN crdb_internal.table_indexes ti ON us.index_id = ti.index_id AND us.table_id = ti.descriptor_id ORDER BY total_reads asc; index_name | index_type | total_reads -----------------+------------+-------------- test_pkey | primary | 1 test_c1_c2_idx | secondary | 1 (2 rows) Time: 7ms total (execution 6ms / network 0ms) ``` 78446: sql: new SQL Stats cluster settings to improve write traffic r=Azhng a=Azhng Resolves #78339 Previously, SQL Stats flushed to system table as soon as the in-memory buffer is full. This means the size of the system tables that back SQL Stats could grow faster than the cleanup job. Additionally, when the SQL Stats flush is disabled, the SQL Stats is unable to collect any more new statement / transaction statistics when the in-memory store is full. This commit introduces two non-public cluster settings: * `sql.stats.flush.minimum_interval`: this setting limits minimum interval between each flush operation. If a flush operation is triggered sooner than what is allowed by the minimum interval, (e.g. when the in-memory SQL Stats store is full), the flush operation is aborted. * `sql.stats.flush.force_cleanup.enabled`: which allows the in-memory SQL Stats to be cleared at the interval specified by `sql.stats.flush.interval`, even if the SQL Stats flush is disabled. This commit also updated the stmt_grouping_in_explicit_txn data driven test to ensure the output order is deterministic. Release note: None 78526: ci: add scripts for bazel-based `ui test`, `ui lint` ci jobs r=rail,jsbarag a=rickystewart These scripts just do the same logic that `dev` does for the same functions. Release note: None Co-authored-by: Stan Rosenberg <[email protected]> Co-authored-by: Xin Hao Zhang <[email protected]> Co-authored-by: Azhng <[email protected]> Co-authored-by: Ricky Stewart <[email protected]>

Fixes cockroachdb#76173 Previously, the construction of index joins skipped the recording of the primary key for index usage stats. This commit records the use of the primary key if the statement is not of the EXPLAIN variety. Release note (bug fix): Index usage stats are now properly captured for index joins.

Azhng added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-sql-observability Related to observability of the SQL layer T-sql-observability labels Feb 7, 2022

maryliag assigned THardy98 Mar 3, 2022

xinhaoz assigned xinhaoz and unassigned THardy98 Mar 22, 2022

xinhaoz mentioned this issue Mar 23, 2022

sql: record index usage stats in index joins #78333

Merged

craig bot closed this as completed in a57aa7c Mar 29, 2022

xinhaoz mentioned this issue Apr 1, 2022

release 21.2: sql: record index usage stats in index joins #79240

Merged

xinhaoz mentioned this issue Apr 1, 2022

release-22.1: sql: record index usage stats in index joins #79241

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: index usage stats doesn't capture pk scan during index join #76173

sql: index usage stats doesn't capture pk scan during index join #76173

Azhng commented Feb 7, 2022 •

edited by cockroach-jira-scripts

Loading

Azhng commented Feb 7, 2022

glennfawcett commented Feb 7, 2022

kevin-v-ngo commented Feb 9, 2022

sql: index usage stats doesn't capture pk scan during index join #76173

sql: index usage stats doesn't capture pk scan during index join #76173

Comments

Azhng commented Feb 7, 2022 • edited by cockroach-jira-scripts Loading

Azhng commented Feb 7, 2022

glennfawcett commented Feb 7, 2022

kevin-v-ngo commented Feb 9, 2022

Azhng commented Feb 7, 2022 •

edited by cockroach-jira-scripts

Loading