Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: added indexes to system.statement_statistics, system.transaction_statistics #99042

Merged
merged 1 commit into from
Mar 21, 2023

Conversation

ericharmeling
Copy link
Contributor

Fixes #98624.

This commit adds indexes on new computed columns to the system.statement_statistics and system.transaction_statistics tables.

Epic: none

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@ericharmeling ericharmeling force-pushed the indexes-on-system-stats branch 3 times, most recently from 21ca4b0 to 99728c0 Compare March 20, 2023 19:51
@ericharmeling ericharmeling marked this pull request as ready for review March 20, 2023 19:58
@ericharmeling ericharmeling requested a review from a team March 20, 2023 19:58
@ericharmeling ericharmeling requested a review from a team as a code owner March 20, 2023 19:58
@ericharmeling ericharmeling requested a review from a team March 20, 2023 19:58
@ericharmeling ericharmeling requested a review from a team as a code owner March 20, 2023 19:58
@ericharmeling ericharmeling requested review from msirek and removed request for a team March 20, 2023 19:58
@ericharmeling ericharmeling force-pushed the indexes-on-system-stats branch from 99728c0 to d081e5a Compare March 20, 2023 21:10
@ericharmeling
Copy link
Contributor Author

Here's some benchmark results: BenchmarkSqlStatsPersistedresults.txt

Copy link
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 17 of 17 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ericharmeling and @msirek)


pkg/sql/catalog/systemschema/system.go line 124 at r1 (raw file):

	indexUsageComputeExpr           = `(statistics->'statistics':::STRING)->'indexes':::STRING`
	executionCountComputeExpr       = `((statistics->'execution_statistics':::STRING)->'cnt':::STRING)::INT8`

do you need to convert this to STRING and then to INT8? can you do INT8 right away? same things for other below

also, this should bt the count from statistics -> statistics -> cnt


pkg/sql/catalog/systemschema/system.go line 569 at r1 (raw file):

			plan,
			index_recommendations,
      execution_count,

nit: alignment is off


pkg/sql/catalog/systemschema/system.go line 615 at r1 (raw file):

			metadata,
			statistics,
      execution_count,

nit: alignment


pkg/upgrade/upgrades/create_computed_indexes_sql_statistics.go line 69 at r1 (raw file):

ALTER TABLE system.statement_statistics
ADD COLUMN IF NOT EXISTS "total_estimated_execution_time" FLOAT 
AS ((statistics->'execution_statistics'->>'cnt')::float * (statistics->'statistics'->'svcLat'->>'mean')::FLOAT) STORED

you should use statistics -> statistics -> cnt, because the count for exec stats is not the same and can have less that was used for the latency on stats.


pkg/upgrade/upgrades/create_computed_indexes_sql_statistics.go line 82 at r1 (raw file):

ALTER TABLE system.transaction_statistics
ADD COLUMN IF NOT EXISTS "execution_count" INT8 
AS ((statistics->'execution_statistics'->'cnt')::INT8) STORED

ditto


pkg/upgrade/upgrades/create_computed_indexes_sql_statistics.go line 126 at r1 (raw file):

ALTER TABLE system.transaction_statistics
ADD COLUMN IF NOT EXISTS "total_estimated_execution_time" FLOAT 
AS ((statistics->'execution_statistics'->>'cnt')::float * (statistics->'statistics'->'svcLat'->>'mean')::FLOAT) STORED

ditto


pkg/sql/sqlstats/persistedsqlstats/bench_test.go line 184 at r1 (raw file):

				// compute expressions
				executionCount := "((statistics->'execution_statistics'->'cnt')::INT8)"

you will need to update all references of count on the test to use the statistics one

@ericharmeling ericharmeling force-pushed the indexes-on-system-stats branch from d081e5a to b9bca30 Compare March 21, 2023 01:03
Copy link
Contributor Author

@ericharmeling ericharmeling left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @maryliag and @msirek)


pkg/sql/catalog/systemschema/system.go line 124 at r1 (raw file):

do you need to convert this to STRING and then to INT8? can you do INT8 right away? same things for other below

Originally, I was not explicitly casting anything except the final type (e.g., INT8 here), but TestSystemTableLiterals was failing the compute expressions. The compute expressions here are what that test expects.

also, this should bt the count from statistics -> statistics -> cnt

Done.


pkg/sql/catalog/systemschema/system.go line 569 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: alignment is off

Done.


pkg/sql/catalog/systemschema/system.go line 615 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: alignment

Done.


pkg/upgrade/upgrades/create_computed_indexes_sql_statistics.go line 69 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

you should use statistics -> statistics -> cnt, because the count for exec stats is not the same and can have less that was used for the latency on stats.

Done.


pkg/upgrade/upgrades/create_computed_indexes_sql_statistics.go line 82 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

ditto

Done.


pkg/upgrade/upgrades/create_computed_indexes_sql_statistics.go line 126 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

ditto

Done.


pkg/sql/sqlstats/persistedsqlstats/bench_test.go line 184 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

you will need to update all references of count on the test to use the statistics one

Done.

@ericharmeling ericharmeling requested a review from maryliag March 21, 2023 01:03
Copy link
Contributor

@maryliag maryliag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 5 of 5 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @msirek)

@ericharmeling ericharmeling force-pushed the indexes-on-system-stats branch from b9bca30 to d2eb362 Compare March 21, 2023 13:38
@ericharmeling
Copy link
Contributor Author

Here are the queries that these indexes optimize:

SELECT * FROM ((SELECT
  aggregated_ts,
  fingerprint_id,
  transaction_fingerprint_id,
  app_name,
  execution_count,
  service_latency,
  cpu_sql_nanos,
  contention_time,
  total_estimated_execution_time,
  metadata,
  statistics
FROM system.statement_statistics
WHERE app_name NOT LIKE '$ internal%' AND aggregated_ts > (now() - INTERVAL '1 hour') ORDER BY execution_count DESC LIMIT 500) UNION (SELECT
  aggregated_ts,
  fingerprint_id,
  transaction_fingerprint_id,
  app_name,
  execution_count,
  service_latency,
  cpu_sql_nanos,
  contention_time,
  total_estimated_execution_time,
  metadata,
  statistics
FROM system.statement_statistics
WHERE app_name NOT LIKE '$ internal%' AND aggregated_ts > (now() - INTERVAL '1 hour')  ORDER BY service_latency DESC LIMIT 500) UNION (SELECT
  aggregated_ts,
  fingerprint_id,
  transaction_fingerprint_id,
  app_name,
  execution_count,
  service_latency,
  cpu_sql_nanos,
  contention_time,
  total_estimated_execution_time,
  metadata,
  statistics
FROM system.statement_statistics
WHERE app_name NOT LIKE '$ internal%' AND aggregated_ts > (now() - INTERVAL '1 hour')  ORDER BY cpu_sql_nanos DESC LIMIT 500) UNION (SELECT
  aggregated_ts,
  fingerprint_id,
  transaction_fingerprint_id,
  app_name,
  execution_count,
  service_latency,
  cpu_sql_nanos,
  contention_time,
  total_estimated_execution_time,
  metadata,
  statistics
FROM system.statement_statistics
WHERE app_name NOT LIKE '$ internal%' AND aggregated_ts > (now() - INTERVAL '1 hour')  ORDER BY contention_time DESC LIMIT 500) UNION (SELECT
  aggregated_ts,
  fingerprint_id,
  transaction_fingerprint_id,
  app_name,
  execution_count,
  service_latency,
  cpu_sql_nanos,
  contention_time,
  total_estimated_execution_time,
  metadata,
  statistics
FROM system.statement_statistics
WHERE app_name NOT LIKE '$ internal%' AND aggregated_ts > (now() - INTERVAL '1 hour')  ORDER BY total_estimated_execution_time DESC LIMIT 500))
SELECT * FROM ((SELECT
  aggregated_ts,
  fingerprint_id,
  app_name,
  execution_count,
  service_latency,
  cpu_sql_nanos,
  contention_time,
  total_estimated_execution_time,
  metadata,
  statistics
FROM system.transaction_statistics
WHERE app_name NOT LIKE '$ internal%' AND aggregated_ts > (now() - INTERVAL '1 hour')  ORDER BY execution_count DESC LIMIT 500) UNION (SELECT
  aggregated_ts,
  fingerprint_id,
  app_name,
  execution_count,
  service_latency,
  cpu_sql_nanos,
  contention_time,
  total_estimated_execution_time,
  metadata,
  statistics
FROM system.transaction_statistics
WHERE app_name NOT LIKE '$ internal%' AND aggregated_ts > (now() - INTERVAL '1 hour')  ORDER BY service_latency DESC LIMIT 500) UNION (SELECT
  aggregated_ts,
  fingerprint_id,
  app_name,
  execution_count,
  service_latency,
  cpu_sql_nanos,
  contention_time,
  total_estimated_execution_time,
  metadata,
  statistics
FROM system.transaction_statistics
WHERE app_name NOT LIKE '$ internal%' AND aggregated_ts > (now() - INTERVAL '1 hour')  ORDER BY cpu_sql_nanos DESC LIMIT 500) UNION (SELECT
  aggregated_ts,
  fingerprint_id,
  app_name,
  execution_count,
  service_latency,
  cpu_sql_nanos,
  contention_time,
  total_estimated_execution_time,
  metadata,
  statistics
FROM system.transaction_statistics
WHERE app_name NOT LIKE '$ internal%' AND aggregated_ts > (now() - INTERVAL '1 hour')  ORDER BY contention_time DESC LIMIT 500) UNION (SELECT
  aggregated_ts,
  fingerprint_id,
  app_name,
  execution_count,
  service_latency,
  cpu_sql_nanos,
  contention_time,
  total_estimated_execution_time,
  metadata,
  statistics
FROM system.transaction_statistics
WHERE app_name NOT LIKE '$ internal%' AND aggregated_ts > (now() - INTERVAL '1 hour')  ORDER BY total_estimated_execution_time DESC LIMIT 500))

These queries will be used to populate the tables introduced in #98885.

@rytaft
Copy link
Collaborator

rytaft commented Mar 21, 2023

If any of these queries are performance sensitive, I suggest adding them to your new sql_statistics_persisted file or another optimizer test.

@ericharmeling
Copy link
Contributor Author

@rytaft Thanks for looking at this!

If any of these queries are performance sensitive, I suggest adding them to your new sql_statistics_persisted file or another optimizer test.

I tried doing this in a first revision, but the execbuilder tests were failing on non-deterministic output from EXPLAIN (VERBOSE) (the spans are over aggregated_ts values, which change each run). Any advice on how to write an optimizer test for the queries I posted above?

@rytaft
Copy link
Collaborator

rytaft commented Mar 21, 2023

Just change now() to some arbitrary fixed timestamp. It shouldn't change the plan.

…_statistics

Fixes cockroachdb#98624.

This commit adds indexes on new computed columns to the
system.statement_statistics and system.transaction_statistics
tables.

Epic: none

Release note: None
@ericharmeling ericharmeling force-pushed the indexes-on-system-stats branch from d2eb362 to b6d5bb7 Compare March 21, 2023 15:12
@ericharmeling
Copy link
Contributor Author

Just change now() to some arbitrary fixed timestamp. It shouldn't change the plan.

🤦 Of course. Thank you!

Copy link
Contributor

@j82w j82w left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @maryliag and @msirek)

@ericharmeling
Copy link
Contributor Author

bors r+

ericharmeling added a commit to ericharmeling/cockroach that referenced this pull request Mar 21, 2023
…, system.transaction_statistics

Part of cockroachdb#98624.
Follows cockroachdb#99042.

This commit adds a new computed column (p99_latency) and partial
index (p99_latency_idx) to the system.statement_statistics and
system.transaction_statistics tables.

Epic: none

Release note: None
@craig
Copy link
Contributor

craig bot commented Mar 21, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Mar 21, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Mar 21, 2023

Build succeeded:

@craig craig bot merged commit 77029b1 into cockroachdb:master Mar 21, 2023
ericharmeling added a commit to ericharmeling/cockroach that referenced this pull request Mar 21, 2023
…, system.transaction_statistics

Part of cockroachdb#98624.
Follows cockroachdb#99042.

This commit adds a new computed column (p99_latency) and partial
index (p99_latency_idx) to the system.statement_statistics and
system.transaction_statistics tables.

Epic: none

Release note: None
ericharmeling added a commit to ericharmeling/cockroach that referenced this pull request Mar 21, 2023
…, system.transaction_statistics

Part of cockroachdb#98624.
Follows cockroachdb#99042.

This commit adds a new computed column (p99_latency) and partial
index (p99_latency_idx) to the system.statement_statistics and
system.transaction_statistics tables.

The commit also increases the hard-coded fingerprint execution
count for the flush benchmark to the maximum for in-memory
statement stats that trigger a flush.

Epic: none

Release note: None
ericharmeling added a commit to ericharmeling/cockroach that referenced this pull request Mar 22, 2023
…, system.transaction_statistics

Part of cockroachdb#98624.
Follows cockroachdb#99042.

This commit adds a new computed column (p99_latency) and partial
index (p99_latency_idx) to the system.statement_statistics and
system.transaction_statistics tables.

The commit also increases the hard-coded fingerprint execution
count for the flush benchmark to the maximum for in-memory
statement stats that trigger a flush.

Epic: none

Release note: None
ericharmeling added a commit to ericharmeling/cockroach that referenced this pull request Mar 22, 2023
Follows cockroachdb#99042, cockroachdb#99148.

This commit adds the following computed columns to both
crdb_internal.statement_statistics_persisted and
crdb_internal.transaction_statistics_persisted views:
-execution_count
-service_latency
-cpu_sql_nanos
-contention_time
-total_estimated_execution_time
-p99_latency

These computed columns are indexed in the system
tables from which these views are created.

Epic: None

Release note: The crdb_internal.statement_statistics_persisted
and crdb_internal.transaction_statistics_persisted views now
include computed columns, to optimize the performance of
frequently-queried expressions.
@msirek msirek removed their request for review March 22, 2023 17:42
craig bot pushed a commit that referenced this pull request Mar 22, 2023
99148: sql: add p99 computed column and index to system.statement_statistics, system.transaction_statistics r=ericharmeling a=ericharmeling

Part of #98624.
Follows #99042.

This commit adds a new computed column (p99_latency) and partial
index (p99_latency_idx) to the system.statement_statistics and
system.transaction_statistics tables.

Epic: none

Release note: None

99231: sql: fix partitioned table format in statement bundle schema.sql r=mgartner a=mgartner

Fixes #99026

Release note: None

Co-authored-by: Eric Harmeling <[email protected]>
Co-authored-by: Marcus Gartner <[email protected]>
blathers-crl bot pushed a commit that referenced this pull request Mar 22, 2023
…, system.transaction_statistics

Part of #98624.
Follows #99042.

This commit adds a new computed column (p99_latency) and partial
index (p99_latency_idx) to the system.statement_statistics and
system.transaction_statistics tables.

The commit also increases the hard-coded fingerprint execution
count for the flush benchmark to the maximum for in-memory
statement stats that trigger a flush.

Epic: none

Release note: None
ericharmeling added a commit to ericharmeling/cockroach that referenced this pull request Mar 22, 2023
Follows cockroachdb#99042, cockroachdb#99148.

This commit adds the following computed columns to both
crdb_internal.statement_statistics_persisted and
crdb_internal.transaction_statistics_persisted views:
-execution_count
-service_latency
-cpu_sql_nanos
-contention_time
-total_estimated_execution_time
-p99_latency

These computed columns are indexed in the system
tables from which these views are created.

Epic: None

Release note: The crdb_internal.statement_statistics_persisted
and crdb_internal.transaction_statistics_persisted views now
include computed columns, to optimize the performance of
frequently-queried expressions.
craig bot pushed a commit that referenced this pull request Mar 23, 2023
99236: roachprod: retry GetInternalIP on error r=srosenberg a=smg260

Epic: none
Fixes: #98285, #98342

Release note: None

99240: sql: add computed columns to crdb_internal persisted sql stats views r=ericharmeling a=ericharmeling

Follows #99042, #99148.

This commit adds the following computed columns to both `crdb_internal.statement_statistics_persisted` and `crdb_internal.transaction_statistics_persisted` views:
-`execution_count`
-`service_latency`
-`cpu_sql_nanos`
-`contention_time`
-`total_estimated_execution_time`
-`p99_latency`

These computed columns are indexed in the system tables from which these views are created.

Epic: None

Release note: The `crdb_internal.statement_statistics_persisted` and `crdb_internal.transaction_statistics_persisted` views now include computed columns, to optimize the performance of frequently-queried expressions.

99318: multitenant: don't panic if reader doesn't exist yet r=knz,ajstorm,arulajmani a=stevendanna

While shared-process tenant servers are not likely to make requests
before the capability reader exists, the limiter factory looks up the
relevant tenant ID from the start key of the range descriptor and it
isn't unlikely that we'll see requests against tenant ranges before we
have a capability reader available.

Epic: none

Release note: None

Release justification: Low risk fix to avoid panics in tests.

99369: build: use same roachtest parallelism and cpuquota for AWS and GCE r=rickystewart a=renatolabs

The parallelism and cpuquota passed to AWS is much lower than that for GCE when invoking roachtest nightly builds. The AWS values have been imported from TeamCity ~3 years ago [1] and haven't changed since then. However, we are starting to see Roachtest Nightly builds time out on AWS [2] now that teams are starting to write more roachtests that run on AWS.

This commit removes the custom `PARALLELISM` and `CPUQUOTA` settings we had in place for AWS, making it consistent with GCE.

[1] see 8219a7f
[2] https://teamcity.cockroachdb.com/viewLog.html?buildId=9182737&buildTypeId=Cockroach_Nightlies_RoachtestNightlyAwsBazel

Epic: none

Release note: None

Co-authored-by: Miral Gadani <[email protected]>
Co-authored-by: Eric Harmeling <[email protected]>
Co-authored-by: Steven Danna <[email protected]>
Co-authored-by: Renato Costa <[email protected]>
blathers-crl bot pushed a commit that referenced this pull request Mar 23, 2023
Follows #99042, #99148.

This commit adds the following computed columns to both
crdb_internal.statement_statistics_persisted and
crdb_internal.transaction_statistics_persisted views:
-execution_count
-service_latency
-cpu_sql_nanos
-contention_time
-total_estimated_execution_time
-p99_latency

These computed columns are indexed in the system
tables from which these views are created.

Epic: None

Release note: The crdb_internal.statement_statistics_persisted
and crdb_internal.transaction_statistics_persisted views now
include computed columns, to optimize the performance of
frequently-queried expressions.
kvoli pushed a commit to kvoli/cockroach that referenced this pull request Mar 23, 2023
Follows cockroachdb#99042, cockroachdb#99148.

This commit adds the following computed columns to both
crdb_internal.statement_statistics_persisted and
crdb_internal.transaction_statistics_persisted views:
-execution_count
-service_latency
-cpu_sql_nanos
-contention_time
-total_estimated_execution_time
-p99_latency

These computed columns are indexed in the system
tables from which these views are created.

Epic: None

Release note: The crdb_internal.statement_statistics_persisted
and crdb_internal.transaction_statistics_persisted views now
include computed columns, to optimize the performance of
frequently-queried expressions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: index frequently-used fields in system statistics tables
5 participants