sql: create system tables for SQL stats #65374

Azhng · 2021-05-18T01:29:17Z

This PR creates two new system tables: system.sql_statement_stats
and system.sql_transaction_stats per RFC #63752.

This is the initial step that addresses #64743.

Release note: None

cockroach-teamcity · 2021-05-18T01:29:24Z

This change is

maryliag

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @Azhng)

pkg/sql/catalog/systemschema/system.go, line 370 at r1 (raw file):

	SQLStatementStatsTableSchema = `
CREATE TABLE system.sql_statement_stats (

nit: I think those two tables should use plural, otherwise looks like is several stats for one statement/transaction. So sql_statements_stats and sql_transactions_stats
Also update in all references, e.g. SQLStatementsStatsTableSchema

pkg/sql/catalog/systemschema/system.go, line 386 at r1 (raw file):

    PRIMARY KEY (aggregated_ts, fingerprint_id, plan_hash, app_name, node_id)
      USING HASH WITH BUCKET_COUNT = 8,

You mentioned on the RFC that the size of bucket as 8 was arbitrary and that would require some benchmark testing to choose the value. Was that done? If not, should we add a comment here explaining and that this value could be replaced?

pkg/sql/catalog/systemschema/system.go, line 1852 at r1 (raw file):

	})
	// SQLStatementStatsTable is the descriptor for the SQL statement stats table.
	// It stores statistics for statistics for statement fingerprints.

nit: It stores statistics for statements fingerprints.

pkg/sql/catalog/systemschema/system.go, line 1952 at r1 (raw file):

	// SQLTransactionStatsTable is the descriptor for the SQL transaction stats
	// table. It stores statistics for statistics for transaction fingerprints.

nit: It stores statistics for transactions fingerprints.

Azhng

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kevin-v-ngo and @maryliag)

pkg/sql/catalog/systemschema/system.go, line 370 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

nit: I think those two tables should use plural, otherwise looks like is several stats for one statement/transaction. So sql_statements_stats and sql_transactions_stats
Also update in all references, e.g. SQLStatementsStatsTableSchema

I think currently our system view is named crdb_internal.node_statement_statistics. So it's slightly breaking the convention here if we named it plural. @kevin-v-ngo What do you think?

pkg/sql/catalog/systemschema/system.go, line 386 at r1 (raw file):

Previously, maryliag (Marylia Gutierrez) wrote…

You mentioned on the RFC that the size of bucket as 8 was arbitrary and that would require some benchmark testing to choose the value. Was that done? If not, should we add a comment here explaining and that this value could be replaced?

Done.

pbardea · 2021-05-18T22:14:34Z

(Bulk IO changes look like mainly plumbing which LGTM, so going to untag myself)

Azhng

Thank you. Who should I tag for changes related to system table schema?

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @kevin-v-ngo and @maryliag)

nihalpednekar · 2021-05-18T23:40:48Z

Thank you. Who should I tag for changes related to system table schema?

@rytaft might be a good reviewer to add, as a contributor to the file and pkg/sql/stats owner.

rytaft

although I haven't touched most of this code in several years. Might be good to also check with someone on the SQL schema team...

Reviewed 8 of 11 files at r1, 29 of 29 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @Azhng, @kevin-v-ngo, and @maryliag)

pkg/sql/catalog/bootstrap/metadata.go, line 352 at r2 (raw file):

	// Tables introduced in 21.2.
	target.AddDescriptor(keys.SystemDatabaseID, systemschema.SQLStatementStatsTable)

nit: add a blank line above to be consistent with previous versions

pkg/sql/catalog/systemschema/system.go, line 456 at r2 (raw file):

	singleID1              = []descpb.ColumnID{1}
	sqlStmtHashComputeExpr = "mod(fnv32(COALESCE(CAST(aggregated_ts AS STRING), '':::STRING)) + (fnv32(COALESCE(CAST(app_name AS STRING), '':::STRING)) + (fnv32(COALESCE(CAST(fingerprint_id AS STRING), '':::STRING)) + (fnv32(COALESCE(CAST(node_id AS STRING), '':::STRING)) + fnv32(COALESCE(CAST(plan_hash AS STRING), '':::STRING))))), 8:::INT8)"
	sqlTxnHashComputeExpr  = "mod(fnv32(COALESCE(CAST(aggregated_ts AS STRING), '':::STRING)) + (fnv32(COALESCE(CAST(app_name AS STRING), '':::STRING)) + (fnv32(COALESCE(CAST(fingerprint_id AS STRING), '':::STRING)) + fnv32(COALESCE(CAST(node_id AS STRING), '':::STRING)))), 8:::INT8)"

where did these expressions come from? A comment would help

pkg/sql/tests/system_table_test.go, line 172 at r2 (raw file):

		LocalOnlySessionData: sessiondata.LocalOnlySessionData{
			EnableUniqueWithoutIndexConstraints: true,
			HashShardedIndexesEnabled:           true,

instead of adding the sessionDataOverride, is there a reason you can't just update the session data in the test file and set this to true? Does it break anything?

Azhng

Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @maryliag and @rytaft)

pkg/sql/catalog/systemschema/system.go, line 456 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

where did these expressions come from? A comment would help

Done.

pkg/sql/tests/system_table_test.go, line 172 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

instead of adding the sessionDataOverride, is there a reason you can't just update the session data in the test file and set this to true? Does it break anything?

Done.

Azhng

Table names have been renamed to "system.statement_statistics" and "system.transaction_statistics" per feedback on the RFC.

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @maryliag and @rytaft)

rytaft

Might want to ask @ajwerner for a quick glance at the system table stuff.

Reviewed 27 of 27 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @maryliag)

ajwerner

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Azhng, @maryliag, and @rytaft)

pkg/sql/catalog/systemschema/system.go, line 422 at r4 (raw file):

    PRIMARY KEY (aggregated_ts, fingerprint_id, app_name, node_id)
      USING HASH WITH BUCKET_COUNT = 8,

I wonder if we're going to be horribly sad about this soon. @mgartner, IIUC, is going to look at switching hash sharded indexes to no longer be physically stored columns. It's a bummer that we need to create this extra hash column and makes the implementation somewhat messy. I worry we'll carry this vestige of the original implementation with us forever if we choose to use it now.

What was the back-of-the-envelope for write throughput of a cluster that convinced us that we needed to hash this index?

Azhng

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @maryliag, @mgartner, and @rytaft)

pkg/sql/catalog/systemschema/system.go, line 422 at r4 (raw file):

Previously, ajwerner wrote…

I wonder if we're going to be horribly sad about this soon. @mgartner, IIUC, is going to look at switching hash sharded indexes to no longer be physically stored columns. It's a bummer that we need to create this extra hash column and makes the implementation somewhat messy. I worry we'll carry this vestige of the original implementation with us forever if we choose to use it now.

What was the back-of-the-envelope for write throughput of a cluster that convinced us that we needed to hash this index?

Is the switch to no longer physically storing hash column going to happen this release? IIUC, we can always change the system table schema without worrying about migrations before the 21.2 happens.

I think the back-of-the-envelope for write throughput based on the data we have in snowflake for CC, the worst case would be 14 GB/hr for a 32 node cluster. Though it is one of the extreme cases. Most of the other clusters would generate tens to hundreds of MB of writes.

mgartner · 2021-05-25T16:39:15Z

pkg/sql/catalog/systemschema/system.go, line 422 at r4 (raw file):

Is the switch to no longer physically storing hash column going to happen this release?

We're exploring simplifying the hash-sharded index implementation this release by using virtual computed columns. But there are still some unanswered questions, like, does this even make sense given that the hash-sharded column has to be park of the PK.

But I don't expect the syntax for creating a hash-sharded index to change. What problems would arise if the underlying hash-sharded implementation changes if there's already this existing system table in a cluster?

ajwerner

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Azhng, @maryliag, @mgartner, and @rytaft)

pkg/sql/catalog/systemschema/system.go, line 422 at r4 (raw file):

But there are still some unanswered questions, like, does this even make sense given that the hash-sharded column has to be park of the PK.

In this case, you mean? Generally, the hash-sharded column does not need to be part of the primary key. You are right that if we keep it in the primary key, then it's a total wash.

I think the back-of-the-envelope for write throughput based on the data we have in snowflake for CC, the worst case would be 14 GB/hr for a 32 node cluster. Though it is one of the extreme cases.

That is a lot.

pkg/sql/catalog/systemschema/system.go, line 423 at r4 (raw file):

    PRIMARY KEY (aggregated_ts, fingerprint_id, app_name, node_id)
      USING HASH WITH BUCKET_COUNT = 8,
    INDEX "fingerprint_stats_idx" (fingerprint_id, aggregated_ts, app_name, node_id),

I'm somewhat suspect of this choice of primary and secondary index. It'd be nice to have some data generation and then some benchmarking. Changing this schema may be somewhat painful after it merges, at least to some extent. I worry that, in the long term, we'll wish that the hash-sharded index had small values and that the primary index were more scattered. Did you consider using fingerprint_stats_idx as the primary index and the current primary index as a secondary index? I think that'll allow for fast loading of a page for a given fingerprint and reduce the size of data that needs to be replicated to the hash sharded index by several orders of magnitude.

Azhng

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner, @maryliag, @mgartner, and @rytaft)

pkg/sql/catalog/systemschema/system.go, line 422 at r4 (raw file):

You are right that if we keep it in the primary key, then it's a total wash.

I'm not quite following this. Can you elaborate more?

That is a lot.

The write throughput scales based on unique number of fingerprints (~1 KiB per fingerprint) and number of nodes in the cluster. During the RFC discussion, knz pointed out that the performance sensitive clusters tend to tune their workload to have small number of unique number of fingerprints. This means in these clusters, we will be seeing a lot less unique fingerprints.

pkg/sql/catalog/systemschema/system.go, line 423 at r4 (raw file):

Did you consider using fingerprint_stats_idx as the primary index and the current primary index as a secondary index?

We did consider this as the original design. However, this table also will power the loading for the statement page, which displays aggregated execution statistics for all statement fingerprints for a given time interval. Since we considered (grep for 'statement page') loading statement page to be a more important use case compared to loading statistics for a single fingerprint, we chose the design of using hash sharded PK.

Changing this schema may be somewhat painful after it merges, at least to some extent

Even if we change prior to the 21.2 release ?

mgartner · 2021-05-25T17:28:39Z

pkg/sql/catalog/systemschema/system.go, line 422 at r4 (raw file):

In this case, you mean? Generally, the hash-sharded column does not need to be part of the primary key. You are right that if we keep it in the primary key, then it's a total wash.

For a hash-sharded primary index, the hash-sharded column is the first column in the PK, no?

ajwerner

Can I make a procedural recommendation here that may derail this stuff a tad? I just went through the RFC a bit and I think this project could really benefit from some go interfaces. I'm increasingly on board with this layout etc but think everything will fall into place better with some interface structure that we put in place first rather that last. I don't want to get too in your way here, I know this is way more than you bargained for.

I think we're in a bit of a mess right now with the in-memory implementation and how we're just injecting the entire *sql.Server into the status APIs and calling arbitrary methods.

A pattern that has been working well from my perspective lately is a set of low-dependency interfaces sitting in a package and some implementations sitting in separate subpackages. This way we can trivially do things like compose the retrieval and combination of persisted stats with in-memory stats without changing any clients. We can also test things about the implementation via the interface.

How do you feel about something like a stmtstats package which has interfaces like Reader and Writer that provide access to the various layers? Then we can compose some of these implementations, simplify clients, and dramatically simplify testing of different layers.

Maybe you've got a code structure plan in mind. I'd love to reach some alignment before we just glom more arbitrary functionality onto gigantic objects like *sql.Server.

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Azhng, @maryliag, and @rytaft)

pkg/sql/catalog/systemschema/system.go, line 422 at r4 (raw file):

In this case, you mean? Generally, the hash-sharded column does not need to be part of the primary key. You are right that if we keep it in the primary key, then it's a total wash.

For a hash-sharded primary index, the hash-sharded column is the first column in the PK, no?

Correct

pkg/sql/catalog/systemschema/system.go, line 423 at r4 (raw file):

Changing this schema may be somewhat painful after it merges, at least to some extent

Even if we change prior to the 21.2 release ?

Not too painful, just non-zero.

knz

👍 on Andrew's suggestion.

Reviewed 2 of 27 files at r3, 7 of 7 files at r4.
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Azhng and @maryliag)

Azhng

Discussed with @ajwerner offline and decided to move forward with the refactoring. Also filed a tracking issue

Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Azhng and @maryliag)

This PR creates two new system tables: system.statement_stats and system.transaction_stats per RFC cockroachdb#63752. This is the initial step that addresses cockroachdb#64743. Release note: None

Azhng · 2021-07-05T21:22:07Z

Rebased to resolve the merge conflict. See follow up PR #67090 for the SQL Stats flush mechanism. @ajwerner can you take another look.

ajwerner

Reviewed 3 of 31 files at r5.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @knz, @maryliag, and @rytaft)

Azhng · 2021-07-06T15:56:25Z

TFTR!

bors r+

craig · 2021-07-06T15:56:28Z

🕐 Waiting for PR status (Github check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set.

craig · 2021-07-06T16:28:13Z

GitHub status checks took too long to complete, so bors is giving up. You can adjust bors configuration to have it wait longer if you like.

Azhng · 2021-07-06T16:50:26Z

bors r+

craig · 2021-07-06T17:38:47Z

Build succeeded:

GitHub CI (Cockroach)

Azhng requested a review from a team May 18, 2021 01:29

Azhng changed the title ~~sql: create system tables for SQL stats.~~ sql: create system tables for SQL stats May 18, 2021

Azhng force-pushed the stats-sys-table branch 2 times, most recently from b9638ca to 285f2e5 Compare May 18, 2021 01:32

maryliag reviewed May 18, 2021

View reviewed changes

Azhng force-pushed the stats-sys-table branch from 285f2e5 to fc69990 Compare May 18, 2021 18:56

Azhng requested a review from a team May 18, 2021 18:56

Azhng requested a review from a team as a code owner May 18, 2021 18:56

Azhng requested review from nihalpednekar and removed request for a team May 18, 2021 18:56

Azhng commented May 18, 2021

View reviewed changes

nihalpednekar requested review from pbardea and removed request for nihalpednekar May 18, 2021 20:19

Azhng force-pushed the stats-sys-table branch from fc69990 to 7b5fb52 Compare May 18, 2021 20:39

pbardea removed their request for review May 18, 2021 22:14

Azhng commented May 18, 2021

View reviewed changes

nihalpednekar requested a review from rytaft May 18, 2021 23:41

rytaft approved these changes May 19, 2021

View reviewed changes

Azhng force-pushed the stats-sys-table branch from 7b5fb52 to 42f3ecb Compare May 19, 2021 22:26

Azhng commented May 19, 2021

View reviewed changes

Azhng requested a review from a team May 19, 2021 22:27

Azhng force-pushed the stats-sys-table branch 2 times, most recently from 3d53ca3 to 88026aa Compare May 20, 2021 00:44

rytaft approved these changes May 20, 2021

View reviewed changes

Azhng force-pushed the stats-sys-table branch from 88026aa to 7c26f95 Compare May 22, 2021 02:32

Azhng requested a review from ajwerner May 22, 2021 02:33

ajwerner reviewed May 24, 2021

View reviewed changes

Azhng commented May 25, 2021

View reviewed changes

ajwerner reviewed May 25, 2021

View reviewed changes

Azhng commented May 25, 2021

View reviewed changes

ajwerner reviewed May 25, 2021

View reviewed changes

knz reviewed May 26, 2021

View reviewed changes

Azhng commented May 26, 2021

View reviewed changes

Azhng mentioned this pull request Jun 30, 2021

sql: periodically flush sqlstats #67090

Merged

sql: create system tables for SQL stats

0a1e719

This PR creates two new system tables: system.statement_stats and system.transaction_stats per RFC cockroachdb#63752. This is the initial step that addresses cockroachdb#64743. Release note: None

Azhng force-pushed the stats-sys-table branch from 7c26f95 to 0a1e719 Compare July 5, 2021 21:20

ajwerner approved these changes Jul 6, 2021

View reviewed changes

Azhng mentioned this pull request Jul 6, 2021

sql: persistent SQL Stats main tracking issue #64743

Closed

24 tasks

craig bot merged commit 3fce327 into cockroachdb:master Jul 6, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql: create system tables for SQL stats #65374

sql: create system tables for SQL stats #65374

Azhng commented May 18, 2021

cockroach-teamcity commented May 18, 2021

maryliag left a comment

Azhng left a comment

pbardea commented May 18, 2021

Azhng left a comment

nihalpednekar commented May 18, 2021

rytaft left a comment

Azhng left a comment

Azhng left a comment

rytaft left a comment

ajwerner left a comment

Azhng left a comment

mgartner commented May 25, 2021

ajwerner left a comment

Azhng left a comment

mgartner commented May 25, 2021

ajwerner left a comment

knz left a comment

Azhng left a comment

Azhng commented Jul 5, 2021

ajwerner left a comment

Azhng commented Jul 6, 2021

craig bot commented Jul 6, 2021

craig bot commented Jul 6, 2021

Azhng commented Jul 6, 2021

craig bot commented Jul 6, 2021

sql: create system tables for SQL stats #65374

sql: create system tables for SQL stats #65374

Conversation

Azhng commented May 18, 2021

cockroach-teamcity commented May 18, 2021

maryliag left a comment

Choose a reason for hiding this comment

Azhng left a comment

Choose a reason for hiding this comment

pbardea commented May 18, 2021

Azhng left a comment

Choose a reason for hiding this comment

nihalpednekar commented May 18, 2021

rytaft left a comment

Choose a reason for hiding this comment

Azhng left a comment

Choose a reason for hiding this comment

Azhng left a comment

Choose a reason for hiding this comment

rytaft left a comment

Choose a reason for hiding this comment

ajwerner left a comment

Choose a reason for hiding this comment

Azhng left a comment

Choose a reason for hiding this comment

mgartner commented May 25, 2021

ajwerner left a comment

Choose a reason for hiding this comment

Azhng left a comment

Choose a reason for hiding this comment

mgartner commented May 25, 2021

ajwerner left a comment

Choose a reason for hiding this comment

knz left a comment

Choose a reason for hiding this comment

Azhng left a comment

Choose a reason for hiding this comment

Azhng commented Jul 5, 2021

ajwerner left a comment

Choose a reason for hiding this comment

Azhng commented Jul 6, 2021

craig bot commented Jul 6, 2021

craig bot commented Jul 6, 2021

Azhng commented Jul 6, 2021

craig bot commented Jul 6, 2021