Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: optimize system.statement_statistics and system.transaction_statistics for console #98882

Closed
j82w opened this issue Mar 17, 2023 · 0 comments · Fixed by #100807 or #101916
Closed
Assignees
Labels
branch-master Failures and bugs on the master branch. branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) GA-blocker

Comments

@j82w
Copy link
Contributor

j82w commented Mar 17, 2023

Problem:
system.statement_statistics and system.transaction_statistics are large tables which can take any where from several seconds to minutes to get results. This is not the ideal user experience. We need a design that allows fast response even if the users has a million rows in the system tables.

Solution:
Upon investigating it was discovered that most users don't need all the stmts/txns. Instead they only need the top 100 by a few different key columns. To take advantage of this 2 new tables should be created. They should follow a similar column structure as the existing tables. They will add new columns with indexes for the most popularly sorted fields. A new job will be created that will run once an hour and copy the top 500 stmts/txns for each of the 5 most popular columns to the new table. This way the table will only have a max of 2500 rows for a given time interval which will allow for a fast response. This will also reduce the overhead on the cluster as it will not need to query the million rows in the system table.

Jira issue: CRDB-25566

@j82w j82w added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-sql-observability Related to observability of the SQL layer T-sql-observability labels Mar 17, 2023
@j82w j82w self-assigned this Mar 17, 2023
@maryliag maryliag added branch-master Failures and bugs on the master branch. release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. GA-blocker branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 labels Mar 18, 2023
craig bot pushed a commit that referenced this issue Mar 22, 2023
98709: sql: Add optional From clause for update help msg r=chengxiong-ruan a=lyang24

Fixes: #98662


Testing: 
<img width="397" alt="Screen Shot 2023-03-15 at 1 05 17 PM" src="https://user-images.githubusercontent.com/20375035/225430872-4e5f50f3-8af1-4fd8-8c76-073c2efe12b0.png">
tested on the demo sql shell, please let me know the places i can add additional tests.

Release note (sql change): fix helper message on update sql to include optional from cause.

98885: sql: add system.statement_activity and system.transaction_activity r=j82w a=j82w

The system.statement_activity and system.transaction_activity tables 
are being added to allow the ui to get fast and reliable results of the
most important statements and transactions.

A separate PR will add a new crdb_internal view to allow non-admins
to access the data, and another PR to add a job that will run once every 
hour to add the new data and cleanup old data.

Epic: none
Part of: #98882

Release note (sql): Add system.statement_activity
and system.transaction_activity. These tables will be used to populate 
the SQL Activity pages, and will contain the top N stmts/txns based on 
different key columns.

99001: logictest: don't use hardcoded ports for mixed version test r=rafiss a=rafiss

This should resolve flakes we've seen where the server attempts to bind a port that is already in use.

informs #95247
informs #98549
informs #98612
informs #98594

We won't know if it fixes them until after CI runs for a few days.

Release note: None

Co-authored-by: Eric.Yang <[email protected]>
Co-authored-by: j82w <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
blathers-crl bot pushed a commit that referenced this issue Mar 22, 2023
The system.statement_activity and system.transaction_activity tables
are being added to allow the ui to get fast and reliable results of the
most important statements and transactions.

A separate PR will add a new crdb_internal view to allow non-admins
to access the data, and another PR to add a job that will run once every
hour to add the new data and cleanup old data.

Epic: none
Part of: #98882

Release note (sql change): Add system.statement_activity
and system.transaction_activity. These tables will
be used to populate the SQL Activity pages, and
will contain the top N stmts/txns based on different
key columns.
@maryliag maryliag added T-cluster-observability and removed A-sql-observability Related to observability of the SQL layer T-sql-observability release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. labels Mar 27, 2023
@maryliag maryliag added release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. blocks-23.1.0-beta.1 labels Mar 27, 2023
craig bot pushed a commit that referenced this issue Mar 30, 2023
99735: sql: remove node_id column from system activity tables r=j82w a=j82w

The ui does not use the node id. Removing the node id aligns with the 
data necessary for the ui and reduces the amount of data being stored.

Epic: none
Part of: #98882

Release note: none

100111: kvserver: deflake `TestRejectedLeaseDoesntDictateClosedTimestamp` r=erikgrinaker a=erikgrinaker

And friends.

In 40cb075 we decoupled the lease duration from the Raft election timeout. Some tests still relied on the election timeout to control lease durations.

Resolves #98938.
Resolves #99639.

Epic: none
Release note: None

Co-authored-by: j82w <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
@maryliag maryliag removed release-blocker Indicates a release-blocker. Use with branch-release-2x.x label to denote which branch is blocked. blocks-23.1.0-beta.1 labels Mar 30, 2023
craig bot pushed a commit that referenced this issue Apr 20, 2023
100807: jobs: add job to populate system activity tables r=j82w a=j82w

The system statistics table grow to large for the
ui to quickly return results. The new activity
tables job does the aggregation and only records 
the top 500 of each of the 6 columns. This means 
for a given hour there is a limit of 3000 rows. 
This allows the ui to return results fast and reliably. 

If the job detects there are less than 3k
rows it will just copy all the rows to
the activity tables.

Epic: none
closes: #98882

Release note: none

101811: kv: deflake and unskip TestRangeLocalUncertaintyLimitAfterNewLease r=andrewbaptist a=nvanbenschoten

Fixes #99527.
Fixes #100491.

This commit deflakes and unskips TestRangeLocalUncertaintyLimitAfterNewLease. The test was flaky because it was pausing the clock for some but not all nodes in the cluster. This was because it was accidentally using a 3-node cluster instead of a 2-node cluster, like it was intending.

Release note: None

101890: go.mod: bump Pebble to 6002e39ce756, enable TestCreateCheckpoint_SpanConstrained r=RaduBerinde a=RaduBerinde

#### go.mod: bump Pebble to 6002e39ce756

6002e39c db: fix bug with restricted checkpoints
7fdab075 objstorage: add mutex in sharedReadable
4bfe993a objstorage: simplify ReadAt for objects
ed8f90b0 sstable: don't try to read past end of object
bd6e947b vfs: fix MemFS ReadAt EOF behavior
914e8582 objstorage: test shared objects with TestNotExistError
101876aa *: add skip-shared iteration mode to ScanInternal
d54c3292 objstorage: fix bug in manifest
20d15bd8 tool: add EnableSharedStorage method
60cfeb46 metamorphic: export package
4e468412 sstable: include SINGLEDELs within NumDeletions table property
691db988 db: create new type for file numbers on disk
fa2c2ec6 internal/base: add parsing for human-readable INGESTSST,LOGDATA kinds
c0ccd694 internal/base: correct InternalKeyKind prefix on IngestSST key kind

Release note: None
Epic: none

#### storage: enable TestCreateCheckpoint_SpanConstrained

The problem was fixed in Pebble.

Fixes: #100935

Release note: None


101942: kvserver: always bump epoch when acquiring expired epoch lease r=erikgrinaker a=erikgrinaker

Previously, a lease acquisition only bumped the epoch of an expired epoch leaseholder if the new lease was also an epoch lease. However, we have to bump the epoch regardless of the new lease type, to properly invalidate the old leaseholder's leases. This patch bumps the epoch regardless of the new lease type.

Resolves #101836.

Epic: none
Release note: None

Co-authored-by: j82w <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
@craig craig bot closed this as completed in 53afe66 Apr 20, 2023
craig bot pushed a commit that referenced this issue Apr 21, 2023
101916: sql: add crdb_internal views for system activity tables r=j82w a=j82w

This commit adds two new crdb_internal views:

crdb_internal.statement_activity, which surfaces the system.statement_activity table

crdb_internal.transaction_activity, which surfaces the system.transaction_statistics table

Epic: none
closes: #98882

Release note (sql change): Added two views to the crdb_internal catalog: crdb_internal.statement_activity, which surfaces data in the persisted system.statement_activity table, and crdb_internal.transaction_activity, which surfaces the system.transaction_activity table.

Co-authored-by: j82w <[email protected]>
craig bot pushed a commit that referenced this issue Apr 21, 2023
102021: sql: fix crdb_internal statment_activity and transaction_activity tables r=j82w a=j82w

The virtual table is selecting the wrong columns 
from the system table causing it to fail. This fixes 
the column names and adds tests.

Release note: None
Part of #98882
Epic: None

Co-authored-by: j82w <[email protected]>
blathers-crl bot pushed a commit that referenced this issue May 10, 2023
The virtual table is selecting the wrong columns from the system
table causing it to fail. This fixes the column names and adds tests.

Release note: None
Part of #98882
Epic: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
branch-master Failures and bugs on the master branch. branch-release-23.1 Used to mark GA and release blockers, technical advisories, and bugs for 23.1 C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) GA-blocker
Projects
None yet
3 participants