Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cluster-ui: move database details endpoint to use sql-over-http #93209

Merged
merged 3 commits into from
Mar 15, 2023

Conversation

THardy98
Copy link

@THardy98 THardy98 commented Dec 7, 2022

Part of: #89429
Addresses: #90261

This change moves the existing database details endpoint to use sql-over-http.

The data for database details is populated using 7 queries, which fetch
the database's:

  • ID
  • grants
  • tables
  • replicas and regions
  • index usage statistics
  • zone configuration
  • span statistics

This PR also changes the structure of the database details response.
Each field within the overlying database details response encapsulates
the response of one of the queries. Query failures are captured at the
top-level of the response and within the corresponding response field
for the query (scoping query failures to each transaction).

Additionally, we no longer fetch missing_tables as we are no longer fetching table stats per table, but in a single query.

DEMO
https://www.loom.com/share/3a51bf5a8f3c41089fc5acae9ff29fcf

Release note: None

@THardy98 THardy98 requested a review from a team December 7, 2022 19:06
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@THardy98 THardy98 force-pushed the sql_api_database_details branch from d2b6166 to 867fe2f Compare December 7, 2022 19:26
@THardy98 THardy98 marked this pull request as draft December 19, 2022 14:06
@THardy98
Copy link
Author

Converted to draft due to upcoming changes for SHOW RANGES: #93644

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice.

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @THardy98)


pkg/ui/workspaces/cluster-ui/src/api/databaseDetailsApi.ts line 175 at r1 (raw file):

): DatabaseDetailsQuery<DatabaseRangesRow> => {
  const stmt: SqlStatement = {
    sql: `SELECT

if you want this to be robust and version-independent, you could expand it into a join across crdb_internal.tables, using crdb_internal.table_span, then join on crdb_internal.ranges_no_leases, then call crdb_internal.range_stats manually to get the size.

or use SHOW RANGES FROM DATABASE when it's there.


pkg/ui/workspaces/cluster-ui/src/api/util.ts line 50 at r1 (raw file):

//   "region=us-east3,az=d"
// ]
export function parseReplicaLocalities(localities: string[]): string[] {

Ask Eric to teach you about SQL unnest and split_part, he got to it recently and this seems to be relevant here.

@THardy98 THardy98 force-pushed the sql_api_database_details branch from 867fe2f to 229d2e7 Compare January 12, 2023 21:25
Copy link
Author

@THardy98 THardy98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


pkg/ui/workspaces/cluster-ui/src/api/databaseDetailsApi.ts line 175 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

if you want this to be robust and version-independent, you could expand it into a join across crdb_internal.tables, using crdb_internal.table_span, then join on crdb_internal.ranges_no_leases, then call crdb_internal.range_stats manually to get the size.

or use SHOW RANGES FROM DATABASE when it's there.

Followed your first suggestion. Let me know if this is accurate to what you had in mind. Particularly uncertain with joining on the range's start key with crdb_internal.table_span(t.table_id)[1].

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @THardy98)


pkg/ui/workspaces/cluster-ui/src/api/databaseDetailsApi.ts line 175 at r1 (raw file):

Previously, THardy98 (Thomas Hardy) wrote…

Followed your first suggestion. Let me know if this is accurate to what you had in mind. Particularly uncertain with joining on the range's start key with crdb_internal.table_span(t.table_id)[1].

yeah that condition isn't sufficient.
Run SHOW RANGES FROM DATABASE xxx WITH EXPLAIN in a shell and look at the generated SQL. It will tell you what you need.


pkg/ui/workspaces/cluster-ui/src/api/util.ts line 50 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

Ask Eric to teach you about SQL unnest and split_part, he got to it recently and this seems to be relevant here.

Still applicable.

@THardy98 THardy98 force-pushed the sql_api_database_details branch 2 times, most recently from 3239dda to b2f9c30 Compare January 13, 2023 16:47
Copy link
Author

@THardy98 THardy98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


pkg/ui/workspaces/cluster-ui/src/api/databaseDetailsApi.ts line 175 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

yeah that condition isn't sufficient.
Run SHOW RANGES FROM DATABASE xxx WITH EXPLAIN in a shell and look at the generated SQL. It will tell you what you need.

Used the join condition from the EXPLAIN, joining on table_spans and the start/end key conditionals between table_spans and ranges_no_leases.

Aside, is calling range_stats on just the range's start key correct to get the range's statistics?


pkg/ui/workspaces/cluster-ui/src/api/util.ts line 50 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

Still applicable.

Using unnest and split_part to parse regions, return as an ARRAY.

@THardy98 THardy98 force-pushed the sql_api_database_details branch from b2f9c30 to 647d2fe Compare January 13, 2023 17:01
@THardy98 THardy98 requested a review from knz January 16, 2023 14:24
Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 2 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @THardy98)


pkg/ui/workspaces/cluster-ui/src/api/databaseDetailsApi.ts line 175 at r1 (raw file):

Aside, is calling range_stats on just the range's start key correct to get the range's statistics?

yes.
But it gives you range statistics. Not table statistics.
So this means this way to gather size information is incorrect if your goal is to get per-table size estimates. See: https://cockroachlabs.atlassian.net/browse/CRDB-22711


pkg/ui/workspaces/cluster-ui/src/api/databaseDetailsApi.ts line 187 at r4 (raw file):

            coalesce((crdb_internal.range_stats(s.start_key)->>'range_val_bytes')::INT, 0) AS range_size
          FROM crdb_internal.tables as t
          JOIN ${dbName}.crdb_internal.table_spans as s ON s.descriptor_id = t.table_id

I'm not familiar with this syntax ${dbName}. What does it do?

Copy link
Author

@THardy98 THardy98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


pkg/ui/workspaces/cluster-ui/src/api/databaseDetailsApi.ts line 175 at r1 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

Aside, is calling range_stats on just the range's start key correct to get the range's statistics?

yes.
But it gives you range statistics. Not table statistics.
So this means this way to gather size information is incorrect if your goal is to get per-table size estimates. See: https://cockroachlabs.atlassian.net/browse/CRDB-22711

Yes, but as I understand it, there currently is no alternative to get per-table size estimates. Consequently, I decided to follow the approach on the existing structured endpoint. I suppose the alternative is to simply omit this data but not sure if that would reflect well for UX if we remove metrics that existed on former versions.


pkg/ui/workspaces/cluster-ui/src/api/databaseDetailsApi.ts line 187 at r4 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

I'm not familiar with this syntax ${dbName}. What does it do?

The ${...} syntax is shorthand for string formatting, it adds the variable string value dbName to the query string. Similar in function to fmt.Sprintf. With that said, I feel like this isn't particularly safe practice, but I'm not sure how to achieve the same thing without string formatting (as SQL arguments in this case do not work).

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @j82w and @THardy98)


pkg/ui/workspaces/cluster-ui/src/api/databaseDetailsApi.ts line 175 at r1 (raw file):

Previously, THardy98 (Thomas Hardy) wrote…

Yes, but as I understand it, there currently is no alternative to get per-table size estimates. Consequently, I decided to follow the approach on the existing structured endpoint. I suppose the alternative is to simply omit this data but not sure if that would reflect well for UX if we remove metrics that existed on former versions.

Yes it's ok to keep this as-is for now but with the knowledge it's broken. I encourage you to try this manually (make two tables: one with little/no data and the other one with more data, e.g. via generate_series, and then look at the result) and report what you find to your team for tracking.


pkg/ui/workspaces/cluster-ui/src/api/databaseDetailsApi.ts line 187 at r4 (raw file):

Previously, THardy98 (Thomas Hardy) wrote…

The ${...} syntax is shorthand for string formatting, it adds the variable string value dbName to the query string. Similar in function to fmt.Sprintf. With that said, I feel like this isn't particularly safe practice, but I'm not sure how to achieve the same thing without string formatting (as SQL arguments in this case do not work).

It's super important that you figure this out. Also ensure there is a unit test that uses a database name containing a space.
I know that @j82w looked at this area recently.

I suspect we already have something elsewhere in the UI code; it's should be a common problem? Worth asking on obs-infra.

Copy link
Author

@THardy98 THardy98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @j82w)


pkg/ui/workspaces/cluster-ui/src/api/databaseDetailsApi.ts line 187 at r4 (raw file):

Previously, knz (Raphael 'kena' Poss) wrote…

It's super important that you figure this out. Also ensure there is a unit test that uses a database name containing a space.
I know that @j82w looked at this area recently.

I suspect we already have something elsewhere in the UI code; it's should be a common problem? Worth asking on obs-infra.

Talked with Jake about it and opened discussion for this on the obs-infra channel.

@THardy98 THardy98 force-pushed the sql_api_database_details branch 3 times, most recently from 826e464 to 57a5b4f Compare February 13, 2023 14:34
Copy link
Author

@THardy98 THardy98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @j82w and @knz)


pkg/ui/workspaces/cluster-ui/src/api/databaseDetailsApi.ts line 187 at r4 (raw file):

Previously, THardy98 (Thomas Hardy) wrote…

Talked with Jake about it and opened discussion for this on the obs-infra channel.

Ported the safesql package from managed-service to cluster-ui (#96286), using it to safely quote identifiers.

@THardy98 THardy98 force-pushed the sql_api_database_details branch 5 times, most recently from e51da02 to 0566eef Compare February 15, 2023 16:29
@THardy98 THardy98 force-pushed the sql_api_database_details branch 2 times, most recently from beed3af to d359688 Compare February 24, 2023 21:16
@THardy98 THardy98 force-pushed the sql_api_database_details branch 2 times, most recently from 4647493 to ec52ed3 Compare March 14, 2023 14:10
@THardy98
Copy link
Author

THardy98 commented Mar 14, 2023

pkg/ui/workspaces/cluster-ui/src/api/databaseDetailsApi.ts line 326 at r14 (raw file):

Previously, THardy98 (Thomas Hardy) wrote…
Is it guaranteed that there will only be 1 row? There is no loop and it's only accessing the first element here:

resp.stats.ranges_data.regions = txn_result.rows[0].regions;
resp.stats.ranges_data.node_ids = txn_result.rows[0].replicas;

This is what it looks like for the rides table on demo with --nodes 9 --global

[email protected]:26257/movr> select * from crdb_internal.table_spans where descriptor_id = (SELECT 'rides'::regclass::oid);
  descriptor_id | start_key | end_key
----------------+-----------+----------
            108 | \xf4      | \xf5
(1 row)

Time: 203ms total (execution 203ms / network 0ms)

[email protected]:26257/movr> select replica_localities from crdb_internal.ranges_no_leases as r where '\xf4' < r.end_key AND '\xf5' > r.start_key;
                              replica_localities
------------------------------------------------------------------------------
  {"region=us-east1,az=d","region=us-west1,az=a","region=europe-west1,az=c"}
  {"region=us-east1,az=d","region=us-west1,az=a","region=europe-west1,az=d"}
  {"region=us-east1,az=d","region=us-west1,az=a","region=europe-west1,az=b"}
  {"region=us-east1,az=b","region=us-west1,az=a","region=europe-west1,az=b"}
  {"region=us-east1,az=d","region=us-west1,az=b","region=europe-west1,az=b"}
  {"region=us-east1,az=d","region=us-west1,az=b","region=europe-west1,az=d"}
  {"region=us-east1,az=d","region=us-west1,az=b","region=europe-west1,az=b"}
  {"region=us-east1,az=c","region=us-west1,az=b","region=europe-west1,az=c"}
  {"region=us-east1,az=b","region=us-west1,az=b","region=europe-west1,az=c"}
(9 rows)

Time: 206ms total (execution 205ms / network 0ms)

[email protected]:26257/movr> WITH
                        ->           replicasAndregions as (
                        ->               SELECT
                        ->                 r.replicas,
                        ->                 ARRAY(SELECT DISTINCT split_part(split_part(unnest(replica_localities),',',1),'=',2)) as regions
                        ->               FROM crdb_internal.tables as t
                        ->                      JOIN movr.crdb_internal.table_spans as s ON s.descriptor_id = t.table_id
                        ->              JOIN crdb_internal.ranges_no_leases as r ON s.start_key < r.end_key AND s.end_key > r.start_key
                        ->            WHERE t.database_name = 'movr'
                        ->           ),
                        ->           unique_replicas AS (SELECT array_agg(distinct(unnest(replicas))) as replicas FROM replicasAndRegions),
                        ->           unique_regions AS (SELECT array_agg(distinct(unnest(regions))) as regions FROM replicasAndRegions)
                        ->           SELECT replicas, regions FROM unique_replicas CROSS JOIN unique_regions;
       replicas       |             regions
----------------------+-----------------------------------
  {1,6,7,3,8,4,9,2,5} | {us-east1,us-west1,europe-west1}

@THardy98
Copy link
Author

THardy98 commented Mar 14, 2023

I've removed the code that adds the VIEWACTIVITY and VIEWACTIVITYREDACTED permissions to ranges_no_leases in a separate PR with a backport: #98535

@THardy98
Copy link
Author

TYFR :)
going to bors this!

@THardy98
Copy link
Author

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 14, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Mar 14, 2023

Build failed (retrying...):

@THardy98
Copy link
Author

bors r-

@craig
Copy link
Contributor

craig bot commented Mar 14, 2023

Canceled.

@THardy98
Copy link
Author

bors r+

@THardy98
Copy link
Author

bors r-

@craig
Copy link
Contributor

craig bot commented Mar 14, 2023

Canceled.

Thomas Hardy added 2 commits March 14, 2023 21:51
This change establishes preliminary work to move the existing database
details endpoint to use sql-over-http, making it tenant-scoped.
Currently, this change is missing a couple existing fields:
- `approximate_disk_bytes`, used to display the size of the database
- `node_ids`, used to display the nodes that the database uses

The data for database details is populated using 5 queries, which fetch
the database's:
- ID
- grants
- tables
- range statistics
- index usage statistics

This PR also changes the structure of the database details response.
Each field within the overlying database details response encapsulates
the response of one of the queries. Query failures are captured at the
top-level of the response and within the corresponding response field
for the query (scoping query failures to each transaction).

Release note: None
This change establishes preliminary work to move the existing database
details endpoint to use sql-over-http, making it tenant-scoped.
Currently, this change is missing a couple existing fields:
- `approximate_disk_bytes`, used to display the size of the database
- `node_ids`, used to display the nodes that the database uses

The data for database details is populated using 5 queries, which fetch
the database's:
- ID
- grants
- tables
- range statistics
- index usage statistics

This PR also changes the structure of the database details response.
Each field within the overlying database details response encapsulates
the response of one of the queries. Query failures are captured at the
top-level of the response and within the corresponding response field
for the query (scoping query failures to each transaction).

Release note: None
@THardy98
Copy link
Author

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 15, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Mar 15, 2023

Build failed:

This change moves the existing database
details endpoint to use sql-over-http.

The data for database details is populated using 7 queries, which fetch
the database's:
- ID
- grants
- tables
- replicas and regions
- index usage statistics
- zone configuration
- span statistics

This PR also changes the structure of the database details response.
Each field within the overlying database details response encapsulates
the response of one of the queries. Query failures are captured at the
top-level of the response and within the corresponding response field
for the query (scoping query failures to each transaction).

Release note: None
@THardy98 THardy98 force-pushed the sql_api_database_details branch from ec52ed3 to 51ccd38 Compare March 15, 2023 19:15
@THardy98
Copy link
Author

bors r+

@craig
Copy link
Contributor

craig bot commented Mar 15, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Mar 15, 2023

Build succeeded:

@craig craig bot merged commit 77a55f9 into cockroachdb:master Mar 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants