cluster-ui: use backend-supplied regions in SQL Activity #98056

Closed
matthewtodd opened this issue Mar 6, 2023 · 0 comments · Fixed by #98410
Labels
A-sql-observability Related to observability of the SQL layer C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)

matthewtodd commented Mar 6, 2023

In #95449, we began populating a `regions` field in the `statement_statistics` tables to provide a stable historical view of the regions in which a statement was executed.

The frontend code currently derives regions at view time from the sibling `nodeIDs` field and a call to the `/_status/nodes` endpoint, which is insufficient for serverless given the ephemeral nature of the nodes. (That is, `/_status/nodes` will only return current, not historical, nodes; and node IDs can technically disappear from one region and reappear in another.)

The frontend code should instead remove all usages of the `nodeRegionsById` selector (and any unused supporting code) and switch to simply reading this new `regions` field.
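
To make the difference concrete, here is a minimal sketch of the two approaches. The `StatementStats` shape, the `NodeRegionsById` type, and both function names are hypothetical and only mirror the field names mentioned above; they are not the actual cluster-ui types or selectors.

```typescript
// Hypothetical shape of one statement's stats; field names follow the issue text.
interface StatementStats {
  nodeIDs: number[]; // IDs of the nodes that executed the statement
  regions: string[]; // backend-supplied regions (populated as of #95449)
}

// Assumed result of mapping the /_status/nodes response: node ID -> region.
type NodeRegionsById = Record<string, string>;

// Deduplicate and sort a list of region names.
function uniqueSorted(values: string[]): string[] {
  return values.filter((v, i) => values.indexOf(v) === i).sort();
}

// View-time derivation: only as complete as the *current* node list.
function regionsFromNodes(
  stats: StatementStats,
  nodeRegions: NodeRegionsById,
): string[] {
  const regions: string[] = [];
  for (const id of stats.nodeIDs) {
    const region = nodeRegions[String(id)];
    // Nodes that no longer exist are silently dropped here.
    if (region !== undefined) {
      regions.push(region);
    }
  }
  return uniqueSorted(regions);
}

// Proposed approach: read the stored field directly.
function regionsFromStats(stats: StatementStats): string[] {
  return uniqueSorted(stats.regions);
}

// Node 1 ran the statement in us-east1 but has since gone away.
const exampleStats: StatementStats = {
  nodeIDs: [1, 2],
  regions: ["us-east1", "us-west1"],
};
const currentNodes: NodeRegionsById = { "2": "us-west1" };

const derived = regionsFromNodes(exampleStats, currentNodes);
const stored = regionsFromStats(exampleStats);
```

In this made-up scenario the view-time derivation loses `us-east1` because node 1 is no longer reported, while the stored field still lists both regions.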

This is important for correctness when 23.1 is released to serverless, and I've been considering it a GA blocker for 23.1.

Jira issue: CRDB-25053

@matthewtodd matthewtodd added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-sql-observability Related to observability of the SQL layer T-sql-observability labels Mar 6, 2023
@matthewtodd matthewtodd self-assigned this Mar 6, 2023
craig bot pushed a commit that referenced this issue Mar 14, 2023
97138: ui: add error code to stmt and txn insights details pages r=gtr a=gtr

Part of: #87785.

Previously, the stmt and txn insights details pages did not show any
further information for failed executions. This commit adds an "error
code" column to the insights table for a failed execution in the stmt
and txn insights details pages. Additionally, a "status" column was
added to the stmt and txn workload insights tables which is either
"Completed" or "Failed".

Future work involves adding the error message string in addition to the
error code but it needs to be redacted first. Additionally, the txn
status is missing the implementation of a "Cancelled" status.

Note to reviewers: only consider the second commit, as the first is 
required to get the txn status.

- Loom [demo](https://www.loom.com/share/e82b97ff9f034d82b98640170eb54408).

Release note (ui change): Adds error code column to the insights table
for a failed execution in the stmt and txn insights details page. Adds
status column to the stmt and txn workload insights tables.

98410: cluster-ui: tenants use sqlstats-supplied regions r=matthewtodd a=matthewtodd

Fixes #98056.

As of #95449, the SQL Activity pages in the DB Console can draw regions information directly from the sqlstats tables, rather than having to translate node IDs to regions on page load.
    
Here, we make that switch, but for non-system tenants only, because:
    
1. The ephemeral nature of serverless nodes made this view-time mapping especially problematic in that context. (See further notes in #95449.)
    
2. The system-tenant views also include KV node IDs in a special Regions/Nodes column, which we are unable to recreate given the backend storage structure. (Future design work might suggest removing these node IDs altogether, for a unified UI.)
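
The split described in the two points above could look roughly like the sketch below. The `StatsRow` shape, the `regionsLabel` function, the `isTenant` flag, and the `region (n1,n2)` label format are all assumptions made for illustration; none of them are taken from the actual change.

```typescript
// Hypothetical row shape; field names follow the issue text.
interface StatsRow {
  nodeIDs: number[];
  regions: string[];
}

// Builds the text for the regions column.
// - Non-system tenants: read the sqlstats-supplied regions directly.
// - System tenant: keep the view-time mapping so node IDs can be shown too.
function regionsLabel(
  row: StatsRow,
  isTenant: boolean,
  nodeRegions: Record<string, string>,
): string {
  if (isTenant) {
    return row.regions.slice().sort().join(", ");
  }

  // Group the row's node IDs by their current region.
  const nodesByRegion: Record<string, number[]> = {};
  for (const id of row.nodeIDs) {
    const region = nodeRegions[String(id)];
    if (region === undefined) {
      continue; // node not currently known; nothing to show for it
    }
    if (nodesByRegion[region] === undefined) {
      nodesByRegion[region] = [];
    }
    nodesByRegion[region].push(id);
  }

  // Illustrative "Regions/Nodes" format: "us-east1 (n1,n2), us-west1 (n3)".
  return Object.keys(nodesByRegion)
    .sort()
    .map(region => {
      const nodes = nodesByRegion[region].map(n => "n" + n).join(",");
      return region + " (" + nodes + ")";
    })
    .join(", ");
}

const exampleRow: StatsRow = {
  nodeIDs: [1, 2, 3],
  regions: ["us-west1", "us-east1"],
};
const exampleNodeRegions: Record<string, string> = {
  "1": "us-east1",
  "2": "us-east1",
  "3": "us-west1",
};

const tenantLabel = regionsLabel(exampleRow, true, exampleNodeRegions);
const systemLabel = regionsLabel(exampleRow, false, exampleNodeRegions);
```

The tenant branch never consults the node list, which is the property that matters for serverless; the system-tenant branch still depends on it because the stored `regions` field carries no node IDs.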

# Screenshots!
## Statements, with and without regions filter
<img width="1372" alt="statements" src="https://user-images.githubusercontent.com/5261/225033247-739df90a-9173-4aab-a666-a61a1ceeb579.png">
<img width="1372" alt="statements - filtered" src="https://user-images.githubusercontent.com/5261/225033271-1c0d0f82-3dd4-48ea-bdef-11f19af97a85.png">

## Statement details
<img width="1372" alt="statement details" src="https://user-images.githubusercontent.com/5261/225033338-6dff4a6e-a4a3-48c6-863a-84f1375b0a61.png">

## Transactions, with and without regions filter
<img width="1372" alt="transactions" src="https://user-images.githubusercontent.com/5261/225033366-65f44e95-3549-47cc-b0f2-67ad48a1a1fa.png">
<img width="1372" alt="transactions - filtered" src="https://user-images.githubusercontent.com/5261/225033391-50b9a2dc-e9a1-457b-84b1-837426eba35e.png">

## Transaction details
<img width="1372" alt="transaction details" src="https://user-images.githubusercontent.com/5261/225033505-3fdeceef-35dc-4e06-af25-ab4d0c53518f.png">

Release note: None

98464: jobs,upgrades: add migration to backfill job_info table r=dt a=adityamaru

This change adds a migration and corresponding cluster version
after which every job entry in the system.jobs table will have its
Payload and Progress written to two rows in the system.job_info table.

Informs: #97762

Release note: None

98510: backupccl: update restore/nodeshutdown tests to use new roachtest framework r=adityamaru a=msbutler

The restore/nodeshutdown tests have been using a very old workload that will not be restorable when #93804 lands. This patch changes the restore/nodeshutdown workload to an 80GB tpce restore and moves the tests to run on AWS instead of GCP.

Release note: None

Epic: None

98579: upgrade/upgrades: skip TestUpgradeSchemaChangerElements r=smg260 a=smg260

Refs: #98062

Reason: flaky test

Generated by bin/skip-test.

Release justification: non-production code changes

Release note: None
Epic: None

Co-authored-by: gtr <[email protected]>
Co-authored-by: Matthew Todd <[email protected]>
Co-authored-by: adityamaru <[email protected]>
Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: Miral Gadani <[email protected]>
@craig craig bot closed this as completed in 7ea327a Mar 14, 2023