Add ability to cancel statement diagnostics from the frontend #74226
Comments
Ran into a problem while working on this issue. Currently, the backend cancellation API uses a statement fingerprint string provided by the frontend in a query to determine the corresponding diagnostic request ID that needs to be cancelled/expired. The frontend can sometimes provide a statement fingerprint that contains placeholders, which come in the form of single-quoted underscores and make the fingerprint-based lookup unreliable. Alternatively, the frontend already has the diagnostic request ID and can pass it directly to the backend, removing the need to pass a statement fingerprint string and to retrieve the ID via a query altogether. I'm not certain of the amount of change this would require, but from a cursory look I think it would be relatively simple. cc @yuzefovich for your thoughts on this
I didn't realize that we already had the request ID - using that to cancel on the backend would actually be preferable over the current implementation. I too think it'll be a pretty small change.
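A minimal sketch of what cancelling by request ID could look like from the frontend, as discussed above. This is not the actual cluster-ui or status-server API surface: the endpoint path, payload shapes, and function name are illustrative assumptions, and the real frontend would go through its generated protobuf client rather than raw fetch.

```ts
// Fingerprint-based shape: the backend must run a query to resolve the
// pending request ID, and placeholder underscores in the fingerprint can
// make that lookup unreliable.
interface CancelByFingerprintPayload {
  statementFingerprint: string;
}

// ID-based shape: the frontend already has the diagnostics request ID from
// the requests it rendered, so it can pass that ID straight through.
interface CancelByIdPayload {
  requestId: number;
}

// Hypothetical helper; endpoint path is an assumption for illustration only.
export function cancelStatementDiagnosticsRequest(
  payload: CancelByIdPayload,
): Promise<void> {
  return fetch("/_status/stmtdiagreports/cancel", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  }).then(res => {
    if (!res.ok) {
      throw new Error(`cancel request failed: ${res.status}`);
    }
  });
}
```

With this shape, the backend no longer needs to parse or match a fingerprint string at all; it can expire the single row identified by the ID.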
74816: bulk,kv: create initial splits in imports and backfills r=dt a=dt

It has been observed that during a backfill or import, the split-as-we-go behavior used by the sst batcher, where it only explicitly splits and scatters after sending some amount of data all to one range, does not send any splits in the steady state when importing out-of-order data: since the data is out of order, after chunking it by destination range, no one chunk is large enough to be considered as filling that range enough to cause us to send a split. Of course, the range may fill on its own, as many processors send it small chunks over and over and those add up, but no one processor knows when that happens, so we simply leave it up to the range to split itself when it is full, as it would in any other heavy-write-rate, non-bulk ingestion case.

However, we've seen this lead to hotspots during bulk ingestion. Until the range becomes full and decides to split, it can be a bottleneck as many processors send it data at once. Existing mechanisms for kv-driven load-based splitting are not tuned for ingestion workloads. When the range does decide to split and rebalance, that is a more expensive operation, as we now need to read, move, and ingest half of that *full* range that we just spent so much work sending data to and ingesting, all while it is still being sent more ingestion load.

Ideally, we'd prefer to split the span before we start ingesting, both to better spread ingestion load over our available capacity and to reduce how much ingested data would need to be shuffled if we waited to split and scatter only once a range filled. By scattering first, we'll just ingest directly to the right place initially. However, a challenge in doing this is that in many cases we don't know what data we'll be ingesting until we start reading input and producing it -- it could be that we're working off of a key-ordered CSV, or it could be that we have a uniform random distribution. In some cases, such as an index backfill, we may be able to use external information like SQL statistics or a sampling scan of the table to derive a good partitioning, but in others, like IMPORTs or view materialization, we have no way to know what the data will be until we start producing it.

This change tries to do a bit better than not presplitting at all, by instead using the first buffer's worth of data, if it was read out of sorted order, as if it were a representative sample of the rest of the data to come, then generating splits from it and splitting and scattering the target spans using those splits before proceeding to flush. If this buffer turns out not to be a good sample after all, this is no worse than before, when we did not pre-split at all. If the input data was sorted, this step is unnecessary, as we'll just split as we go, after each range we fill, and scatter the empty remainder before moving on to fill it, as we already do today.

Release note (performance improvement): IMPORTs and index creation attempt to better balance their expected data between nodes before ingesting data.

75709: cmd/dev: check if doctor needs to be run r=dt a=dt

This adds a check to all non-`doctor` dev commands that `dev doctor` has previously succeeded, so that in those commands we can assume all of the state checked or set up by `dev doctor` is correct. If the definition of `doctor` is updated, this check can also be used to force a re-run of doctor to pick up the new version and its new logic.

This is implemented by persisting the "version" of doctor that most recently succeeded in a file (bin/.dev-status) and then comparing that to the current dev version. This also adds "setup" as an alias for "doctor", as it is now expected to be run after the initial clone to "set up" the dev environment (i.e. write the status file, potentially set up githooks, etc.).

Release note: none.

75733: ui: add ability to cancel statement diagnostics to the frontend r=THardy98 a=THardy98

Previously, there was no option to cancel a running request for statement diagnostics; this change provides one. The statements page now has an ellipsis dropdown button where the download bundle links used to appear. The dropdown contains a cancellation button (if there is a currently waiting diagnostic request) and the download links to previously completed diagnostic requests (a sketch of the dropdown's decision logic appears after these summaries). The statement details page now has a "Cancel request" button while a diagnostic request is not yet completed (i.e. in the "Waiting" status). The statement diagnostics history page on db-console also has a "Cancel Request" button while a diagnostic request is not yet completed (i.e. in the "Waiting" status).

Also included is a small code change in the statement diagnostic "Activate" modal: the expiry time picker allowed values that were <1, and this change fixes that bug.

Original PR with discussions: #74818
Resolves: #74226

**CC Console Changes**: https://www.loom.com/share/36da7fa1554b45b9adac288bc3a7c23e
^ slight lag on the button updates, which kind of highlights how notifications/alerts would be nice

**DB Console Changes**: https://www.loom.com/share/f76694148f6f4fcdb129665eb519eb4b

**Note**: this change is dependent on the merge of its [corresponding backend changes](#75732) first.

Release note (ui change): Added an option to cancel a running request for statement diagnostics.

Co-authored-by: David Taylor <[email protected]>
Co-authored-by: Thomas Hardy <[email protected]>
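A rough sketch of the decision the new ellipsis dropdown makes, per the 75733 description above: offer a "Cancel request" item while a diagnostics request is still waiting, and download links for completed bundles. This is not the actual cluster-ui component; the type names, field names, and function name here are assumptions for illustration.

```ts
interface DiagnosticsReport {
  id: number;
  completed: boolean;
  statementFingerprint: string;
  bundleDownloadUrl?: string; // present once a bundle has been collected
}

interface DropdownItem {
  label: string;
  onClick: () => void;
}

export function buildDiagnosticsDropdownItems(
  reports: DiagnosticsReport[],
  onCancel: (requestId: number) => void,
  onDownload: (url: string) => void,
): DropdownItem[] {
  const items: DropdownItem[] = [];

  // If there is a currently waiting (not yet completed) request, lead the
  // dropdown with the cancellation action, passing the request ID directly.
  const waiting = reports.find(r => !r.completed);
  if (waiting) {
    items.push({
      label: "Cancel request",
      onClick: () => onCancel(waiting.id),
    });
  }

  // Completed requests keep their download links, as they had before the
  // dropdown existed.
  for (const r of reports) {
    if (r.completed && r.bundleDownloadUrl) {
      items.push({
        label: `Download bundle (${r.statementFingerprint})`,
        onClick: () => onDownload(r.bundleDownloadUrl!),
      });
    }
  }
  return items;
}
```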
Is your feature request related to a problem? Please describe.
Currently, we do not have any way to cancel already created statement diagnostics from the frontend console.
Describe the solution you'd like
We would like to add the functionality to cancel statement diagnostic requests that are still waiting on a result.
The backend API for this has already been created in #73863.
Additional context
Figma mocks: https://www.figma.com/file/xdmwvnFQd6KkO9RJ0XLDH0/22.1_SQL-obsrv_query-performance?node-id=5178%3A142577
Epic: CRDB-6619
Jira issue: CRDB-11994