-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stmtdiagnostics: implement range feed on system.statement_diagnostics_requests to reduce latency #47893
Comments
@andreimatei and @ajwerner discussed this quite a bit during the initial implementation. The gossip solution is seen as temporary indeed. |
Yep, this is on my list. |
Worst case we disable the gossip without anything better initially and clients will be exposed to a bit of extra latency when requesting statement diagnostics. |
@ajwerner were you planning on touching this (I assume not anytime soon) and is this good enough for now as-is? We are nominally considering this a blocker still but it doesn't seem to be. |
As you touch this issue, please also put it in the appropriate project. I would put it in SQL-Execution but I don't know if they own this. |
Do we need to ensure that the gossip isn't called or does it just no-op? If it no-ops then we're good and don't need to do anything. We've not exposing the statements page for tenants right now anyway and even if we were, they poll periodically so it's just higher latency |
From my reading of the stmtdiag code, it will return an error (unsupported w/ multi-tenancy) which will be returned to the requester of the stmt diag report. This is fine, so no need to do anything right now. |
To fix this, we can use a range feed on the dignostic requests table. |
We have libraries now to make this pretty plug-n-play: cockroach/pkg/kv/kvclient/rangefeed/rangefeedcache/watcher.go Lines 35 to 64 in 0bd68bd
This guy for example maintains such a feed (in the tenant pod, over a tenant system table): cockroach/pkg/server/systemconfigwatcher/cache.go Lines 29 to 33 in 69e48ba
|
I added the label because I see this in the code: cockroach/pkg/sql/sqltestutils/telemetry.go Line 107 in 451d761
|
I don't know if that code is referencing the correct issue. If not, feel free to create a separate issue for tracking the skipped test. |
I see, thanks. This issue originally has been about the support of stmt diagnostics in secondary tenants, but #83547 added the support with some caveats, so this issue was repurposed to be about optimizing the stmt diagnostics feature. I'll remove that skip. |
107493: ui,build: push cluster-ui assets into external folder during watch mode r=nathanstilwell a=sjbarag Previously, watch mode builds of cluster-ui (e.g. 'dev ui watch' or 'pnpm build:watch') would emit files only to pkg/ui/workspaces/cluster-ui/dist. Using that output in a watch task of a private repo required setting up symlinks via a 'make' task[1]. Unfortunately, that symlink made it far too easy for the node module resolution algorithm in that private repo to follow the symlink back to cockroach.git, which gave that project access to the modules in pkg/ui/node_modules/ and pkg/ui/workspaces/cluster-ui/node_modules. This resulted in webpack finding multiple copies of react-router (which expects to be a singleton), typescript finding multiple incompatible versions of react, etc. Unfortunately, webpack doesn't support multiple output directories natively. Add a custom webpack plugin that copies emitted files to an arbitrary number of output directories. [1] pnpm link doesn't work due to some package-name aliasing we've got going on there. Release note: None Epic: none 107555: sql: remove stale skip in TestTelemetry r=yuzefovich a=yuzefovich This commit unskips multiple telemetry tests that were skipped for no good reason (they were referencing an unrelated issue). This uncovered some bugs in the new schema changer telemetry reporting where we duplicated `_index` twice in the feature counter for inverted indexes. Also, `index` telemetry test contained an invalid statement which is now removed. The only file that is still skipped is `sql-stats` where the output doesn't match the expectations, and I'm not sure whether the test is stale or something is broken, so a separate issue was filed. Addresses: #47893. Epic: None. Release note: None 107597: builtins: force production values in TestSerialNormalizationWithUniqueUnorderedID r=yuzefovich a=yuzefovich We've observed that if `batch-bytes-limit` value is set too low, then the "key counts" query in this test takes much longer (on my laptop it was 60s for a particular random seed vs 2.4s with production values), so this commit forces some production values. Fixes: #106829. Release note: None Co-authored-by: Sean Barag <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]>
The issue about implementing the rangefeed on reduce latency is still present. |
We should implement the range feed on
system.statement_diagnostics_requests
table in order to remove the "polling" that currently happens every 10s. See this comment for some pointers.Jira issue: CRDB-4382
Epic CRDB-18185
The text was updated successfully, but these errors were encountered: