jobs: avoid CTEs in crdb_internal.system_jobs query #123848
it looks like the roundtrips went down, but the latency went up. perhaps the query plan is actually less computationally efficient, even though it performs less I/O? i'd guess there's some threshold at which latency is actually better the new way, and maybe in a multiregion cluster it could matter even more.
Yeah, I think where we are hoping this helps substantially is in the case where we have a job_type or job_status filter that filters out a high percentage of rows. I'll add a test that captures whether it does or not. It seems to in some local testing.
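To make the two query shapes concrete, here is a minimal sketch of a CTE-based query next to an equivalent direct join with a selective status filter. Everything in it is an assumption for illustration: the `jobs` and `job_info` tables are simplified, hypothetical stand-ins, the queries are not the actual crdb_internal.system_jobs query, and SQLite is used only so the sketch runs anywhere. SQLite's optimizer differs from CockroachDB's, so this shows the shape of the rewrite and that both forms return the same rows, not the performance difference.

```python
import sqlite3

# Hypothetical, simplified stand-ins for the jobs and job_info tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT, job_type TEXT);
    CREATE TABLE job_info (job_id INTEGER, info_key TEXT, value TEXT);
""")

# 100 jobs; every tenth job is 'running', so a status filter is selective.
conn.executemany(
    "INSERT INTO jobs VALUES (?, ?, ?)",
    [(i, "succeeded" if i % 10 else "running", "BACKUP" if i % 2 else "IMPORT")
     for i in range(1, 101)],
)
conn.executemany(
    "INSERT INTO job_info VALUES (?, 'payload', ?)",
    [(i, f"payload-{i}") for i in range(1, 101)],
)

# Shape 1: job_info is read through a CTE, and the filter sits outside it.
WITH_CTE = """
    WITH payloads AS (
        SELECT job_id, value FROM job_info WHERE info_key = 'payload'
    )
    SELECT j.id, j.status, p.value
    FROM jobs AS j
    JOIN payloads AS p ON p.job_id = j.id
    WHERE j.status = ?
    ORDER BY j.id
"""

# Shape 2: the same logic as a direct join, with no CTE in between.
WITHOUT_CTE = """
    SELECT j.id, j.status, p.value
    FROM jobs AS j
    JOIN job_info AS p ON p.job_id = j.id AND p.info_key = 'payload'
    WHERE j.status = ?
    ORDER BY j.id
"""

def run(query, status):
    """Run one of the two query shapes with the given status filter."""
    return conn.execute(query, (status,)).fetchall()

cte_rows = run(WITH_CTE, "running")
join_rows = run(WITHOUT_CTE, "running")
```

Both forms return the same 10 rows here; whether the second form is actually faster depends on the optimizer and on table statistics, which is what the benchmarks in this PR are meant to measure.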
Part of what is going on here is that the current tests are using query plans that are created in the absence of stats. If I add an I'll add some commits to update the existing tests to better reflect reality so we can compare against that. My belief here is that we should really only see much of a benefit here for queries that have a filter.
I pushed some commits that look at the queries in the presence of stats, and now the diff matches my expectations a lot better. Namely, queries that still have to scan most of the job_info rows aren't much better, but queries with high selectivity in their WHERE clause get a good deal better:
The CTE in the query used for crdb_internal.system_jobs can prevent a number of useful query optimizations.

Informs cockroachdb#122687

Release note (performance improvement): Further improves the performance of job-system related queries.

Epic: None
Release note: None
@dt I've removed the deeper backport tags from this given our uncertainty about how well this query fares in the absence of query statistics in older releases. Note that I think this probably accounts for most of the variance I was seeing when trying to convert this to a virtual view. So, once this is in, I think it makes sense to rebase that PR on top of this and see if the virtual view no longer produces a regression.
bors r=dt