Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: improve query plan for find-running-job-of-type query #107589

Merged
merged 1 commit into from
Oct 9, 2023

Conversation

mgartner
Copy link
Collaborator

@mgartner mgartner commented Jul 26, 2023

This commit improves the query plan for the find-running-jobs-of-type
query by adding a hint to use the jobs_status_created_idx and by
removing an ORDER BY clause if a job ID to ignore was not given. This
can eliminate an index join from the query plan in some cases, making
the query plan more efficient.

Informs #107405

Release note: None

@mgartner mgartner requested review from dt and rytaft July 26, 2023 00:01
@mgartner mgartner requested review from a team as code owners July 26, 2023 00:01
@blathers-crl
Copy link

blathers-crl bot commented Jul 26, 2023

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@mgartner mgartner requested a review from a team July 26, 2023 00:01
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@mgartner
Copy link
Collaborator Author

Some unanswered questions (perhaps someone has some insight to provide here):

  1. How often is this query run without an ignoreJobID. It seems like it could be fairly often for the pathological case described in sql: do not collect stats so frequently for small tables with lots of updates #107559 which motivated sql: optimize query for find-running-jobs-of-type #107405.
  2. Do we really need to ignore all jobs after the ignoreJobID job? What is the reason for this?
  3. In the context of auto-stats collections, an ignoreJobID is only provided in the createStatsResumer.Resume method. Why does this need to ignore the job's ID - if the job is already running shouldn't we avoid resuming it? If so, then we can remove this ignoreJobID entirely for auto-stats jobs.

@mgartner mgartner requested a review from DrewKimball July 26, 2023 00:05
Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not totally convinced the plan here will be better - what if the full-table scan increases contention between auto-stats jobs and other job types? Plus, since the scan is ordered by unique_rowid I guess we expect to find the right row ~halfway through? So we can't expect the limit to be satisfied quickly.

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @dt and @rytaft)

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Some unanswered questions (perhaps someone has some insight to provide here):

  1. How often is this query run without an ignoreJobID. It seems like it could be fairly often for the pathological case described in sql: do not collect stats so frequently for small tables with lots of updates #107559 which motivated sql: optimize query for find-running-jobs-of-type #107405.

If a stats job is never running when we try to create a new auto stats job, this query will be run without an ignoreJobID half the time. Otherwise, if there is often a conflicting stats job, the query will be run more frequently without an ignoreJobID. I'm not sure the case mentioned in #107559 makes as big of a difference as you'd think, as those jobs typically finish very quickly.

  1. Do we really need to ignore all jobs after the ignoreJobID job? What is the reason for this?

The reason is that if multiple stats jobs get started at once, we want exactly one of them to win, and we want to cancel the other jobs. This mechanism with ignoreJobID and the ORDER BY created LIMIT 1 clause is how we ensure that happens. This is needed because of a limitation in the way the jobs system works (or at least the way it worked when I originally wrote this code) -- you can't conditionally start a job, so there is a race condition where multiple stats jobs can start at the same time. If we added support for conditionally starting a job, we wouldn't need to query the jobs table twice for each stats job.

  1. In the context of auto-stats collections, an ignoreJobID is only provided in the createStatsResumer.Resume method. Why does this need to ignore the job's ID - if the job is already running shouldn't we avoid resuming it? If so, then we can remove this ignoreJobID entirely for auto-stats jobs.

I don't think I understand this question. createStatsResumer.Resume is the method used for running the job for the first time as well as running it after it was paused. Maybe my answer to the previous question helps...

I'm still not totally convinced the plan here will be better - what if the full-table scan increases contention between auto-stats jobs and other job types? Plus, since the scan is ordered by unique_rowid I guess we expect to find the right row ~halfway through? So we can't expect the limit to be satisfied quickly.

How could the plan be worse? We still have a limit clause, we're just removing the ORDER BY restriction.

Reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @dt and @mgartner)


-- commits line 7 at r1:
nit: query play -> query plan

@DrewKimball
Copy link
Collaborator

How could the plan be worse? We still have a limit clause, we're just removing the ORDER BY restriction.

We switch from using a constrained scan + index-join to a full scan.

@rytaft
Copy link
Collaborator

rytaft commented Jul 26, 2023

Maybe this depends on the stats on the jobs table (since we are now collecting stats)?

@rytaft
Copy link
Collaborator

rytaft commented Jul 26, 2023

Oh I guess it's because of the soft limit

@DrewKimball
Copy link
Collaborator

Removing the index-join and replacing with a full-scan was the point of the optimization since an index-join over many rows takes a lot of CPU - I'm just not sure it's a strict improvement.

@DrewKimball
Copy link
Collaborator

I notice the old version of this query didn't filter by the job type - it fetched all live jobs ordered by creation time. If going from that to the current formulation increased CPU, I think that supports the hypothesis that filtering by live jobs before job type will reduce the rows scanned, at least for this pathological case.

Copy link
Contributor

@HonoreDB HonoreDB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When there's no job id to ignore, maybe it'd be more efficient to do

SELECT true WHERE EXISTS (SELECT * FROM system.jobs WHERE 	job_type IN (...) AND
  status IN (...);

(I don't actually know if there's anything this lets the optimizer do that it can't just do with a LIMIT 1 without an ORDER.)

Reviewed all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @dt and @mgartner)

@mgartner
Copy link
Collaborator Author

Some additional context that I should have mentioned: The goal of this PR is to be a backportable performance improvement for this query. The best way to improve the query plan is to add a new index to the system.jobs table, but we don't have a way to backport new indexes to system tables.

I'm still not totally convinced the plan here will be better - what if the full-table scan increases contention between auto-stats jobs and other job types?

Good point. In theory the optimizer doesn't have to pick a full table scan - it could still use the same index, though that would require an index join still. But in v23.1, I think the optimizer will only pick a full table scan because we don't collect stats on the system.jobs table, and histograms would be required to get a selectivity accurate enough to determine the number of jobs that match the given job_type.

So, maybe there's nothing we can backport here, and v23.1 is stuck with this inefficiency?

@rytaft
Copy link
Collaborator

rytaft commented Jul 26, 2023

 But in v23.1, I think the optimizer will only pick a full table scan because we don't collect stats on the system.jobs table

I don't think this is accurate -- I believe we collect stats on system.jobs in 23.1. But we might still choose a full table scan because of the soft limit hint.

You could always add a query hint if it would help...

@mgartner
Copy link
Collaborator Author

I don't think this is accurate -- I believe we collect stats on system.jobs in 23.1. But we might still choose a full table scan because of the soft limit hint.

Yes, found the PR: #102637. So maybe this would be an ok change then?

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well it looks like #102637 was not actually successful since I failed to remove this check:

if tableDesc.GetID() == keys.JobsTableID {
return nil, pgerror.New(
pgcode.WrongObjectType, "cannot create statistics on system.jobs",
)
}

I'll open another PR to make that change on master. At this point it's too late to backport. 🤦‍♀️

So I think this change to remove ORDER BY when it's not needed is a good, backportable change, but I also think we should put a hint in for @jobs_status_created_idx so that we go back to using the job status rather than job type to constrain the scan (as @DrewKimball suggested above).

Just so we're all clear on what the plans look like:

With ORDER BY and hint:

EXPLAIN SELECT id FROM system.jobs@jobs_status_created_idx WHERE job_type IN ('CREATE STATS','AUTO CREATE STATS') AND status IN ('running','pending','cancel-requested','pause-requested','reverting','paused') ORDER BY created LIMIT 1;
                                                                                   info

\---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

  distribution: local

  vectorized: true

  

  • top-k

  │ order: +created

  │ k: 1

  │

  └── • filter

      │ filter: job\_type IN ('AUTO CREATE STATS', 'CREATE STATS')

      │

      └── • index join

          │ table: jobs@primary

          │

          └── • scan

                missing stats

                table: jobs@jobs\_status\_created\_idx

                spans: \[/'cancel-requested' - /'cancel-requested'\] \[/'pause-requested' - /'pause-requested'\] \[/'paused' - /'paused'\] \[/'pending' - /'pending'\] … (2 more)

(17 rows)

With hint but without ORDER BY:

EXPLAIN SELECT id FROM system.jobs@jobs_status_created_idx WHERE job_type IN ('CREATE STATS','AUTO CREATE STATS') AND status IN ('running','pending','cancel-requested','pause-requested','reverting','paused') LIMIT 1;
                                                                                   info

\---------------------------------------------------------------------------------------------------------------------------------------------------------------------------

  distribution: local

  vectorized: true

  

  • limit

  │ count: 1

  │

  └── • filter

      │ filter: job\_type IN ('AUTO CREATE STATS', 'CREATE STATS')

      │

      └── • index join

          │ table: jobs@primary

          │

          └── • scan

                missing stats

                table: jobs@jobs\_status\_created\_idx

                spans: \[/'cancel-requested' - /'cancel-requested'\] \[/'pause-requested' - /'pause-requested'\] \[/'paused' - /'paused'\] \[/'pending' - /'pending'\] … (2 more)

(16 rows)

These plans are better than the current ones without the hint.
With ORDER BY and without hint:

EXPLAIN SELECT id FROM system.jobs WHERE job_type IN ('CREATE STATS','AUTO CREATE STATS') AND status IN ('running','pending','cancel-requested','pause-requested','reverting','paused') ORDER BY created LIMIT 1;
                                                      info

\----------------------------------------------------------------------------------------------------------------

  distribution: local

  vectorized: true

  

  • top-k

  │ order: +created

  │ k: 1

  │

  └── • filter

      │ filter: status IN ('cancel-requested', 'pause-requested', 'paused', 'pending', 'reverting', 'running')

      │

      └── • index join

          │ table: jobs@primary

          │

          └── • scan

                missing stats

                table: jobs@jobs\_job\_type\_idx

                spans: \[/'AUTO CREATE STATS' - /'AUTO CREATE STATS'\] \[/'CREATE STATS' - /'CREATE STATS'\]

(17 rows)

Without ORDER BY and without hint:

EXPLAIN SELECT id FROM system.jobs WHERE job_type IN ('CREATE STATS','AUTO CREATE STATS') AND status IN ('running','pending','cancel-requested','pause-requested','reverting','paused') LIMIT 1;
                                                                                   info

\--------------------------------------------------------------------------------------------------------------------------------------------------------------------------

  distribution: local

  vectorized: true

  

  • limit

  │ count: 1

  │

  └── • filter

      │ filter: (job\_type IN ('AUTO CREATE STATS', 'CREATE STATS')) AND (status IN ('cancel-requested', 'pause-requested', 'paused', 'pending', 'reverting', 'running'))

      │

      └── • scan

            missing stats

            table: jobs@primary

            spans: FULL SCAN (SOFT LIMIT)

(13 rows)

Regarding @HonoreDB's suggestion:

When there's no job id to ignore, maybe it'd be more efficient to do

SELECT true WHERE EXISTS (SELECT * FROM system.jobs WHERE 	job_type IN (...) AND
 status IN (...);

(I don't actually know if there's anything this lets the optimizer do that it can't just do with a LIMIT 1 without an ORDER.)

Unfortunately this doesn't improve the plan.
With hint:

EXPLAIN SELECT true WHERE EXISTS (SELECT * FROM system.jobs@jobs_status_created_idx WHERE job_type IN ('CREATE STATS','AUTO CREATE STATS') AND status IN ('running','pending','cancel-requested','pause-requested','reverting','paused'));
                                                                                                                   info

\------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

  distribution: local

  vectorized: true

  

  • root

  │

  ├── • render

  │   │

  │   └── • filter

  │       │ estimated row count: 0

  │       │ filter: COALESCE(@S1, false)

  │       │

  │       └── • emptyrow

  │

  └── • subquery

      │ id: @S1

      │ original sql: (SELECT \* FROM system.jobs@jobs\_status\_created\_idx WHERE (job\_type IN ('CREATE STATS', 'AUTO CREATE STATS')) AND (status IN ('running', 'pending', 'cancel-requested', 'pause-requested', 'reverting', 'paused')))

      │ exec mode: one row

      │

      └── • render

          │

          └── • limit

              │ count: 1

              │

              └── • filter

                  │ filter: job\_type IN ('AUTO CREATE STATS', 'CREATE STATS')

                  │

                  └── • index join

                      │ table: jobs@primary

                      │

                      └── • scan

                            missing stats

                            table: jobs@jobs\_status\_created\_idx

                            spans: \[/'cancel-requested' - /'cancel-requested'\] \[/'pause-requested' - /'pause-requested'\] \[/'paused' - /'paused'\] \[/'pending' - /'pending'\] … (2 more)

(33 rows)

Without hint:

EXPLAIN SELECT true WHERE EXISTS (SELECT * FROM system.jobs WHERE job_type IN ('CREATE STATS','AUTO CREATE STATS') AND status IN ('running','pending','cancel-requested','pause-requested','reverting','paused'));
                                                                                                       info

\------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

  distribution: local

  vectorized: true

  

  • root

  │

  ├── • render

  │   │

  │   └── • filter

  │       │ estimated row count: 0

  │       │ filter: COALESCE(@S1, false)

  │       │

  │       └── • emptyrow

  │

  └── • subquery

      │ id: @S1

      │ original sql: (SELECT \* FROM system.jobs WHERE (job\_type IN ('CREATE STATS', 'AUTO CREATE STATS')) AND (status IN ('running', 'pending', 'cancel-requested', 'pause-requested', 'reverting', 'paused')))

      │ exec mode: one row

      │

      └── • render

          │

          └── • limit

              │ count: 1

              │

              └── • filter

                  │ filter: (job\_type IN ('AUTO CREATE STATS', 'CREATE STATS')) AND (status IN ('cancel-requested', 'pause-requested', 'paused', 'pending', 'reverting', 'running'))

                  │

                  └── • scan

                        missing stats

                        table: jobs@primary

                        spans: FULL SCAN (SOFT LIMIT)

(30 rows)

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @dt and @mgartner)

@mgartner
Copy link
Collaborator Author

So I think this change to remove ORDER BY when it's not needed is a good, backportable change, but I also think we should put a hint in for @jobs_status_created_idx so that we go back to using the job status rather than job type to constrain the scan (as @DrewKimball suggested #107589 (comment)).

The jobs.RunningJobExists function is also used by the span config manager here:

exists, err := jobs.RunningJobExists(ctx, jobspb.InvalidJobID, txn, m.settings.Version,

I'm a bit worried to add a hint in that case, so maybe I'll only add the hint for auto-stats jobs since we know there should only ever be 1 running auto-stats jobs. Does that sound good to you?

@mgartner
Copy link
Collaborator Author

Hmm, actually, can we be sure that the number of running jobs will also be sufficiently low for the hint to be optimal? I'm not confident in that.

@DrewKimball
Copy link
Collaborator

Hmm, actually, can we be sure that the number of running jobs will also be sufficiently low for the hint to be optimal? I'm not confident in that.

That's a good point... the current plan at least ensures that contention should be limited to the auto-stats rows - what if scanning the live jobs first instead impacts other types of jobs?

@rytaft
Copy link
Collaborator

rytaft commented Jul 27, 2023

The hint is consistent with the fact that in versions prior to 23.1, we always used only the job status to constrain the scan. We were also intending to use the hint with 23.1 until I thought that I had enabled stats here: a6e2818

@stevendanna
Copy link
Collaborator

Re the discussion around stats collection on system.jobs? The code here:

if tableDesc.GetID() == keys.JobsTableID {
return nil, pgerror.New(
pgcode.WrongObjectType, "cannot create statistics on system.jobs",
)
}
if tableDesc.GetID() == keys.ScheduledJobsTableID {
return nil, pgerror.New(
pgcode.WrongObjectType, "cannot create statistics on system.scheduled_jobs",
)
}

produces a failure if the table ID matches the jobs table. In a local cluster I see this in the logs:

209:W230801 15:46:15.757700 1703 sql/stats/automatic_stats.go:896 ⋮ [T1,n1] 135  failed to create statistics on table 15: ‹create-stats›: cannot create statistics on system.jobs

and still don't have stats after a pretty long time:

> SHOW STATISTICS FOR TABLE system.jobs;                                                                                                                                                                                      
SHOW STATISTICS 0

Perhaps we need to remove the above code to make sure we get stats on this table.

@rytaft
Copy link
Collaborator

rytaft commented Aug 3, 2023

Re the discussion around stats collection on system.jobs? The code here:

if tableDesc.GetID() == keys.JobsTableID {
return nil, pgerror.New(
pgcode.WrongObjectType, "cannot create statistics on system.jobs",
)
}
if tableDesc.GetID() == keys.ScheduledJobsTableID {
return nil, pgerror.New(
pgcode.WrongObjectType, "cannot create statistics on system.scheduled_jobs",
)
}

produces a failure if the table ID matches the jobs table. In a local cluster I see this in the logs:

209:W230801 15:46:15.757700 1703 sql/stats/automatic_stats.go:896 ⋮ [T1,n1] 135  failed to create statistics on table 15: ‹create-stats›: cannot create statistics on system.jobs

and still don't have stats after a pretty long time:

> SHOW STATISTICS FOR TABLE system.jobs;                                                                                                                                                                                      
SHOW STATISTICS 0

Perhaps we need to remove the above code to make sure we get stats on this table.

Hi @stevendanna -- that is correct, as noted in this comment: #107589 (review)

I'm planning to open a PR to fix it this week.

@stevendanna
Copy link
Collaborator

@rytaft 🤦 D'oh, don't know how I missed that, apologies!

@rytaft
Copy link
Collaborator

rytaft commented Aug 3, 2023

No worries! Appreciate the intent to help!

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opened #108139 to fix the stats collection issue. We should either backport that PR with this one unchanged, or else change this PR to include the query hint mentioned above.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @dt and @mgartner)

@mgartner
Copy link
Collaborator Author

mgartner commented Oct 4, 2023

Ok so it sounds like the remaining TODO here is to add a hint to this query and then we can merge this and potentially close #107405.

@DrewKimball
Copy link
Collaborator

It'd be even better if we could change the schema for 23.2, but we might be running up too close to the release for that now.

@mgartner
Copy link
Collaborator Author

mgartner commented Oct 5, 2023

It'd be even better if we could change the schema for 23.2, but we might be running up too close to the release for that now.

What's the proposal for changing the schema? I've lost it in the comments above.

We are short on time, but maybe someone can pick this up?

@DrewKimball
Copy link
Collaborator

I don't know if we really considered it because we were thinking about a backport, but since there's an index join and non-inlined filter, we could make the index covering and add status.

This commit improves the query plan for the `find-running-jobs-of-type`
query by adding a hint to use the `jobs_status_created_idx` and by
removing an `ORDER BY` clause if a job ID to ignore was not given. This
can eliminate an index join from the query plan in some cases, making
the query play more efficient.

Informs cockroachdb#107405

Release note: None
@mgartner mgartner force-pushed the jobs-remove-order-by branch from 488d74a to 4945cf0 Compare October 9, 2023 17:26
@mgartner mgartner requested review from DrewKimball and a team October 9, 2023 17:27
@mgartner
Copy link
Collaborator Author

mgartner commented Oct 9, 2023

I've added the hint as suggested. PTAL!

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @dt and @rytaft)


-- commits line 7 at r1:

Previously, rytaft (Rebecca Taft) wrote…

nit: query play -> query plan

(this typo is still there)

@mgartner
Copy link
Collaborator Author

mgartner commented Oct 9, 2023

(this typo is still there)

Fixed.

TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Oct 9, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Oct 9, 2023

Build succeeded:

@craig craig bot merged commit 6de91e3 into cockroachdb:master Oct 9, 2023
@mgartner mgartner deleted the jobs-remove-order-by branch October 10, 2023 12:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants