Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

release-24.1: sql: implement generic query plans #128085

Conversation

mgartner
Copy link
Collaborator

@mgartner mgartner commented Aug 1, 2024

Backport:

Please see individual PRs for details.

/cc @cockroachdb/release


Release justification: New feature gated behind session setting.

The `plan_cache_mode` session setting has been added. It allows for
three options: `force_custom_plan`, `force_generic_plan`, and `auto`.
The session currently has no effect. In future commits it will control
how the system chooses between using a custom or generic query plan.

Release note: None
The `ConvertSelectWithPlaceholdersToJoin` exploration rule has been
added which enables optimizations to apply to query plans where
placeholder values are not known. See the rule's comments for more
details. The rule does not currently apply to any query plans because it
only matches Select expressions with filters that have placeholders, and
we currently replace all placeholders with constant values before
exploration. It will be used in the future when we enable exploration of
generic query plans.

Release note: None
…ldersToJoin

Join reordering and merge join exploration is now disabled for joins
created by the `ConvertSelectWithPlaceholdersToJoin` rule. The rule's
main goal is to assist in generating lookup joins. This changes reduces
unnecessary exploration.

Release note: None
This commit adds some stats tests with queries containing placeholders.

Release note: None
@mgartner mgartner requested review from a team and DrewKimball and removed request for a team and DrewKimball August 1, 2024 14:35
@mgartner mgartner requested review from a team as code owners August 1, 2024 14:35
@mgartner mgartner requested review from michae2 and xinhaoz and removed request for a team August 1, 2024 14:35
Copy link

blathers-crl bot commented Aug 1, 2024

Thanks for opening a backport.

Please check the backport criteria before merging:

  • Backports should only be created for serious
    issues
    or test-only changes.
  • Backports should not break backwards-compatibility.
  • Backports should change as little code as possible.
  • Backports should not change on-disk formats or node communication protocols.
  • Backports should not add new functionality (except as defined
    here).
  • Backports must not add, edit, or otherwise modify cluster versions; or add version gates.
  • All backports must be reviewed by the owning areas TL. For more information as to how that review should be conducted, please consult the backport
    policy
    .
If your backport adds new functionality, please ensure that the following additional criteria are satisfied:
  • There is a high priority need for the functionality that cannot wait until the next release and is difficult to address in another way.
  • The new functionality is additive-only and only runs for clusters which have specifically “opted in” to it (e.g. by a cluster setting).
  • New code is protected by a conditional check that is trivial to verify and ensures that it only runs for opt-in clusters. State changes must be further protected such that nodes running old binaries will not be negatively impacted by the new state (with a mixed version test added).
  • The PM and TL on the team that owns the changed code have signed off that the change obeys the above rules.
  • Your backport must be accompanied by a post to the appropriate Slack
    channel (#db-backports-point-releases or #db-backports-XX-X-release) for awareness and discussion.

Also, please add a brief release justification to the body of your PR to justify this
backport.

@blathers-crl blathers-crl bot added the backport Label PR's that are backports to older release branches label Aug 1, 2024
Copy link

blathers-crl bot commented Aug 1, 2024

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@mgartner
Copy link
Collaborator Author

mgartner commented Aug 1, 2024

I know this PR was big. I thought backporting everything in bulk would be nice, but I'm happy to backport one PR at a time if that's easier on everyone else.

@mgartner mgartner force-pushed the backport24.1-125708-125795-125808-125871-126528-126863-125814-127214-127897 branch from ee979ce to cd8ecaa Compare August 2, 2024 17:43
@mgartner mgartner requested a review from a team as a code owner August 2, 2024 17:43
@rytaft
Copy link
Collaborator

rytaft commented Aug 5, 2024

Great work!! I didn't review all the code in detail since I think the original PR was reviewed carefully, but I did add a couple suggestions for things that might need a setting.

@mgartner
Copy link
Collaborator Author

mgartner commented Aug 5, 2024

@rytaft I don't see your suggestions. Did you forget to publish them?

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops yes -- published now

Reviewed 8 of 8 files at r1, 7 of 7 files at r2, 5 of 5 files at r3, 1 of 1 files at r4, 6 of 6 files at r5, 1 of 1 files at r6, 2 of 2 files at r7, 1 of 1 files at r8, 23 of 23 files at r9, 7 of 7 files at r10, 1 of 1 files at r11, 5 of 5 files at r12, 17 of 17 files at r13, 3 of 3 files at r14, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @xinhaoz)


pkg/sql/opt/memo/statistics_builder.go line 3401 at r6 (raw file):

	// Special case: a placeholder equality filter.
	if col, ok := isPlaceholderEqualityFilter(filter.Condition); ok {

Although this looks like a big improvement, I'm always hesitant to backport any sort of stats change without a setting


pkg/sql/opt/xform/generic_funcs.go line 38 at r2 (raw file):

	filters memo.FiltersExpr,
) (values memo.RelExpr, newFilters memo.FiltersExpr, ok bool) {
	// Collect all the placeholders in the filters.

should we add a check here that returns false if we are forcing custom plans?

@mgartner mgartner requested a review from rytaft August 5, 2024 20:23
Copy link
Collaborator Author

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @michae2, @rytaft, and @xinhaoz)


pkg/sql/opt/memo/statistics_builder.go line 3401 at r6 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

Although this looks like a big improvement, I'm always hesitant to backport any sort of stats change without a setting

I thought about that, but this stats code should never encounter a placeholder unless a generic query plan is being explored. With the default plan_cache_mode=force_custom_plan, all placeholders should be replaced with constant expressions before exploration. So plan_cache_mode is effectively gating this change (besides the code in isPlaceholderEqualityFilter, but that's fairly simple). Does that change your mind on this?


pkg/sql/opt/xform/generic_funcs.go line 38 at r2 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

should we add a check here that returns false if we are forcing custom plans?

I don't think it's necessary for filters with placeholders - we should never encounter those in non-generic plan exploration (see my other comment). But we might encounter those in stable expressions. I'll work on a PR and tack it on here.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @michae2, and @xinhaoz)


pkg/sql/opt/memo/statistics_builder.go line 3401 at r6 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I thought about that, but this stats code should never encounter a placeholder unless a generic query plan is being explored. With the default plan_cache_mode=force_custom_plan, all placeholders should be replaced with constant expressions before exploration. So plan_cache_mode is effectively gating this change (besides the code in isPlaceholderEqualityFilter, but that's fairly simple). Does that change your mind on this?

Sounds good, thanks for the explanation


pkg/sql/opt/xform/generic_funcs.go line 38 at r2 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

I don't think it's necessary for filters with placeholders - we should never encounter those in non-generic plan exploration (see my other comment). But we might encounter those in stable expressions. I'll work on a PR and tack it on here.

sounds good, thanks!

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Nice work! This is a pretty heroic backport. Just one question that occurred to me this time around.

Reviewed 8 of 8 files at r1, 7 of 7 files at r2, 5 of 5 files at r3, 1 of 1 files at r4, 6 of 6 files at r5, 1 of 1 files at r6, 2 of 2 files at r7, 1 of 1 files at r8, 23 of 23 files at r9, 7 of 7 files at r10, 1 of 1 files at r11, 5 of 5 files at r12, 17 of 17 files at r13, 3 of 3 files at r14, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @michae2)


pkg/sql/plan_opt.go line 575 at r9 (raw file):

		if err != nil {
			return nil, err
		} else if !isStale {

Would it ever be possible to have a stale fully optimized plan and non-stale generic plan? Or similar for a generic plan and a normalized plan.

@mgartner
Copy link
Collaborator Author

I've added one of the commits from #129050.

@mgartner
Copy link
Collaborator Author

pkg/sql/plan_opt.go line 575 at r9 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Would it ever be possible to have a stale fully optimized plan and non-stale generic plan? Or similar for a generic plan and a normalized plan.

Yes, I think both are possible. I intended to handle these cases with this logic. Is there something wrong you see?

@rytaft rytaft requested a review from DrewKimball August 28, 2024 20:45
Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 10 of 10 files at r15, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @michae2)

Stats estimates for filters of the form `col = $1` have been improved.
While histograms cannot be filtered without knowing the placeholder
value, we do know that `col` is limited to a one distinct value. This
improves the estimate over the simple 1/3 "unknown selectivity" that was
applied for these filters before this commit.

This changed required a modification to `TestCopyAndReplace` because it
made it such that a merge-join was no longer selected when optimizing
the query with placeholders. Now the test checks that `CopyAndReplace`
works correctly when copying a fully optimized memo with a parameterized
lookup-join. I've confirmed that the test still catches the bug
fixed in the original PR that added the test, cockroachdb#35020.

Release note: None
Tests for generic query plans with partial indexes have been added. No
code changes are needed in order for a query with a filter like
`col = $1` to utilize an index on `(col) WHERE col IS NOT NULL`.

Release note: None
The logic for picking a prepared memo has been moved to a function,
`fetchPreparedMemoLegacy`. This makes it easy in the following commit
to fallback to this logic based on the `plan_cache_mode` setting.

Release note: None
The `force_generic_plan` option for the session setting
`plan_cache_mode` has been implemented. Under this setting, prepared
statements will reuse a fully optimized, generic query plan, if a
non-stale one exists. If such a plan does not exist or is stale, a new
generic query plan will be built.

The term "generic query plan" is used in user-facing contexts to
represent any plan that is fully optimized without knowledge of specific
placeholder values nor folding stable expressions. Therefore, all of the
following are considered "generic":

  1. A plan for a query with no placeholders nor fold-able stable
     expressions.
  2. A plan utilizing the placeholder fast-path.
  3. A plan fully optimized without replacing placeholders with values.

However, only (3) is currently stored in the
`PreparedStatement.GenericMemo` field. This was a deliberate decision
that allows falling back to the original prepared statement logic when
switching between `force_custom_plan` and `force_generic_plan`. This
will make backports with this commit less risky. My goal, in a future
commit, is to put all generic query plans in the `GenericMemo` field to
reduce the confusion between the user-facing definition of "generic" and
the internal logic.

This commit does not change the behavior of the query cache. Fully
optimized plans for statements without placeholders or placeholder
fast-path plans are still cached in the query cache. Generic query plans
optimized in the presence of placeholders are not cached.

Release note (sql change): The session setting
`plan_cache_mode=force_generic_plan` can now be used to force prepared
statements to use query plans that are optimized once and reused in
future executions without re-optimization, as long as it does not become
stale due to schema changes or a collection of new table statistics. The
setting takes effect during `EXECUTE` commands. `EXPLAIN ANALYZE`
includes a "plan type" field. If a generic query plan is optimized for
the current execution, the "plan type" will be "generic, re-optimized".
If a generic query plan is reused for the current execution without
performing optimization, the "plan type" will be "generic, reused".
Otherwise, the "plan type" will be "custom".
The `plan_cache_mode=auto` session setting has been implemented which
instructs the database to pick generic query plans for prepared
statements when they have a cost less than or equal to custom plans.

The current strategy matches Postgres's strategy quite closely. Under
`plan_cache_mode=auto`, custom query plans will be generated for the
first five executions of a prepared statement. On the sixth execution, a
generic query plan will be generated. If the cost of the generic query
plan is less than the average cost of the custom plans (plus some
overhead for re-optimization), then the generic query plan will be used.

Release note (sql change): The session setting `plan_cache_mode=auto`
can now be used to instruct the system to automatically determine
whether to use "custom" or "generic" query plans for the execution of a
prepared statement. Custom query plans are optimized on every execution,
while generic plans are optimized once and reused on future executions
as-is. Generic query plans are beneficial in cases where query
optimization contributes significant overhead to the total cost of
executing a query.
Prior to this commit, the `no-stable-folds` directive only worked with
the "norm", "exprnorm", and "expropt" directives. It now also works with
the "opt" directive. This is required for testing generic query plans
where plans must be fully optimized without folding stable expressions.

Release note: None
The `ConvertSelectWithPlaceholdersToJoin` rule has been renamed to
`GenerateParameterizedJoin` and modified to also operate on stable
expressions. Stable expressions cannot be folded in generic query plans
because their value can only be determined at execution time. By
transforming a Select with a stable filter expression into a Join with a
Values input, the optimizer can potentially plan a lookup join with
similar performance characteristics to a constrained scan that would be
planned if the stable expression could be folded.

Release note: None
The exploration rule `GenerateParameterizedJoin` is now disabled when
the `plan_cache_mode` session setting is set to `force_custom_plan`.
This prevents possible regressions in plans for this mode in the rare
case that a stable function is not folded during normalization (due to
an error encountered during folding). In most cases, the check for
placeholders and stable functions in `GenerateParameterizedJoin` is
sufficient for preventing the rule from firing when a generic query plan
is not being built—before optimizing a custom plan placeholders are
always replaced with constants and stable functions are usually folded
to constants.

Release note: None
@mgartner mgartner force-pushed the backport24.1-125708-125795-125808-125871-126528-126863-125814-127214-127897 branch from 03e095a to ab27ad8 Compare August 29, 2024 21:23
@DrewKimball DrewKimball requested a review from rytaft August 29, 2024 21:27
Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @mgartner, @michae2, and @rytaft)


pkg/sql/plan_opt.go line 575 at r9 (raw file):

Previously, mgartner (Marcus Gartner) wrote…

Yes, I think both are possible. I intended to handle these cases with this logic. Is there something wrong you see?

Well if we do have a stale fully-optimized plan, shouldn't we return nil and build a new one, rather than resorting to a generic memo?

@mgartner
Copy link
Collaborator Author

pkg/sql/plan_opt.go line 575 at r9 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Well if we do have a stale fully-optimized plan, shouldn't we return nil and build a new one, rather than resorting to a generic memo?

That should never be possible. If the query has no placeholders or stable functions, then we'll never generate a generic memo, so it'll always be nil.

I think a lot of this confusion will be cleared up as I refactor this logic such that all fully-optimized plans are "generic", and the other cached plan is "normalized" or a "base" memo that still needs to be optimized. The main reason I didn't structure things like that form the get-go is because it would have made these backports more risky.

@mgartner
Copy link
Collaborator Author

mgartner commented Sep 4, 2024

TFTRs!

@mgartner mgartner merged commit ea2b536 into cockroachdb:release-24.1 Sep 4, 2024
19 of 20 checks passed
@mgartner mgartner deleted the backport24.1-125708-125795-125808-125871-126528-126863-125814-127214-127897 branch September 4, 2024 20:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport Label PR's that are backports to older release branches
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants