Merge #125814
125814: opt: optimize generic query plans with stable expressions r=mgartner a=mgartner

#### opttester: allow no-stable-folds for opt directive

Prior to this commit, the `no-stable-folds` directive only worked with
the "norm", "exprnorm", and "expropt" directives. It now also works with
the "opt" directive. This is required for testing generic query plans
where plans must be fully optimized without folding stable expressions.

Release note: None
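
The guard this commit adds to `OptimizeWithTables` can be pictured with a minimal, self-contained sketch. The `foldingControl` and `testFlags` types below are hypothetical stand-ins written for illustration, not the real optimizer or opttester types; only the shape of the condition is taken from the diff.

```go
package main

import "fmt"

// foldingControl is a hypothetical stand-in for the optimizer's folding
// control. It only tracks whether stable operators may be folded.
type foldingControl struct {
	allowStable bool
}

// AllowStableFolds permits constant folding of stable operators.
func (fc *foldingControl) AllowStableFolds() { fc.allowStable = true }

// testFlags is a hypothetical type that carries only the flag relevant here.
type testFlags struct {
	NoStableFolds bool
}

// configureFolding enables stable folding unless the test requested
// no-stable-folds, which is the shape of the guard added for "opt".
func configureFolding(flags testFlags) *foldingControl {
	fc := &foldingControl{}
	if !flags.NoStableFolds {
		fc.AllowStableFolds()
	}
	return fc
}

func main() {
	fmt.Println(configureFolding(testFlags{}).allowStable)
	fmt.Println(configureFolding(testFlags{NoStableFolds: true}).allowStable)
}
```

With the flag unset, behavior matches the previous unconditional call, so existing "opt" tests should be unaffected.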

#### opt: optimize generic query plans with stable expressions

The `ConvertSelectWithPlaceholdersToJoin` rule has been renamed to
`GenerateParameterizedJoin` and modified to also operate on stable
expressions. Stable expressions cannot be folded in generic query plans
because their value can only be determined at execution time. By
transforming a Select with a stable filter expression into a Join with a
Values input, the optimizer can potentially plan a lookup join with
similar performance characteristics to a constrained scan that would be
planned if the stable expression could be folded.

Epic: CRDB-37712

Release note: None
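
As a rough illustration of the rewrite described above, the sketch below walks a toy filter expression, replaces each placeholder and zero-argument stable function with a variable, and collects the replaced expressions into what would become the single row of the Values input. Every type and the `parameterize` function are assumptions made up for this example; the real rule operates on the optimizer's memo expressions and constructs columns through the factory.

```go
package main

import "fmt"

// expr is a toy scalar expression. All types below are hypothetical
// stand-ins for the optimizer's memo expressions.
type expr interface{}

type (
	placeholder struct{ idx int }          // $1 has idx 0
	stableFunc  struct{ name string }      // zero-argument stable function
	column      struct{ name string }      // column of the scanned table
	variable    struct{ col int }          // column produced by the Values input
	eq          struct{ left, right expr } // left = right
	and         struct{ left, right expr } // left AND right
)

// parameterize replaces placeholders and stable functions in e with
// variables. It returns the rewritten expression and the replaced
// expressions, in the order their Values columns were created.
func parameterize(e expr) (expr, []expr) {
	var row []expr
	placeholderCols := map[int]int{}
	// newCol records v as a Values column and returns a reference to it.
	newCol := func(v expr) variable {
		row = append(row, v)
		return variable{col: len(row)}
	}
	var replace func(e expr) expr
	replace = func(e expr) expr {
		switch t := e.(type) {
		case placeholder:
			// Reuse the same column for duplicate placeholder references.
			if col, ok := placeholderCols[t.idx]; ok {
				return variable{col: col}
			}
			v := newCol(t)
			placeholderCols[t.idx] = v.col
			return v
		case stableFunc:
			return newCol(t)
		case eq:
			return eq{left: replace(t.left), right: replace(t.right)}
		case and:
			return and{left: replace(t.left), right: replace(t.right)}
		}
		return e
	}
	return replace(e), row
}

func main() {
	// k = $1 AND (t = now() AND j = $1)
	filter := and{
		eq{column{"k"}, placeholder{0}},
		and{eq{column{"t"}, stableFunc{"now"}}, eq{column{"j"}, placeholder{0}}},
	}
	newFilter, row := parameterize(filter)
	fmt.Printf("filter: %+v\n", newFilter)
	fmt.Printf("values row: %+v\n", row)
}
```

Because the rewritten filter compares table columns against Values columns, it has the shape of a join condition, which is what lets a later rule consider a lookup join.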


Co-authored-by: Marcus Gartner <[email protected]>
craig[bot] and mgartner committed Jul 13, 2024
2 parents dc3ebba + 1debb13 commit a1204d2
Showing 6 changed files with 453 additions and 142 deletions.
90 changes: 58 additions & 32 deletions pkg/sql/opt/exec/execbuilder/testdata/generic
@@ -517,26 +517,39 @@ isolation level: serializable
priority: normal
quality of service: regular
·
filter
lookup join
│ sql nodes: <hidden>
│ regions: <hidden>
│ actual row count: 0
│ filter: t = now()
│ KV time: 0µs
│ KV contention time: 0µs
│ KV rows decoded: 0
│ KV bytes read: 0 B
│ KV gRPC calls: 0
│ estimated max memory allocated: 0 B
│ table: t@t_pkey
│ equality: (k) = (k)
│ equality cols are key
└── • scan
sql nodes: <hidden>
kv nodes: <hidden>
regions: <hidden>
actual row count: 0
KV time: 0µs
KV contention time: 0µs
KV rows decoded: 0
KV bytes read: 0 B
KV gRPC calls: 0
estimated max memory allocated: 0 B
missing stats
table: t@t_pkey
spans: FULL SCAN
└── • lookup join
│ sql nodes: <hidden>
│ kv nodes: <hidden>
│ regions: <hidden>
│ actual row count: 0
│ KV time: 0µs
│ KV contention time: 0µs
│ KV rows decoded: 0
│ KV bytes read: 0 B
│ KV gRPC calls: 0
│ estimated max memory allocated: 0 B
│ table: t@t_t_idx
│ equality: (column9) = (t)
└── • values
sql nodes: <hidden>
regions: <hidden>
actual row count: 1
size: 1 column, 1 row

# The generic plan can be reused.
query T
@@ -554,26 +567,39 @@ isolation level: serializable
priority: normal
quality of service: regular
·
filter
lookup join
│ sql nodes: <hidden>
│ regions: <hidden>
│ actual row count: 0
│ filter: t = now()
│ KV time: 0µs
│ KV contention time: 0µs
│ KV rows decoded: 0
│ KV bytes read: 0 B
│ KV gRPC calls: 0
│ estimated max memory allocated: 0 B
│ table: t@t_pkey
│ equality: (k) = (k)
│ equality cols are key
└── • scan
sql nodes: <hidden>
kv nodes: <hidden>
regions: <hidden>
actual row count: 0
KV time: 0µs
KV contention time: 0µs
KV rows decoded: 0
KV bytes read: 0 B
KV gRPC calls: 0
estimated max memory allocated: 0 B
missing stats
table: t@t_pkey
spans: FULL SCAN
└── • lookup join
│ sql nodes: <hidden>
│ kv nodes: <hidden>
│ regions: <hidden>
│ actual row count: 0
│ KV time: 0µs
│ KV contention time: 0µs
│ KV rows decoded: 0
│ KV bytes read: 0 B
│ KV gRPC calls: 0
│ estimated max memory allocated: 0 B
│ table: t@t_t_idx
│ equality: (column9) = (t)
└── • values
sql nodes: <hidden>
regions: <hidden>
actual row count: 1
size: 1 column, 1 row

statement ok
DEALLOCATE p
6 changes: 4 additions & 2 deletions pkg/sql/opt/testutils/opttester/opt_tester.go
@@ -484,7 +484,7 @@ func New(catalog cat.Catalog, sql string) *OptTester {
// modifies the existing set of the flags.
//
// - no-stable-folds: disallows constant folding for stable operators; only
// used with "norm".
// used with "norm", "opt", "exprnorm", and "expropt".
//
// - fully-qualify-names: fully qualify all column names in the test output.
//
@@ -1249,7 +1249,9 @@ func (ot *OptTester) OptimizeWithTables(tables map[cat.StableID]cat.Table) (opt.
o.NotifyOnMatchedRule(func(ruleName opt.RuleName) bool {
return !ot.Flags.DisableRules.Contains(int(ruleName))
})
o.Factory().FoldingControl().AllowStableFolds()
if !ot.Flags.NoStableFolds {
o.Factory().FoldingControl().AllowStableFolds()
}
return ot.optimizeExpr(o, tables)
}

1 change: 1 addition & 0 deletions pkg/sql/opt/xform/BUILD.bazel
@@ -49,6 +49,7 @@ go_library(
"//pkg/sql/rowinfra",
"//pkg/sql/sem/eval",
"//pkg/sql/sem/tree",
"//pkg/sql/sem/volatility",
"//pkg/sql/types",
"//pkg/util/buildutil",
"//pkg/util/cancelchecker",
145 changes: 69 additions & 76 deletions pkg/sql/opt/xform/generic_funcs.go
@@ -16,114 +16,107 @@ import (
"github.com/cockroachdb/cockroach/pkg/sql/opt"
"github.com/cockroachdb/cockroach/pkg/sql/opt/memo"
"github.com/cockroachdb/cockroach/pkg/sql/sem/tree"
"github.com/cockroachdb/cockroach/pkg/sql/sem/volatility"
"github.com/cockroachdb/cockroach/pkg/sql/types"
"github.com/cockroachdb/cockroach/pkg/util/intsets"
"github.com/cockroachdb/errors"
)

// HasPlaceholders returns true if the given relational expression's subtree has
// HasPlaceholdersOrStableExprs returns true if the given relational expression's subtree has
// at least one placeholder or stable expression.
func (c *CustomFuncs) HasPlaceholders(e memo.RelExpr) bool {
return e.Relational().HasPlaceholder
func (c *CustomFuncs) HasPlaceholdersOrStableExprs(e memo.RelExpr) bool {
return e.Relational().HasPlaceholder || e.Relational().VolatilitySet.HasStable()
}

// GeneratePlaceholderValuesAndJoinFilters returns a single-row Values
// expression containing placeholders in the given filters. It also returns a
// new set of filters where the placeholders have been replaced with variables
// referencing the columns produced by the returned Values expression. If the
// given filters have no placeholders, ok=false is returned.
func (c *CustomFuncs) GeneratePlaceholderValuesAndJoinFilters(
// GenerateParameterizedJoinValuesAndFilters returns a single-row Values
// expression containing placeholders and stable expressions in the given
// filters. It also returns a new set of filters where the placeholders and
// stable expressions have been replaced with variables referencing the columns
// produced by the returned Values expression. If the given filters have no
// placeholders or stable expressions, ok=false is returned.
func (c *CustomFuncs) GenerateParameterizedJoinValuesAndFilters(
filters memo.FiltersExpr,
) (values memo.RelExpr, newFilters memo.FiltersExpr, ok bool) {
// Collect all the placeholders in the filters.
//
// collectPlaceholders recursively walks the scalar expression and collects
// placeholder expressions into the placeholders slice.
var placeholders []*memo.PlaceholderExpr
var seenIndexes intsets.Fast
var collectPlaceholders func(e opt.Expr)
collectPlaceholders = func(e opt.Expr) {
if p, ok := e.(*memo.PlaceholderExpr); ok {
idx := int(p.Value.(*tree.Placeholder).Idx)
// Don't include the same placeholder multiple times.
if !seenIndexes.Contains(idx) {
seenIndexes.Add(idx)
placeholders = append(placeholders, p)
var exprs memo.ScalarListExpr
var cols opt.ColList
placeholderCols := make(map[tree.PlaceholderIdx]opt.ColumnID)

// replace recursively walks the expression tree and replaces placeholders
// and stable expressions. It collects the replaced expressions and creates
// columns representing those expressions. Those expressions and columns
// will be used in the Values expression created below.
var replace func(e opt.Expr) opt.Expr
replace = func(e opt.Expr) opt.Expr {
switch t := e.(type) {
case *memo.PlaceholderExpr:
idx := t.Value.(*tree.Placeholder).Idx
// Reuse the same column for duplicate placeholder references.
if col, ok := placeholderCols[idx]; ok {
return c.e.f.ConstructVariable(col)
}
col := c.e.f.Metadata().AddColumn(fmt.Sprintf("$%d", idx+1), t.DataType())
placeholderCols[idx] = col
exprs = append(exprs, t)
cols = append(cols, col)
return c.e.f.ConstructVariable(col)

case *memo.FunctionExpr:
// TODO(mgartner): Consider including other expressions that could
// be stable: casts, assignment casts, UDFCallExprs, unary ops,
// comparisons, binary ops.
// TODO(mgartner): Include functions with arguments if they are all
// constants or placeholders.
if t.Overload.Volatility == volatility.Stable && len(t.Args) == 0 {
col := c.e.f.Metadata().AddColumn("", t.DataType())
exprs = append(exprs, t)
cols = append(cols, col)
return c.e.f.ConstructVariable(col)
}
return
}
for i, n := 0, e.ChildCount(); i < n; i++ {
collectPlaceholders(e.Child(i))
}

return c.e.f.Replace(e, replace)
}

// Replace placeholders and stable expressions in each filter.
for i := range filters {
// Only traverse the scalar expression if it contains a placeholder.
if filters[i].ScalarProps().HasPlaceholder {
collectPlaceholders(filters[i].Condition)
cond := filters[i].Condition
if newCond := replace(cond).(opt.ScalarExpr); newCond != cond {
if newFilters == nil {
// Lazily allocate newFilters.
newFilters = make(memo.FiltersExpr, len(filters))
copy(newFilters, filters[:i])
}
// Construct a new filter if placeholders were replaced.
newFilters[i] = c.e.f.ConstructFiltersItem(newCond)
} else if newFilters != nil {
// Otherwise copy the filter if newFilters has been allocated.
newFilters[i] = filters[i]
}
}

// If there are no placeholders in the filters, there is nothing to do.
if len(placeholders) == 0 {
// If no placeholders or stable expressions were replaced, there is nothing
// to do.
if len(exprs) == 0 {
return nil, nil, false
}

// Create the Values expression with one row and one column for each
// placeholder.
cols := make(opt.ColList, len(placeholders))
colIDs := make(map[tree.PlaceholderIdx]opt.ColumnID, len(placeholders))
typs := make([]*types.T, len(placeholders))
exprs := make(memo.ScalarListExpr, len(placeholders))
for i, p := range placeholders {
idx := p.Value.(*tree.Placeholder).Idx
col := c.e.f.Metadata().AddColumn(fmt.Sprintf("$%d", idx+1), p.DataType())
cols[i] = col
colIDs[idx] = col
exprs[i] = p
typs[i] = p.DataType()
// replaced expression.
typs := make([]*types.T, len(exprs))
for i, e := range exprs {
typs[i] = e.DataType()
}

tupleTyp := types.MakeTuple(typs)
rows := memo.ScalarListExpr{c.e.f.ConstructTuple(exprs, tupleTyp)}
values = c.e.f.ConstructValues(rows, &memo.ValuesPrivate{
Cols: cols,
ID: c.e.f.Metadata().NextUniqueID(),
})

// Create new filters by replacing the placeholders in the filters with
// variables.
var replace func(e opt.Expr) opt.Expr
replace = func(e opt.Expr) opt.Expr {
if p, ok := e.(*memo.PlaceholderExpr); ok {
idx := p.Value.(*tree.Placeholder).Idx
col, ok := colIDs[idx]
if !ok {
panic(errors.AssertionFailedf("unknown placeholder %d", idx))
}
return c.e.f.ConstructVariable(col)
}
return c.e.f.Replace(e, replace)
}

newFilters = make(memo.FiltersExpr, len(filters))
for i := range newFilters {
cond := filters[i].Condition
if newCond := replace(cond).(opt.ScalarExpr); newCond != cond {
// Construct a new filter if placeholders were replaced.
newFilters[i] = c.e.f.ConstructFiltersItem(newCond)
} else {
// Otherwise copy the filter.
newFilters[i] = filters[i]
}
}

return values, newFilters, true
}

// GenericJoinPrivate returns JoinPrivate that disabled join reordering and
// ParameterizedJoinPrivate returns JoinPrivate that disables join reordering and
// merge join exploration.
func (c *CustomFuncs) GenericJoinPrivate() *memo.JoinPrivate {
func (c *CustomFuncs) ParameterizedJoinPrivate() *memo.JoinPrivate {
return &memo.JoinPrivate{
Flags: memo.DisallowMergeJoin,
SkipReorderJoins: true,
40 changes: 24 additions & 16 deletions pkg/sql/opt/xform/rules/generic.opt
@@ -2,47 +2,55 @@
# generic.opt contains exploration rules for optimizing generic query plans.
# =============================================================================

# ConvertSelectWithPlaceholdersToJoin is an exploration rule that converts a
# Select expression with placeholders in the filters into an InnerJoin that
# joins the Select's input with a Values expression that produces the
# placeholder values.
# GenerateParameterizedJoin is an exploration rule that converts a Select
# expression with placeholders and stable expressions in the filters into an
# InnerJoin that joins the Select's input with a Values expression that produces
# the placeholder values and stable expressions.
#
# This rule allows generic query plans, in which placeholder values are not
# known, to be optimized. By converting the Select into an InnerJoin, the
# optimizer can plan a lookup join, in many cases, which has similar performance
# characteristics to the constrained Scan that would be planned if the
# placeholder values were known. For example, consider a schema and query like:
# known and stable expressions are not folded, to be optimized. By converting
# the Select into an InnerJoin, the optimizer can, in many cases, plan a lookup
# join which has similar performance characteristics to the constrained Scan
# that would be planned if the placeholder values were known.
#
# For example, consider a schema and query like:
#
# CREATE TABLE t (i INT PRIMARY KEY)
# SELECT * FROM t WHERE i = $1
#
# ConvertSelectWithPlaceholdersToJoin will perform the first conversion below,
# from a Select into a Join. GenerateLookupJoins will perform the second
# conversion from a (hash) Join into a LookupJoin.
#
# GenerateParameterizedJoin will perform the first transformation below, from a
# Select into a Join. GenerateLookupJoins will perform the second transformation
# from a (hash) Join into a LookupJoin.
#
# Select (i=$1) Join (i=col_$1) LookupJoin (t@t_pkey)
# | -> / \ -> |
# | / \ |
# Scan t Values ($1) Scan t Values ($1)
#
[ConvertSelectWithPlaceholdersToJoin, Explore]
[GenerateParameterizedJoin, Explore]
(Select
$scan:(Scan $scanPrivate:*) & (IsCanonicalScan $scanPrivate)
$filters:* &
(HasPlaceholders (Root)) &
(HasPlaceholdersOrStableExprs (Root)) &
(Let
(
$values
$newFilters
$ok
):(GeneratePlaceholderValuesAndJoinFilters $filters)
):(GenerateParameterizedJoinValuesAndFilters
$filters
)
$ok
)
)
=>
(Project
(InnerJoin $values $scan $newFilters (GenericJoinPrivate))
(InnerJoin
$values
$scan
$newFilters
(ParameterizedJoinPrivate)
)
[]
(OutputCols (Root))
)
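
The filter loop in `GenerateParameterizedJoinValuesAndFilters` allocates `newFilters` lazily, only once some filter actually changes. A minimal sketch of that copy-on-write pattern, using plain strings in place of filter expressions, might look like the following. The `rewriteAll` helper is hypothetical and exists only for this illustration; unlike the real function, it signals "nothing changed" by returning nil rather than through a separate `ok` result.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteAll applies rewrite to each item. It returns nil when no item
// changed; otherwise it returns a new slice holding the rewritten items
// alongside copies of the unchanged ones.
func rewriteAll(items []string, rewrite func(string) string) []string {
	var out []string
	for i, item := range items {
		if newItem := rewrite(item); newItem != item {
			if out == nil {
				// Lazily allocate out and back-fill the unchanged prefix.
				out = make([]string, len(items))
				copy(out, items[:i])
			}
			out[i] = newItem
		} else if out != nil {
			// Copy unchanged items only once out has been allocated.
			out[i] = item
		}
	}
	return out
}

func main() {
	replaceNow := func(s string) string {
		return strings.ReplaceAll(s, "now()", "column9")
	}
	fmt.Println(rewriteAll([]string{"k > 0", "t = now()"}, replaceNow))
	fmt.Println(rewriteAll([]string{"k > 0"}, replaceNow) == nil)
}
```

The benefit is that an exploration rule that does not fire avoids allocating a filter slice it would immediately discard.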