opt: add exploration rule to eliminate Project inside GroupBy
This commit updates the exploration rule EliminateIndexJoinInsideGroupBy
and renames it to EliminateIndexJoinOrProjectInsideGroupBy. The rule now
removes either an IndexJoin or Project operator if it can be proven that
the removal does not affect the output of the parent grouping operator.

Removing a Project is needed when the partial index predicate constrains
some columns to be constant, so that those columns are provided as constant
projections. If the GroupBy does not actually need the projected columns,
however, the Project is unnecessary and prevents other rules, such as
SplitGroupByScanIntoUnionScans, from matching.

Informs #65473

Release note (performance improvement): Improved the efficiency of validation
for some partial unique indexes in REGIONAL BY ROW tables by improving the
query plan to use all streaming operations.
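
The two conditions under which the rule fires can be modeled in plain Go. This is only an illustrative sketch, not the optimizer's code: `colSet`, `newColSet`, `union`, `isSubset`, `orderingCanProject`, and `canEliminate` are hypothetical names, and an ordering is simplified to a list of column groups in which any column of a group can stand in for the others.

```go
package main

import "fmt"

// colSet is a hypothetical stand-in for the optimizer's column set type.
type colSet map[int]struct{}

// newColSet builds a colSet from the given column IDs.
func newColSet(cols ...int) colSet {
	s := colSet{}
	for _, c := range cols {
		s[c] = struct{}{}
	}
	return s
}

// union returns a new set holding every column in a or b.
func union(a, b colSet) colSet {
	out := colSet{}
	for c := range a {
		out[c] = struct{}{}
	}
	for c := range b {
		out[c] = struct{}{}
	}
	return out
}

// isSubset reports whether every column in a is also in b.
func isSubset(a, b colSet) bool {
	for c := range a {
		if _, ok := b[c]; !ok {
			return false
		}
	}
	return true
}

// orderingCanProject reports whether every ordering group contains at least
// one column from cols.
func orderingCanProject(groups []colSet, cols colSet) bool {
	for _, g := range groups {
		found := false
		for c := range g {
			if _, ok := cols[c]; ok {
				found = true
				break
			}
		}
		if !found {
			return false
		}
	}
	return true
}

// canEliminate models the rule's match conditions: the ordering must be
// expressible with the input's columns, and the grouping columns plus the
// columns referenced by the aggregates must all come from the input.
func canEliminate(groupingCols, aggOuterCols, inputCols colSet, ordering []colSet) bool {
	return orderingCanProject(ordering, inputCols) &&
		isSubset(union(groupingCols, aggOuterCols), inputCols)
}

func main() {
	inputCols := newColSet(1, 4)
	// Grouping on column 1 only: the operator above the scan adds nothing needed.
	fmt.Println(canEliminate(newColSet(1), newColSet(), inputCols, nil))
	// An aggregate reads column 2, which only the removed operator provides.
	fmt.Println(canEliminate(newColSet(1), newColSet(2), inputCols, nil))
}
```

In this model the first call reports that the operator can be removed and the second reports that it cannot, because column 2 is not produced by the input.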
rytaft committed Jul 6, 2021
1 parent 19fee46 commit 64bb06a
Showing 6 changed files with 420 additions and 256 deletions.
189 changes: 78 additions & 111 deletions pkg/sql/opt/memo/testdata/stats_quality/tpch/q20

Large diffs are not rendered by default.

13 changes: 13 additions & 0 deletions pkg/sql/opt/xform/groupby_funcs.go
@@ -266,3 +266,16 @@ func (c *CustomFuncs) GroupingColumns(private *memo.GroupingPrivate) opt.ColSet
 func (c *CustomFuncs) GroupingOrdering(private *memo.GroupingPrivate) props.OrderingChoice {
 	return private.Ordering
 }
+
+// MakeGroupingPrivate constructs a new GroupingPrivate using the given
+// grouping columns, OrderingChoice, NullsAreDistinct bool, and ErrorOnDup text.
+func (c *CustomFuncs) MakeGroupingPrivate(
+	groupingCols opt.ColSet, ordering props.OrderingChoice, nullsAreDistinct bool, errorText string,
+) *memo.GroupingPrivate {
+	return &memo.GroupingPrivate{
+		GroupingCols:     groupingCols,
+		Ordering:         ordering,
+		NullsAreDistinct: nullsAreDistinct,
+		ErrorOnDup:       errorText,
+	}
+}
39 changes: 26 additions & 13 deletions pkg/sql/opt/xform/rules/groupby.opt
@@ -212,17 +212,17 @@
=>
((OpName) (Select $unionScans $filters) $aggs $private)

-# EliminateIndexJoinInsideGroupBy removes an IndexJoin operator if it can be
-# proven that the removal does not affect the output of the parent grouping
-# operator. This is the case if:
+# EliminateIndexJoinOrProjectInsideGroupBy removes an IndexJoin or Project
+# operator if it can be proven that the removal does not affect the output of
+# the parent grouping operator. This is the case if:
 #
-# 1. Only columns from the index join's input are being used by the grouping
-# operator.
+# 1. Only columns from the index join/project's input are being used by the
+# grouping operator.
 #
 # 2. The OrderingChoice of the grouping operator can be expressed with only
-# columns from the index join's input. Or in other words, at least one column
-# in every ordering group is one of the output columns from the index join's
-# input.
+# columns from the index join/project's input. Or in other words, at least
+# one column in every ordering group is one of the output columns from the
+# index join/project's input.
#
# This rule is useful when using partial indexes. When generating partial index
# scans, expressions can be removed from filters because they exactly match
@@ -270,22 +270,35 @@
# └── scan t@secondary,partial
# └── columns: i:1 rowid:4!null
#
-[EliminateIndexJoinInsideGroupBy, Explore]
+# A Project is created in cases where the partial index predicate constrains
+# some columns to be constant, and therefore provides those columns as constant
+# projections instead of using an IndexJoin. The Project can be eliminated for
+# the same reasons as the IndexJoin.
+[EliminateIndexJoinOrProjectInsideGroupBy, Explore]
 (GroupBy | DistinctOn | EnsureUpsertDistinctOn
-    (IndexJoin $input:*)
+    (IndexJoin | Project $input:*)
     $aggs:*
     $private:* &
         (OrderingCanProjectCols
-            (GroupingOrdering $private)
+            $ordering:(GroupingOrdering $private)
             $inputCols:(OutputCols $input)
         ) &
         (ColsAreSubset
             (UnionCols
-                (GroupingColumns $private)
+                $groupingCols:(GroupingColumns $private)
                 (AggregationOuterCols $aggs)
             )
             $inputCols
         )
 )
 =>
-((OpName) $input $aggs $private)
+((OpName)
+    $input
+    $aggs
+    (MakeGroupingPrivate
+        $groupingCols
+        (PruneOrdering $ordering $inputCols)
+        (NullsAreDistinct $private)
+        (ErrorOnDup $private)
+    )
+)
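
The replacement side of the rule no longer reuses `$private` as-is; it rebuilds the private with an ordering pruned to `$inputCols`, presumably because the original ordering may mention columns that only the removed operator produced. The sketch below models that pruning step in plain Go under simplifying assumptions: `orderingGroup` and `pruneOrdering` are hypothetical names, each ordering position is just a list of interchangeable columns, and details of the real `OrderingChoice` (such as directions and optional columns) are ignored.

```go
package main

import "fmt"

// orderingGroup is a hypothetical model of one ordering position: a list of
// equivalent columns, any one of which can provide the ordering.
type orderingGroup []int

// pruneOrdering keeps, in each group, only the columns present in inputCols.
// It returns false if some group would become empty, meaning the ordering
// cannot be expressed with inputCols alone.
func pruneOrdering(ordering []orderingGroup, inputCols map[int]struct{}) ([]orderingGroup, bool) {
	pruned := make([]orderingGroup, 0, len(ordering))
	for _, group := range ordering {
		var kept orderingGroup
		for _, col := range group {
			if _, ok := inputCols[col]; ok {
				kept = append(kept, col)
			}
		}
		if len(kept) == 0 {
			return nil, false
		}
		pruned = append(pruned, kept)
	}
	return pruned, true
}

func main() {
	// Columns 1 and 4 come from the input; column 2 exists only above it.
	inputCols := map[int]struct{}{1: {}, 4: {}}
	ordering := []orderingGroup{{1, 2}, {4}}

	pruned, ok := pruneOrdering(ordering, inputCols)
	fmt.Println(pruned, ok) // prints "[[1] [4]] true"
}
```

In the rule itself the emptiness check is not part of pruning: the `OrderingCanProjectCols` match condition already guarantees that every group retains at least one input column before `PruneOrdering` is applied.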
95 changes: 43 additions & 52 deletions pkg/sql/opt/xform/testdata/external/tpch
@@ -2186,59 +2186,50 @@ sort
│ │ ├── grouping columns: ps_suppkey:17!null
│ │ ├── immutable
│ │ ├── key: (17)
│ │ └── project
│ │ ├── columns: ps_partkey:16!null ps_suppkey:17!null
│ │ └── inner-join (lookup part)
│ │ ├── columns: ps_partkey:16!null ps_suppkey:17!null ps_availqty:18!null p_partkey:23!null p_name:24!null sum:52!null
│ │ ├── key columns: [16] = [23]
│ │ ├── lookup columns are key
│ │ ├── immutable
│ │ ├── key: (16,17)
│ │ └── project
│ │ ├── columns: ps_partkey:16!null ps_suppkey:17!null p_partkey:23!null
│ │ ├── immutable
│ │ ├── key: (17,23)
│ │ ├── fd: (16)==(23), (23)==(16)
│ │ └── inner-join (lookup part)
│ │ ├── columns: ps_partkey:16!null ps_suppkey:17!null ps_availqty:18!null p_partkey:23!null p_name:24!null sum:52!null
│ │ ├── key columns: [16] = [23]
│ │ ├── lookup columns are key
│ │ ├── immutable
│ │ ├── key: (17,23)
│ │ ├── fd: (16,17)-->(18,52), (23)-->(24), (16)==(23), (23)==(16)
│ │ ├── select
│ │ │ ├── columns: ps_partkey:16!null ps_suppkey:17!null ps_availqty:18!null sum:52!null
│ │ │ ├── immutable
│ │ │ ├── key: (16,17)
│ │ │ ├── fd: (16,17)-->(18,52)
│ │ │ ├── group-by
│ │ │ │ ├── columns: ps_partkey:16!null ps_suppkey:17!null ps_availqty:18!null sum:52!null
│ │ │ │ ├── grouping columns: ps_partkey:16!null ps_suppkey:17!null
│ │ │ │ ├── key: (16,17)
│ │ │ │ ├── fd: (16,17)-->(18,52)
│ │ │ │ ├── inner-join (hash)
│ │ │ │ │ ├── columns: ps_partkey:16!null ps_suppkey:17!null ps_availqty:18!null l_partkey:35!null l_suppkey:36!null l_quantity:38!null l_shipdate:44!null
│ │ │ │ │ ├── multiplicity: left-rows(exactly-one), right-rows(zero-or-more)
│ │ │ │ │ ├── fd: (16,17)-->(18), (16)==(35), (35)==(16), (17)==(36), (36)==(17)
│ │ │ │ │ ├── index-join lineitem
│ │ │ │ │ │ ├── columns: l_partkey:35!null l_suppkey:36!null l_quantity:38!null l_shipdate:44!null
│ │ │ │ │ │ └── scan lineitem@l_sd
│ │ │ │ │ │ ├── columns: l_orderkey:34!null l_linenumber:37!null l_shipdate:44!null
│ │ │ │ │ │ ├── constraint: /44/34/37: [/'1994-01-01' - /'1994-12-31']
│ │ │ │ │ │ ├── key: (34,37)
│ │ │ │ │ │ └── fd: (34,37)-->(44)
│ │ │ │ │ ├── scan partsupp
│ │ │ │ │ │ ├── columns: ps_partkey:16!null ps_suppkey:17!null ps_availqty:18!null
│ │ │ │ │ │ ├── key: (16,17)
│ │ │ │ │ │ └── fd: (16,17)-->(18)
│ │ │ │ │ └── filters
│ │ │ │ │ ├── l_partkey:35 = ps_partkey:16 [outer=(16,35), constraints=(/16: (/NULL - ]; /35: (/NULL - ]), fd=(16)==(35), (35)==(16)]
│ │ │ │ │ └── l_suppkey:36 = ps_suppkey:17 [outer=(17,36), constraints=(/17: (/NULL - ]; /36: (/NULL - ]), fd=(17)==(36), (36)==(17)]
│ │ │ │ └── aggregations
│ │ │ │ ├── sum [as=sum:52, outer=(38)]
│ │ │ │ │ └── l_quantity:38
│ │ │ │ └── const-agg [as=ps_availqty:18, outer=(18)]
│ │ │ │ └── ps_availqty:18
│ │ │ └── filters
│ │ │ └── ps_availqty:18 > (sum:52 * 0.5) [outer=(18,52), immutable, constraints=(/18: (/NULL - ])]
│ │ └── filters
│ │ └── p_name:24 LIKE 'forest%' [outer=(24), constraints=(/24: [/'forest' - /'foresu'); tight)]
│ │ ├── key: (17,23)
│ │ ├── fd: (16,17)-->(18,52), (23)-->(24), (16)==(23), (23)==(16)
│ │ ├── select
│ │ │ ├── columns: ps_partkey:16!null ps_suppkey:17!null ps_availqty:18!null sum:52!null
│ │ │ ├── immutable
│ │ │ ├── key: (16,17)
│ │ │ ├── fd: (16,17)-->(18,52)
│ │ │ ├── group-by
│ │ │ │ ├── columns: ps_partkey:16!null ps_suppkey:17!null ps_availqty:18!null sum:52!null
│ │ │ │ ├── grouping columns: ps_partkey:16!null ps_suppkey:17!null
│ │ │ │ ├── key: (16,17)
│ │ │ │ ├── fd: (16,17)-->(18,52)
│ │ │ │ ├── inner-join (hash)
│ │ │ │ │ ├── columns: ps_partkey:16!null ps_suppkey:17!null ps_availqty:18!null l_partkey:35!null l_suppkey:36!null l_quantity:38!null l_shipdate:44!null
│ │ │ │ │ ├── multiplicity: left-rows(exactly-one), right-rows(zero-or-more)
│ │ │ │ │ ├── fd: (16,17)-->(18), (16)==(35), (35)==(16), (17)==(36), (36)==(17)
│ │ │ │ │ ├── index-join lineitem
│ │ │ │ │ │ ├── columns: l_partkey:35!null l_suppkey:36!null l_quantity:38!null l_shipdate:44!null
│ │ │ │ │ │ └── scan lineitem@l_sd
│ │ │ │ │ │ ├── columns: l_orderkey:34!null l_linenumber:37!null l_shipdate:44!null
│ │ │ │ │ │ ├── constraint: /44/34/37: [/'1994-01-01' - /'1994-12-31']
│ │ │ │ │ │ ├── key: (34,37)
│ │ │ │ │ │ └── fd: (34,37)-->(44)
│ │ │ │ │ ├── scan partsupp
│ │ │ │ │ │ ├── columns: ps_partkey:16!null ps_suppkey:17!null ps_availqty:18!null
│ │ │ │ │ │ ├── key: (16,17)
│ │ │ │ │ │ └── fd: (16,17)-->(18)
│ │ │ │ │ └── filters
│ │ │ │ │ ├── l_partkey:35 = ps_partkey:16 [outer=(16,35), constraints=(/16: (/NULL - ]; /35: (/NULL - ]), fd=(16)==(35), (35)==(16)]
│ │ │ │ │ └── l_suppkey:36 = ps_suppkey:17 [outer=(17,36), constraints=(/17: (/NULL - ]; /36: (/NULL - ]), fd=(17)==(36), (36)==(17)]
│ │ │ │ └── aggregations
│ │ │ │ ├── sum [as=sum:52, outer=(38)]
│ │ │ │ │ └── l_quantity:38
│ │ │ │ └── const-agg [as=ps_availqty:18, outer=(18)]
│ │ │ │ └── ps_availqty:18
│ │ │ └── filters
│ │ │ └── ps_availqty:18 > (sum:52 * 0.5) [outer=(18,52), immutable, constraints=(/18: (/NULL - ])]
│ │ └── filters
│ │ └── p_name:24 LIKE 'forest%' [outer=(24), constraints=(/24: [/'forest' - /'foresu'); tight)]
│ └── filters (true)
└── filters
└── n_name:11 = 'CANADA' [outer=(11), constraints=(/11: [/'CANADA' - /'CANADA']; tight), fd=()-->(11)]