[Refactor](Nereids) Refactor aggregate function/plan/rules and support related CBO rules #14827
fe/fe-core/src/main/java/org/apache/doris/analysis/AggregateInfo.java
@@ -314,6 +320,11 @@ public CascadesContext getCascadesContext() {
        return cascadesContext;
    }

    public static PhysicalProperties buildInitRequireProperties(Plan initPlan) {
        boolean isQuery = !(initPlan instanceof Command) || (initPlan instanceof ExplainCommand);

Why add (initPlan instanceof ExplainCommand)?

Currently, the explain command is only used to explain queries, so we should require GATHER for it. When the explain command supports other SQL statements, like `insert into`, we should change this.
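
The check discussed above can be sketched in isolation. This is a minimal illustration only: the nested types are hypothetical stand-ins for the planner classes named in the diff, `InsertCommand` and `LogicalQueryPlan` are invented for the example, and returning `ANY` for non-query commands is an assumption, since the diff only shows the `isQuery` line.

```java
final class InitRequireSketch {
    // Hypothetical stand-ins for the planner types named in the diff; not the real Doris classes.
    interface Plan {}
    interface Command extends Plan {}
    static final class ExplainCommand implements Command {}
    static final class InsertCommand implements Command {}   // hypothetical non-query command
    static final class LogicalQueryPlan implements Plan {}   // hypothetical plain query plan

    enum PhysicalProperties { ANY, GATHER }

    static PhysicalProperties buildInitRequireProperties(Plan initPlan) {
        // Same condition as in the diff: anything that is not a command is a query,
        // and an explain command is treated as a query because it only explains queries today.
        boolean isQuery = !(initPlan instanceof Command) || (initPlan instanceof ExplainCommand);
        // Assumption: queries require GATHER, everything else places no requirement.
        return isQuery ? PhysicalProperties.GATHER : PhysicalProperties.ANY;
    }

    public static void main(String[] args) {
        System.out.println(buildInitRequireProperties(new LogicalQueryPlan()));
        System.out.println(buildInitRequireProperties(new ExplainCommand()));
        System.out.println(buildInitRequireProperties(new InsertCommand()));
    }
}
```

If the explain command later wraps statements other than queries, the second half of the condition would need to inspect the wrapped plan instead of the command type.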
fe/fe-core/src/main/java/org/apache/doris/nereids/jobs/cascades/DeriveStatsJob.java
@@ -667,7 +668,8 @@ public String toString() {
             builder.append(group).append("\n");
             builder.append(" stats=").append(group.getStatistics()).append("\n");
             StatsDeriveResult stats = group.getStatistics();
-            if (stats != null && group.getLogicalExpressions().get(0).getPlan() instanceof LogicalOlapScan) {
+            if (stats != null && !group.getLogicalExpressions().isEmpty()

Is this because some groups have no logical expression?

Bingo.
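
A small sketch of why the emptiness check matters: calling `get(0)` on an empty list throws `IndexOutOfBoundsException`, so the guard has to come first. The `Plan` types below are hypothetical placeholders, not the real classes, and the method name is invented for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class GroupToStringSketch {
    // Hypothetical minimal plan types for illustration.
    interface Plan {}
    static final class LogicalOlapScan implements Plan {}
    static final class LogicalFilter implements Plan {}

    // True only when the group has a first logical plan and that plan is an OLAP scan.
    // The isEmpty() check short-circuits, so get(0) is never reached on an empty list.
    static boolean firstPlanIsOlapScan(List<Plan> logicalPlans) {
        return !logicalPlans.isEmpty() && logicalPlans.get(0) instanceof LogicalOlapScan;
    }

    public static void main(String[] args) {
        System.out.println(firstPlanIsOlapScan(Collections.<Plan>emptyList()));
        System.out.println(firstPlanIsOlapScan(Arrays.<Plan>asList(new LogicalOlapScan())));
        System.out.println(firstPlanIsOlapScan(Arrays.<Plan>asList(new LogicalFilter())));
    }
}
```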
if (!olapScan.getTable().isColocateTable() && olapScan.getScanTabletNum() == 1) {
    return PhysicalProperties.GATHER;

Add a TODO: let's find a better way to handle both tablet num == 1 and colocate tables together in the future.
fe/fe-core/src/main/java/org/apache/doris/nereids/rules/rewrite/logical/DistinctToGroupBy.java
    return expr;
} else if (expr instanceof AggregateExpression && ((AggregateExpression) expr).getFunction().isDistinct()) {
    return expr;

Why can't their children be folded?

Folding might erase some distinct columns, which would introduce bugs.
public final Expression originExpr;
public final Slot remainExpr;
public final NamedExpression pushedExpr;

nit: add an example to explain them

return root;
// push expression to bottom project
Set<Alias> existsAliases = ExpressionUtils.collect(

Does it solve the problem of data explosion? SQL like:
SELECT a + 1, a + 2, a + 3, SUM(b) FROM t GROUP BY a + 1, a + 2, a + 3;
SELECT a, SUM(b + 1), SUM(b + 2), SUM(b + 3) FROM t GROUP BY a;

The results are
LogicalAggregate(groupBy=(slot#1, slot#2, slot#3), output=[slot#1, slot#2, slot#3, sum(slot#4)])
|
LogicalProject(projects=[(a + 1)#1, (a + 2)#2, (a + 3)#3, b#4])
and
LogicalAggregate(groupBy=(slot#0), output=[slot#0, sum(slot#1), sum(slot#2), sum(slot#3)])
|
LogicalProject(projects=[a#0, (b + 1)#1, (b + 2)#2, (b + 3)#3])
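
To make the rewrite above concrete, here is a rough sketch of pushing expressions into a bottom project so that each distinct expression is computed once and the aggregate only refers to slots. It works on plain strings rather than real expression trees. The `Triplet` class is a hypothetical mirror of the `originExpr` / `remainExpr` / `pushedExpr` fields, and the meaning given to each field in the comments is my reading of the plans shown in this thread, not a confirmed description of the implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class PushToProjectSketch {
    // Hypothetical string-based mirror of the three fields under review.
    static final class Triplet {
        final String originExpr;  // expression as written in the aggregate, e.g. "a + 1"
        final String remainExpr;  // slot that stays in the aggregate, e.g. "slot#1"
        final String pushedExpr;  // named expression placed in the bottom project, e.g. "(a + 1)#1"

        Triplet(String originExpr, String remainExpr, String pushedExpr) {
            this.originExpr = originExpr;
            this.remainExpr = remainExpr;
            this.pushedExpr = pushedExpr;
        }
    }

    // Assigns one slot per distinct expression; a repeated expression reuses the existing entry,
    // so the bottom project never computes the same expression twice.
    static List<Triplet> normalize(List<String> exprs, int firstSlotId) {
        Map<String, Triplet> seen = new LinkedHashMap<>();
        int nextId = firstSlotId;
        for (String expr : exprs) {
            if (!seen.containsKey(expr)) {
                int id = nextId++;
                seen.put(expr, new Triplet(expr, "slot#" + id, "(" + expr + ")#" + id));
            }
        }
        return new ArrayList<>(seen.values());
    }

    public static void main(String[] args) {
        // "a + 1" appears twice but yields a single pushed expression.
        for (Triplet t : normalize(Arrays.asList("a + 1", "a + 2", "a + 3", "a + 1"), 1)) {
            System.out.println(t.originExpr + " -> " + t.remainExpr + " / " + t.pushedExpr);
        }
    }
}
```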
 */
private List<PhysicalHashAggregate<Plan>> onePhaseAggregateWithoutDistinct(
        LogicalAggregate<? extends Plan> logicalAgg, ConnectContext connectContext) {
    RequireProperties requireGather = RequireProperties.of(PhysicalProperties.GATHER);

The gather requirement could be a static member of RequireProperties.

RequireProperties has a `List<PhysicalProperties> properties` that matches the required physical properties of each child, so `RequireProperties.GATHER` could only be used for a single child and would not be very general.
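
The point about one entry per child can be illustrated with a simplified sketch. The classes below are hypothetical reductions of `RequireProperties` and `PhysicalProperties`; only the `of(...)` factory and the `properties` list come from the discussion, and `fitsPlanWithChildren` is an invented helper.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class RequirePropertiesSketch {
    enum PhysicalProperties { ANY, GATHER }

    // Simplified stand-in: one required property per child of the plan node.
    static final class RequireProperties {
        final List<PhysicalProperties> properties;

        private RequireProperties(List<PhysicalProperties> properties) {
            this.properties = Collections.unmodifiableList(properties);
        }

        static RequireProperties of(PhysicalProperties... perChild) {
            return new RequireProperties(Arrays.asList(perChild.clone()));
        }

        // Hypothetical helper: a requirement list only fits a node with the same number of children.
        boolean fitsPlanWithChildren(int childCount) {
            return properties.size() == childCount;
        }
    }

    public static void main(String[] args) {
        RequireProperties unaryGather = RequireProperties.of(PhysicalProperties.GATHER);
        RequireProperties binary = RequireProperties.of(PhysicalProperties.GATHER, PhysicalProperties.ANY);
        System.out.println(unaryGather.fitsPlanWithChildren(1));
        System.out.println(unaryGather.fitsPlanWithChildren(2));
        System.out.println(binary.fitsPlanWithChildren(2));
    }
}
```

A shared constant built from `of(PhysicalProperties.GATHER)` would describe exactly one child, which is why it only suits single-child nodes such as the aggregate here.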
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
Proposed changes
refactor
- `GATHER` physicalProperties for query, because a query always collects its result to the coordinator; using `GATHER` may select a better plan
- `NormalizeAggregate`
- `LogicalAggregate`, like `AggPhase`, `isDisassemble`
- `AggregateDisassemble` and `DistinctAggregateDisassemble`, and use `AggregateStrategies` to generate various kinds of PhysicalHashAggregate, like `two phases aggregate` and `three phases aggregate`, so that cascades can automatically select the lowest-cost alternative
- `PushAggregateToOlapScan` to `AggregateStrategies`
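
The idea of generating several physical aggregate alternatives and letting the optimizer keep the cheapest one can be sketched as follows. This is an illustration only: the `Alternative` class and `cheapest` method are hypothetical, and the cost values are made-up placeholders, not measurements or the output of any real cost model.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

final class AggregateAlternativesSketch {
    // Hypothetical physical alternative with an already-computed cost.
    static final class Alternative {
        final String name;
        final double cost;

        Alternative(String name, double cost) {
            this.name = name;
            this.cost = cost;
        }
    }

    // Picks the alternative with the lowest cost; throws NoSuchElementException on an empty list.
    static Alternative cheapest(List<Alternative> alternatives) {
        return Collections.min(alternatives, Comparator.comparingDouble((Alternative a) -> a.cost));
    }

    public static void main(String[] args) {
        // Placeholder costs chosen only to exercise the selection.
        List<Alternative> alternatives = Arrays.asList(
                new Alternative("one phase aggregate", 120.0),
                new Alternative("two phases aggregate", 80.0),
                new Alternative("three phases aggregate", 95.0));
        System.out.println(cheapest(alternatives).name);
    }
}
```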
new feature
- `disable_nereids_rules` to skip some rules.

example:
The result shows that the one-stage aggregate is used.
The result is the two-stage aggregate.
Checklist(Required)