-
Notifications
You must be signed in to change notification settings - Fork 5.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
planner: Add HashJoin<-Receiver specific physicalPlan column pruner #38536
Conversation
Signed-off-by: yibin <[email protected]>
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. Reviewer can indicate their review by submitting an approval review. |
/cc @windtalker @winoros |
/cc @chrysan |
/rebuild |
/re-build |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, you need to change the struct from
type PhysicalExchangeSender struct {
basePhysicalPlan
...
}
to
type PhysicalExchangeSender struct {
physicalSchemaProducer
...
}
The first one means that it doesn't hold its own schema(output columns), just uses the child's.
And the possible place that we would meet the same issue is the place where the parent doesn't need the full schema of the child. A clear way may be to pass the parent col to the physical property. And check the parent col when you create a sender. The above solution is based on the fact that we don't project the operator's columns which are not needed by their parent(A inline projection inside the operator). If we support the inline project inside the operator. We don't need these things. |
Signed-off-by: yibin <[email protected]>
Signed-off-by: yibin <[email protected]>
planner/core/optimizer_test.go
Outdated
"github.com/pingcap/tidb/expression" | ||
"github.com/pingcap/tidb/parser/ast" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will fail the static check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Signed-off-by: yibin <[email protected]>
/cc @Yisaer |
/run-unit-tests |
Signed-off-by: yibin <[email protected]>
Signed-off-by: yibin <[email protected]>
Signed-off-by: yibin <[email protected]>
Signed-off-by: yibin <[email protected]>
Signed-off-by: yibin <[email protected]>
Signed-off-by: yibin <[email protected]>
Signed-off-by: yibin <[email protected]>
/run-unit-tests |
Signed-off-by: yibin <[email protected]>
/run-unit-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is possible to add a FT for the case?(check the result of explain select ...
) rest LGTM
planner/core/optimizer.go
Outdated
for i, mppCol := range sender.HashCols { | ||
exprCols[i] = mppCol.Col | ||
} | ||
exprUsed := expression.GetUsedList(exprCols, sender.Schema()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe hashCols
and hashUsed
are more clear than exprCols
and exprUsed
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice suggestion, done.
planner/core/optimizer.go
Outdated
ch := sender.children[0] | ||
proj := PhysicalProjection{ | ||
Exprs: usedExprs, | ||
}.Init(sctx, ch.statsInfo(), 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to set selectBlockOffset
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Set to child's selectBlockOffset now.
Signed-off-by: yibin <[email protected]>
/run-mysql-test |
/merge |
This pull request has been accepted and is ready to merge. Commit hash: 0e036e4
|
TiDB MergeCI notify🔴 Bad News! New failing [2] after this pr merged.
|
Signed-off-by: yibin [email protected]
What problem does this PR solve?
Issue Number: close #38511
Problem Summary:
In some situations, current ColumnPrune optimizer for logical plan can't eliminate useless columns in exchange sender/receivers whose parent node's HashJoin. And such useless columns consume significant CPU utilities and network bandwith.
For excample, a HashJoin node may has following plan, the ExchangeSender/Receiver has useless cols: col1, col2:
// HashJoin[col0, col3; col0==col3] <- ExchangeReceiver[col0, col1, col2] <- ExchangeSender[col0, col1, col2] <- Selection[col0, col1, col2; col1 < col2] <- TableScan[col0, col1, col2]
//<- ExchangeReceiver1[col3] <- ExchangeSender1[col3] <- TableScan1[col3]
Although it is hard to figure out the exact root cause of these cases, it is easy to add a simple physical column pruner to eliminate useless columns.
In this PR, choose to add a physical projection under PhysicalExchangeSender to do the column pruning. It is considered safer than directly pruning the useless columns. And it costs a little, because the projection will be executed in memory.
And it can be considered a double check even if the root cause is completed fixed in future.
What is changed and how it works?
Check List
Tests
TPCH_100, before and after this PR, query results match.
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.