Currently, this is how we decide whether a WholeStageCodegen exec is supported:
// if any of the execs in WholeStageCodegen supported mark this entire thing
// as supported
val anySupported = childNodes.exists(_.isSupported == true)
val unSupportedExprsArray = childNodes.filter(_.unsupportedExprs.nonEmpty ).map(
x => x.unsupportedExprs).flatten.toArray
// average speedup across the execs in the WholeStageCodegen for now
There is a corner case when the WholeStageCodegen contains an exec like `ColumnarToRow`.
In that situation the WholeStageCodegen is marked as unsupported, even though the `ColumnarToRow` is supposed to be removed.
Instead, the WholeStageCodegen should either be set to shouldRemove or be marked as supported, because it is theoretically empty.
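The corner case can be reproduced with a minimal sketch. `ExecNode` and `CornerCaseSketch` are hypothetical stand-ins, not the tool's real classes, and the sketch assumes (as the description above implies) that a `ColumnarToRow` child is flagged as removable but not as supported.

```scala
// Hypothetical, simplified model of an exec node; not the tool's real class.
final case class ExecNode(
    name: String,
    isSupported: Boolean,
    shouldRemove: Boolean = false,
    unsupportedExprs: Seq[String] = Seq.empty)

object CornerCaseSketch {
  // The rule quoted in the issue: the wrapper is supported if any child is.
  def anySupported(childNodes: Seq[ExecNode]): Boolean =
    childNodes.exists(_.isSupported)

  // Assumption: ColumnarToRow is removable and therefore not counted as supported.
  val onlyTransition: Seq[ExecNode] =
    Seq(ExecNode("ColumnarToRow", isSupported = false, shouldRemove = true))

  // A wrapper that also holds a supported exec, for comparison.
  val withFilter: Seq[ExecNode] =
    onlyTransition :+ ExecNode("Filter", isSupported = true)
}
```

With these inputs, `anySupported(onlyTransition)` is false, so the wrapper is reported as unsupported although its only child is expected to disappear, while `anySupported(withFilter)` is true.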
Agreed, we should just mark it as should-remove, like we do when it's not in a WholeStageCodegen. If it is the only exec in there, we could just remove the entire WholeStageCodegen. That does bring up an interesting question though: if the CPU plan has this transition, it might go away completely on the GPU if everything stays columnar, which could make the speedup better. For the qual tool speedup calculation, if we remove it we will likely spread that time across the other execs. I don't think we have any logic complex enough to deal with that very well right now.
amahussein added a commit to amahussein/spark-rapids-tools that referenced this issue on Jun 26, 2024
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>
Fixes NVIDIA#860
This commit adjusts the accuracy of the Qual tool by targeting the
following issues:
- child nodes of wholeStageCodeGen would not be assigned to stages if
they have no metrics.
- there is a corner case when all the children of wholeStageCodeGen are
marked as shouldRemove. In that case, the node would still be
considered unsupported and contribute to the speedup.
The changes are:
- propagate the stageIDs of wholeStageCodeGen to the child nodes
- a wholeStageCodeGen node is marked as shouldRemove when all the children
are marked as shouldRemove.
- fix unit-test which has 4 different wholeStageCodeGen nodes that
contain only `ColumnarToRow` execs
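
The first two changes listed above can be sketched roughly as follows. This is not the code from the commit: `PlanNode`, `WholeStageFixSketch`, and `wrap` are hypothetical names, and the sketch assumes that a child "with no metrics" shows up as a node with an empty set of stage IDs.

```scala
// Hypothetical, simplified node model; names are illustrative only.
final case class PlanNode(
    name: String,
    isSupported: Boolean,
    shouldRemove: Boolean,
    stages: Set[Int] = Set.empty,
    children: Seq[PlanNode] = Seq.empty)

object WholeStageFixSketch {
  // Build a WholeStageCodegen-like parent node from its children.
  def wrap(stages: Set[Int], children: Seq[PlanNode]): PlanNode = {
    // Children that carry no stage IDs of their own inherit the parent's.
    val withStages = children.map { c =>
      if (c.stages.isEmpty) c.copy(stages = stages) else c
    }
    // The parent is removable only when it has children and all are removable.
    val removable = withStages.nonEmpty && withStages.forall(_.shouldRemove)
    PlanNode(
      name = "WholeStageCodegen",
      isSupported = withStages.exists(_.isSupported),
      shouldRemove = removable,
      stages = stages,
      children = withStages)
  }
}
```

For example, wrapping a single removable `ColumnarToRow` child yields a parent with `shouldRemove` set, whereas adding a supported, non-removable child keeps the parent in place and marks it supported.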