Mark wholestageCodeGen as shouldRemove when child nodes are removed #1142

Merged
amahussein merged 1 commit into NVIDIA:dev from spark-rapids-tools-860 on Jun 27, 2024

Conversation

amahussein
Collaborator

Signed-off-by: Ahmed Hussein (amahussein) [email protected]

Fixes #860, Fixes #793

This commit improves the accuracy of the Qual tool by addressing the following issues:

  • Child nodes of a wholeStageCodeGen node were not assigned to stages when they had no metrics.
  • In the corner case where all the children of a wholeStageCodeGen node are marked as shouldRemove, the node itself was still considered unsupported and still contributed to the speedup.

The changes are:

  • Propagate the stageIDs of the wholeStageCodeGen node to its child nodes.
  • Mark a wholeStageCodeGen node as shouldRemove when all of its children are marked as shouldRemove.
  • Fix the unit test that has 4 different wholeStageCodeGen nodes containing only ColumnarToRow execs.
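
The two behavioral changes listed above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ExecNode`, `propagateStages`, and `markRemovable` are hypothetical names, and two details are assumptions made for the sketch (only children with no stage of their own inherit the parent's stages, and a node with no children is never auto-marked).

```scala
// Hypothetical, simplified stand-in for the tool's exec node; not the real class.
final case class ExecNode(
    name: String,
    stages: Set[Int],
    shouldRemove: Boolean,
    children: Seq[ExecNode] = Seq.empty)

object WholeStageSketch {
  /** Assumption: only children without any stage inherit the wrapper's stage IDs. */
  def propagateStages(wholeStage: ExecNode): ExecNode = {
    val updated = wholeStage.children.map { child =>
      if (child.stages.isEmpty) child.copy(stages = wholeStage.stages) else child
    }
    wholeStage.copy(children = updated)
  }

  /** Marks the wrapper as removable when it has children and every child is removable. */
  def markRemovable(wholeStage: ExecNode): ExecNode = {
    // Assumption: an empty child list does not count as "all children removed".
    val allRemoved =
      wholeStage.children.nonEmpty && wholeStage.children.forall(_.shouldRemove)
    wholeStage.copy(shouldRemove = wholeStage.shouldRemove || allRemoved)
  }
}
```

For a wrapper in stage 3 whose only child is a `ColumnarToRow` exec with no stages and `shouldRemove = true`, applying `propagateStages` and then `markRemovable` gives the child stage 3 and sets the wrapper's `shouldRemove` to true, so the wrapper no longer counts as an unsupported node.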

@amahussein amahussein added bug Something isn't working core_tools Scope the core module (scala) labels Jun 26, 2024
@amahussein amahussein self-assigned this Jun 26, 2024
// average speedup across the execs in the WholeStageCodegen for now
val supportedChildren = childNodes.filterNot(_.shouldRemove)
val avSpeedupFactor = SQLPlanParser.averageSpeedup(supportedChildren.map(_.speedupFactor))
// TODO: revisit the logic behind adding the stages of child nodes to the current node.
Collaborator Author

@amahussein amahussein Jun 26, 2024

The allStagesIncludingChildren below has been there since the early days. I don't quite understand why the childNodes stages are appended to the wholeStageCodeGen stageIDs.
@tgravescs do you know why this was needed? If not, I suggest we remove it.

Collaborator

Looking at the comment, it's saying that a child node might be associated with a stage that the whole-stage codegen node doesn't match, so we basically add all the child node stages in. I don't remember when that can happen, but I would not remove it unless we are positive it's not needed.
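
For readers following the thread, the behavior under discussion amounts to a set union of the wrapper's stage IDs with those of its children. The sketch below is only an illustration under that reading; `StageUnionSketch` and the signature shown are hypothetical and not taken from `WholeStageExecParser`.

```scala
object StageUnionSketch {
  /**
   * Union of the wrapper's own stage IDs with every child's stage IDs.
   * A child stage that the wrapper itself did not match still ends up in the result.
   */
  def allStagesIncludingChildren(
      nodeStages: Set[Int],
      childStages: Seq[Set[Int]]): Set[Int] =
    nodeStages ++ childStages.flatten
}
```

With `nodeStages = Set(3)` and children in `Set(3)` and `Set(5)`, the result is `Set(3, 5)`: stage 5 is retained even though the wrapper was not matched to it, which is the case the existing code comment describes.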

Collaborator

So I don't think this comment was addressed before the merge? Unless you know this doesn't happen, I would prefer to see this comment removed or an issue filed to investigate it. To me, this could just lead to confusion for someone reading the code.

Collaborator Author

My bad! I haven't had my cup of coffee yet. For some reason I misread that I had 2 approvals.

Collaborator Author

Posted a follow-up: #1146

@amahussein
Collaborator Author

The bug of not assigning stages to execs inside wholeStageCodeGen also exists in the Profiler. We can either fix it in this PR or file a followup.

@nartal1 nartal1 changed the title Mark wholestageCodeGen as shouldRemove when childs are removed Mark wholestageCodeGen as shouldRemove when children are removed Jun 26, 2024
Collaborator

@nartal1 nartal1 left a comment

Thanks @amahussein ! The logic LGTM for marking wholestageCodeGen as shouldRemove.

@amahussein amahussein changed the title Mark wholestageCodeGen as shouldRemove when children are removed Mark wholestageCodeGen as shouldRemove when child nodes are removed Jun 26, 2024
@amahussein amahussein merged commit 823ac09 into NVIDIA:dev Jun 27, 2024
15 checks passed
@amahussein amahussein deleted the spark-rapids-tools-860 branch June 27, 2024 13:50
amahussein added a commit to amahussein/spark-rapids-tools that referenced this pull request Jun 27, 2024
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>

Follow-up on NVIDIA#1142 to remove TODO comment line in WholeStageExecParser
amahussein added a commit that referenced this pull request Jun 27, 2024
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>

Follow-up on #1142 to remove TODO comment line in WholeStageExecParser