[TIR][Bugfix] Improved massive build times caused by tir.floormod and tir.floordiv. Fixed Topi testcase. #5666
Reviewed. |
Thanks @dpankratz. In this particular case, maybe we could try to fix the integer lowering path. How about wrapping the temp value creation with a LetNode?
@tqchen I'm not sure how to gracefully add a LetNode since I'm running into trouble creating a variable with a unique name. For example, using a simple counter to create variables of the form I've tried searching for other places where creating variables with unique names is handled and haven't seen anything which makes me uneasy. Any suggestions? |
The name hint of the variable is not required to be unique, as the address of the variable is used to uniquely identify the var.
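The point about identity can be illustrated with a small model in plain Python. This is only a sketch: the `Var` class below is a hypothetical stand-in, not TVM's variable type, and it relies on Python's default identity-based hashing to play the role of the variable's address.

```python
class Var:
    """Hypothetical stand-in for a TIR variable; only the name hint is modeled."""

    def __init__(self, name_hint):
        self.name_hint = name_hint


# Two temporaries that share the same name hint.
x1 = Var("tmp")
x2 = Var("tmp")

# Objects hash and compare by identity by default, so both can be keys
# in the same map even though their name hints collide.
bindings = {x1: "a + 1", x2: "b * 2"}

same_name = x1.name_hint == x2.name_hint  # the hints are equal
distinct = x1 is not x2                   # the variables are still different
```

Under this model no counter or name mangling is needed when creating temporaries; a fresh object is already unique.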
cc @dprankratz can we followup |
@tqchen I did some more digging into using let nodes. Consider the following expression:
Then composing those with another
The net effect is that multiple Is there an easy solution to this problem? |
@dpankratz sorry for the delayed reply. #5949 relaxes the constraint on Let to allow a single var to be bound multiple times as long as the binding values are the same. Can you please try again?
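As a rough illustration of why binding temporaries with Let can help build times, here is a toy cost model. It is not TVM's actual lowering: it assumes each lowering step mentions its operand `copies` times, and the function names and the default `copies=3` are hypothetical.

```python
def naive_size(depth, copies=3):
    """Node count when every step inlines its operand `copies` times."""
    size = 1  # a single leaf variable
    for _ in range(depth):
        # `copies` full copies of the operand plus one new operator node
        size = copies * size + 1
    return size


def let_size(depth, copies=3):
    """Node count when every step binds its operand once and refers to it by name."""
    size = 1  # a single leaf variable
    for _ in range(depth):
        # the operand appears once, plus `copies` variable references and one operator
        size = size + copies + 1
    return size


for depth in (1, 5, 10, 20):
    print(depth, naive_size(depth), let_size(depth))
```

In this model the inlined form grows exponentially with nesting depth while the let-bound form grows linearly, which is consistent with the hangs described for deeply nested expressions.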
src/tir/transforms/lower_intrin.cc
*/
// floor(a / b)
auto fdtype = DataType::Float(dtype.bits() * 2, dtype.lanes());
Let us also use the rules in the else branch instead (in case the target does not support floating point but does support integer arithmetic).
I changed the logic in the most recent commit. Thanks for the suggestion!
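For readers following along, one common way to get floor semantics using only integer arithmetic is to start from truncating division and correct the result when the remainder and divisor have different signs. The sketch below shows that general technique in Python; it is not taken from lower_intrin.cc, and the helper names are hypothetical.

```python
def truncdiv(a, b):
    """Integer division rounding toward zero, as C-like targets provide."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def floordiv_int(a, b):
    """Floor division built from truncating division plus a correction."""
    q = truncdiv(a, b)
    r = a - q * b
    # A nonzero remainder whose sign differs from the divisor means the
    # truncated quotient is one larger than the floored quotient.
    if r != 0 and (r < 0) != (b < 0):
        q -= 1
    return q


def floormod_int(a, b):
    """Floor modulo: the result takes the sign of the divisor."""
    r = a - truncdiv(a, b) * b
    if r != 0 and (r < 0) != (b < 0):
        r += b
    return r


print(floordiv_int(-7, 2), floormod_int(-7, 2))
print(floordiv_int(7, -2), floormod_int(7, -2))
```

Python's own `//` and `%` operators already use floor semantics on integers, so they serve as a convenient reference when checking a sketch like this.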
src/tir/transforms/lower_intrin.cc
// a - floor(a / b) * b
auto fdtype = DataType::Float(dtype.bits() * 2, dtype.lanes());
auto div = tir::Div(tir::Cast(fdtype, op->a), tir::Cast(fdtype, op->b));
Let us also use the rules in the else branch instead (in case the target does not support floating point but does support integer arithmetic).
…n np floor_div and fmod.
…orDiv using let nodes
…that don't support it.
Sorry for the long delay. Thanks @dpankratz! This PR is now merged.
… tir.floordiv. Fixed Topi testcase. (apache#5666)
* Improved uncommon case of floormod and floordiv. Removed dependence on np floor_div and fmod.
* Fixed clang-format complaints
* Streamlined floormod and floordiv lowering logic
* Improved build times by expressing int64 case of tir FloorMod and FloorDiv using let nodes
* Updated use-def analysis and llvm codegen to support duplicated letnodes.
* Corrected misuse of var_map_ in llvm codegen
* Updated backends that support LetNode
* Changed floormod and div lowering logic to avoid using FP on systems that don't support it.
* Fixed formatting
Co-authored-by: pankratz <[email protected]>
Build times
I experienced hangs when attempting to build a compute statement that depended on deep TIR expressions involving the tir.floormod and tir.floordiv operators. The build times can be reproduced with the script here. This is due to the very complicated TIR expression generated when FloorMod or FloorDiv is lowered.
I was able to improve upon this by changing how these operators are lowered to use the tir.floor intrinsic, which also has the benefit of matching the definitions of the operators.
Topi testcase
There are existing tests for tir.floormod and tir.floordiv through the topi test_topi_broadcast.py file. However, I noticed that this file used np.fmod and np.floor_divide as points of reference, which are not equivalent to floormod and floordiv.
I fixed the testcase to use a correct version of floormod and floordiv.
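To make the mismatch concrete, the following sketch compares the floor-based definitions with an fmod-style remainder. It uses `math.fmod` from the Python standard library, which, like np.fmod, returns a result with the sign of the dividend. The helper names are hypothetical, and the float-based `math.floor(a / b)` is only exact for small integers such as the ones used here.

```python
import math


def floordiv(a, b):
    """floordiv as floor(a / b); exact only while a / b is exactly representable."""
    return math.floor(a / b)


def floormod(a, b):
    """floormod as a - floor(a / b) * b; the result takes the sign of the divisor."""
    return a - math.floor(a / b) * b


# The two remainders agree when both operands share a sign
# and disagree when the signs are mixed.
for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    print(a, b, floormod(a, b), math.fmod(a, b))
```

For example, with a = -7 and b = 2 the floor-based remainder is 1 while the fmod-style remainder is -1.0, so a reference built on fmod would flag correct floormod results as failures for mixed-sign inputs.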