Mismatch in IterDomain Iteration Types in TorchBench functorch_dp_cifar10 #2008
Comments
Just what I'm seeing for now. Here's the fusion math:
And we are using the pointwise scheduler, which seems to be failing due to the reduction-broadcast pattern.
C++ repro:
Looks like the problem is in replaying the propagation from a producer to a consumer when there is more than one trivial reduction. In the above repro, the replay works fine when there is just one trivial domain (as shown in the comments). I'll look into it more closely.
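To make the terminology concrete, here is a toy sketch in plain Python. It is not nvFuser code and does not reproduce the bug. It assumes a "trivial reduction" means a reduction over an axis of extent 1, and it models the producer-to-consumer mapping as pairing each non-reduction producer axis with the consumer axis at the same position. The names `Axis` and `producer_to_consumer_map` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Axis:
    """Toy stand-in for an iteration domain (hypothetical, not nvFuser's IterDomain)."""
    name: str
    extent: int
    is_reduction: bool = False

    @property
    def is_trivial_reduction(self) -> bool:
        # Assumption: a reduction over a size-1 axis is "trivial".
        return self.is_reduction and self.extent == 1


def producer_to_consumer_map(producer, consumer):
    """Pair producer axes with consumer axes, skipping every reduction axis.

    Reduction axes of the producer do not appear in the consumer, so all of
    them (not just the first one) have to be skipped before pairing.
    """
    kept = [ax for ax in producer if not ax.is_reduction]
    if len(kept) != len(consumer):
        raise ValueError("consumer rank must equal the number of non-reduction producer axes")
    return {p.name: c.name for p, c in zip(kept, consumer)}


# A producer with two trivial reductions, the situation described above.
producer = [
    Axis("i0", 8),
    Axis("r1", 1, is_reduction=True),
    Axis("i2", 4),
    Axis("r3", 1, is_reduction=True),
]
consumer = [Axis("c0", 8), Axis("c1", 4)]

mapping = producer_to_consumer_map(producer, consumer)
num_trivial = sum(ax.is_trivial_reduction for ax in producer)
```

In this toy model the mapping is the same whether the producer has one or several trivial reductions, which is the behavior the replay is expected to have.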
Thanks.
* Allow non-root trivial reductions
Fixes #2008
Co-authored-by: Christian Sarofeen <[email protected]>

* Fix vectorize size calculation (#2035)
* Allow non-root trivial reductions (#2037)
* Allow non-root trivial reductions
Fixes #2008
Co-authored-by: Christian Sarofeen <[email protected]>
* Test file cleanup (#2040)
* Move test_gpu.cpp to test_gpu1.cpp
* Split test_gpu1.cpp to test_gpu1.cpp, test_gpu2.cpp and test_gpu3.cpp. Each file should be up to 10K LoC. New tests should be added to test_gpu3.cpp until it gets 10K LoC.
Co-authored-by: Gao, Xiang <[email protected]>
Co-authored-by: Christian Sarofeen <[email protected]>

* Allow non-root trivial reductions (#2037)
* Allow non-root trivial reductions
Fixes #2008
Co-authored-by: Christian Sarofeen <[email protected]>
* Test file cleanup (#2040)
* Move test_gpu.cpp to test_gpu1.cpp
* Split test_gpu1.cpp to test_gpu1.cpp, test_gpu2.cpp and test_gpu3.cpp. Each file should be up to 10K LoC. New tests should be added to test_gpu3.cpp until it gets 10K LoC.
* format
* fix merge
* format
Co-authored-by: Christian Sarofeen <[email protected]>
🐛 Describe the bug
The bug is happening during scheduling:
Prescheduled Fusion IR:
Error:
The repro requires bumping the devel fork from upstream to pick up the Python frontend changes.
Repro:
Versions
Upstream TOT?