-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix two compute_with bugs. #8152
Conversation
This PR fixes a bug in compute_with, and another bug I found while fixing it (we could really use a compute_with fuzzer). The first bug is that you can get into situations where the bounds of a producer func will refer directly to the loop variable of a consumer func, where the consumer is in a compute_with fused group. In main, that loop variable may not be defined because fused loop names have been rewritten to include the token ".fused.". This PR adds let stmts to define it just inside the fused loop body. The second bug is that not all parent loops in compute_with fused groups were having their bounds expanded to cover the region to be computed of all children, because the logic for deciding which loops to expand only considered the non-specialized pure definition. So e.g. compute_with applied to an update stage would fail to compute values of the child Func where they do not overlap with the parent Func. This PR visits all definitions of the parent Func of the fused group, instead of just the unspecialized pure definition of the parent Func. Fixes #8149
Tagging Alex for review, as I think he was the last person to touch this (albeit a long time ago). |
const Stmt &produceDef = Internal::build_extern_produce(env, f, target); | ||
producer = inject_stmt(producer, produceDef, LoopLevel::inlined().lock()); | ||
const Stmt &produce_def = Internal::build_extern_produce(env, f, target); | ||
producer = inject_stmt(producer, produce_def, LoopLevel::inlined().lock()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Were these changes done automatically? Why the churn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was a drive-by fix to a location variable that didn't match our naming conventions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good... I don't totally understand the for_type
or device_api
assignments for the new loops, but it looks like it implements whatever the old policy was.
* Fix two compute_with bugs. This PR fixes a bug in compute_with, and another bug I found while fixing it (we could really use a compute_with fuzzer). The first bug is that you can get into situations where the bounds of a producer func will refer directly to the loop variable of a consumer func, where the consumer is in a compute_with fused group. In main, that loop variable may not be defined because fused loop names have been rewritten to include the token ".fused.". This PR adds let stmts to define it just inside the fused loop body. The second bug is that not all parent loops in compute_with fused groups were having their bounds expanded to cover the region to be computed of all children, because the logic for deciding which loops to expand only considered the non-specialized pure definition. So e.g. compute_with applied to an update stage would fail to compute values of the child Func where they do not overlap with the parent Func. This PR visits all definitions of the parent Func of the fused group, instead of just the unspecialized pure definition of the parent Func. Fixes halide#8149 * clang-tidy
This PR fixes a bug in compute_with, and another bug I found while fixing it (we could really use a compute_with fuzzer).
The first bug is that you can get into situations where the bounds of a producer func will refer directly to the loop variable of a consumer func, where the consumer is in a compute_with fused group. In main, that loop variable may not be defined because fused loop names have been rewritten to include the token ".fused.". This PR adds let stmts to define it just inside the fused loop body.
The second bug is that not all parent loops in compute_with fused groups were having their bounds expanded to cover the region to be computed of all children, because the logic for deciding which loops to expand only considered the non-specialized pure definition. So e.g. compute_with applied to an update stage would fail to compute values of the child Func where they do not overlap with the parent Func. This PR visits all definitions of the parent Func of the fused group, instead of just the unspecialized pure definition of the parent Func.
Fixes #8149