[TIR] Allow compute_at create block predicate for non-trivial bounds and support floordiv pattern #9527
Thanks for the PR. It is super helpful for the imperfect tiling case.
As for the region cover problem, I will look into it. It would be better to fix it before this PR is merged.
cc @junrushao1994
@@ -514,6 +605,14 @@ void ComputeAtOrReverseComputeAtImpl(ScheduleState self, const StmtSRef& block_s
/*realize=*/reconstructor.new_block_realize_,
/*loop_var_ranges=*/LoopDomainOfSRefTreePath(GetRef<StmtSRef>(block_sref->parent)),
/*analyzer=*/&analyzer);
// The verifier can not prove region cover state if some complex predicte is introduced
// so here it explicitly reset these flags below.
if (is_compute_at && !is_const_int(reconstructor.new_block_realize_->predicate)) {
It's a bug in `RegionCoverCheck`. We should fix it instead of working around it.
This is very helpful! Would love to let @Hzfengsy shepherd this PR. Thanks a lot!
CC @Hzfengsy
Add an option
f91c1fc to 15e3963
Great job! Here are some comments.
After discussion with @Hzfengsy, I decided to revert the … The original concern is the case where the desired pattern is just dynamic loop extents. Take a "cache" block as an example: the user may want to lower it into DMA operations. If the DMA intrinsic happens to support dynamic shapes but not conditional accesses, pattern matching during lowering would be non-trivial.
In your example, why is the extent of ax0 and ax1 10?
This is the extent needed to cover the region required by the compute block's reads.
15e3963 to 0280297
0280297 to dca31be
LGTM. Thanks @wrongtest for the hard and long-term work!
…and support floordiv pattern (apache#9527) * allow generate block predicate in compute_at schedule * revert apache#9880 and add more testcases
Hi there~ This PR is an enhancement for the `compute_at` and `reverse_compute_at` primitives. Binding a block into loops may create non-trivial iter bounds. Complex iter bounds are neither human-friendly nor compatible with backend passes that target bounds and conditions (e.g., loop partition). So this PR tries to distinguish some of the complex bounds and use block predicates to make the IR structure simpler.

A working example is below. We want to create spatial tiles and read each tile's data from cache, so the schedule operation is to `compute_at` the cache_read block into the tiled loops.

Mainstream code will produce

The PR will produce

The modification is to delay the intersection of the intset deduced from required uses and the intset enforced by the buffer shape / original iter bound. Instead of intersecting the intsets directly (which can create very complex min/max expressions), a `BlockVarDomainInfo` class is added to maintain the two intsets, named `dom` and `bound`. Finally, the implementation can choose between the following with some heuristic:

- (`dom` ^ `bound`) as the iter domain, if it is simple enough
- `dom` as the iter domain, with a block predicate added for `bound`
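
To make the two options concrete, here is a minimal plain-Python sketch of an imperfect tiling. It is not TVM code and not the PR's implementation: the axis length `N`, the tile size `TILE`, and both function names are assumptions chosen for illustration. The first function mirrors intersecting `dom` with `bound` (giving a `min` loop extent), the second keeps the simple `dom` extent and guards the body with a condition that plays the role of the block predicate.

```python
# Plain-Python illustration (hypothetical names, not TVM code):
# tile an axis of length 10 by 4, so the last tile is partial.
N, TILE = 10, 4
NUM_TILES = (N + TILE - 1) // TILE  # ceil-div: 3 tiles


def visit_with_min_extent():
    """Option 1: dom ^ bound as the iter domain, i.e. a min() loop extent."""
    visited = []
    for io in range(NUM_TILES):
        # The extent depends on the outer loop variable through min().
        for ii in range(min(TILE, N - io * TILE)):
            visited.append(io * TILE + ii)
    return visited


def visit_with_predicate():
    """Option 2: dom as the iter domain, plus a predicate for bound."""
    visited = []
    for io in range(NUM_TILES):
        # The extent stays a simple constant.
        for ii in range(TILE):
            if io * TILE + ii < N:  # plays the role of the block predicate
                visited.append(io * TILE + ii)
    return visited
```

Both functions visit exactly the indices 0 through 9. The second form keeps loop extents constant, which is the simpler structure the description argues for. The first form avoids conditional accesses, which is the concern raised in the review about intrinsics that accept dynamic extents but not conditions.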

The PR also adds minimal support for analyzing floordiv/floormod in the provide-required region mapping.
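
As a rough illustration of the arithmetic behind a floordiv pattern, the sketch below maps a required inclusive interval on `v // factor` back to an inclusive interval on `v`. This is only an assumption about the kind of reasoning involved, not the PR's analysis code; `required_var_range` and `brute_force_var_range` are hypothetical helpers, and a positive `factor` is assumed.

```python
def required_var_range(lo, hi, factor):
    """Inclusive range of v with lo <= v // factor <= hi (floor division).

    Assumes factor > 0. v // factor >= lo holds iff v >= lo * factor, and
    v // factor <= hi holds iff v <= hi * factor + factor - 1.
    """
    return lo * factor, hi * factor + factor - 1


def brute_force_var_range(lo, hi, factor, search=range(-100, 100)):
    """Reference check: scan candidate values of v and keep the matches."""
    hits = [v for v in search if lo <= v // factor <= hi]
    return hits[0], hits[-1]
```

For example, requiring `v // 4` to lie in `[1, 2]` gives `v` in `[4, 11]`, and the brute-force helper agrees with the closed form for small inputs.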