PassUpDomain conservatively estimates a bounding box for IntSet when axes are first fused and then split. Code:
    import tvm

    m = tvm.convert(12)
    l = tvm.convert(6)
    A = tvm.placeholder((m, l), name='A')
    A1 = tvm.compute((m, l), lambda i, j: A[i, j], name='A1')
    A2 = tvm.compute((m, l), lambda i, j: A1[i, j] + 3, name='A2')

    s = tvm.create_schedule(A2.op)
    fused_axes = s[A2].fuse(A2.op.axis[0], A2.op.axis[1])
    xo, xi = s[A2].split(fused_axes, 12)
    s[A1].compute_at(s[A2], xo)

    print(tvm.lower(s, [A, A1, A2], simple_mode=True))
produces
    produce A2 {
      for (i.j.fused.outer, 0, 6) {
        produce A1 {
          for (i, 0, 12) {
            for (j, 0, 6) {
              A1[((i*6) + j)] = A[((i*6) + j)]
            }
          }
        }
        for (i.j.fused.inner, 0, 12) {
          A2[((i.j.fused.outer*12) + i.j.fused.inner)] = (A1[((i.j.fused.outer*12) + i.j.fused.inner)] + 3.000000f)
        }
      }
    }
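Note that the whole tensor `A1` (all 12 x 6 elements) is realized at each iteration of `i.j.fused.outer`, even though each iteration reads only 12 consecutive elements of it, i.e. two rows. A more efficient lowering would be: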
    produce A2 {
      for (i.j.fused.outer, 0, 6) {
        produce A1 {
          for (i, 0, 2) {
            for (j, 0, 6) {
              A1[((((i.j.fused.outer*2) + i)*6) + j)] = A[((((i.j.fused.outer*2) + i)*6) + j)]
            }
          }
        }
        for (i.j.fused.inner, 0, 12) {
          A2[((i.j.fused.outer*12) + i.j.fused.inner)] = (A1[((i.j.fused.outer*12) + i.j.fused.inner)] + 3.000000f)
        }
      }
    }
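The tighter bounds in the listing above can be checked by hand. The following is a minimal sketch in plain Python, independent of TVM and not a description of how PassUpDomain works internally: it enumerates the fused indices covered by one iteration of the outer loop, undoes the fuse (`fused = i * l + j`), and takes the bounding box of the resulting `(i, j)` pairs. The helper name `tight_bounds` and its parameters are hypothetical and exist only for this illustration.

```python
def tight_bounds(outer, factor=12, m=12, l=6):
    """Bounding box of the (i, j) region read by one outer iteration.

    Hypothetical helper for illustration: `factor` is the split factor,
    and (m, l) is the shape of A1 in the example from this issue.
    """
    # Fused indices covered by this iteration of i.j.fused.outer.
    fused = range(outer * factor, min((outer + 1) * factor, m * l))
    # Undo the fuse: fused = i * l + j.
    rows = [f // l for f in fused]
    cols = [f % l for f in fused]
    # Inclusive (min, max) ranges for i and j.
    return (min(rows), max(rows)), (min(cols), max(cols))


for outer in range(6):
    (i_min, i_max), (j_min, j_max) = tight_bounds(outer)
    print(outer, "i in", (i_min, i_max), "j in", (j_min, j_max))
```

For this shape, every outer iteration touches exactly two rows of `A1`, starting at row `2 * outer`, which matches the `for (i, 0, 2)` loop in the more efficient lowering. This works out exactly because the split factor (12) is a multiple of the inner extent (6); for other factors the bounding box could be larger than the region actually read.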
Related discussions:
- https://discuss.tvm.ai/t/discuss-contributing-new-docs-for-inferbound/2151/9
- https://discuss.tvm.ai/t/tensorize-which-use-case-is-correct/2140/4