[TVMScript] Support TVMScript template meta-programming over variables #11097
Conversation
To be clear, it's a certain limited level of metaprogramming, not the entire thing :-)

Amazing! I'll soon start tensorization using MMA intrinsics, and for that I absolutely need parameterization over shape and dtype to support all variants. Also, I'd love to clean up the various dot product intrinsics in https://github.com/apache/tvm/tree/main/python/tvm/tir/tensor_intrin

I would love to wait for a week and have @Lunderberg review this PR before merging, to make sure it satisfies Eric's usecase. What do you guys think?
Thank you, and the functionality so far looks really good! I like the use of `inspect.getclosurevars` to avoid needing an additional argument to `T.prim_func`.

There's one specific issue I ran into when using the parameters as part of expressions, where I have a recommendation to fix it.

There are also a couple of structures that I'm currently using that don't work (e.g. a `T.Buffer[shape, "int32"]`, where `shape` is a tuple in the parent scope). I'll see if I can narrow them down into individual testable structures, either for this PR or a follow-up PR.
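As a stdlib-only illustration (independent of TVM, with hypothetical function names) of what `inspect.getclosurevars` reports — both module-level globals and enclosing-scope nonlocals that a function refers to are captured, which is exactly what lets a meta-parameter look like an ordinary Python variable:

```python
import inspect

offset = 1  # a module-level "meta-parameter" (illustrative, not TVM code)

def make_func():
    scale = 2  # captured as a nonlocal by the inner function

    def func(x):
        return x * scale + offset

    return func

f = make_func()
cv = inspect.getclosurevars(f)
print(cv.nonlocals)  # {'scale': 2}
print(cv.globals)    # {'offset': 1}
```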
```diff
@@ -233,7 +240,7 @@ def lookup_symbol(self, name: str) -> Optional[Union[Buffer, Var]]:
         for symbols in reversed(self.symbols):
             if name in symbols:
                 return symbols[name]
-        return None
+        return self.closure_vars.get(name)
```
I'm running into some errors when a closure variable isn't already a `PrimExpr`. In the example below, this will cause an error message saying that it cannot add together a `PrimExpr` and an int.
```python
offset = 1

@T.prim_func
def func(A: T.Buffer[(1,), "int32"], B: T.Buffer[(1,), "int32"]):
    B[0] = A[0] + offset
```
Whenever we pull a variable out of the closure, can we run it through `tvm.runtime.convert`? That way, any expression type supported by the FFI would be converted to a TIR-supported format.
```python
if name in self.closure_vars:
    return tvm.runtime.convert(self.closure_vars[name])
```
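A plain-Python sketch of this fallback (the class and the `convert` stand-in are illustrative, not TVM's actual implementation): the symbol table is searched innermost-scope-first, and only on a miss do we fall back to the captured closure variables, converting raw Python values so downstream arithmetic sees a uniform expression type.

```python
def convert(value):
    # Stand-in for tvm.runtime.convert: wrap raw Python ints so later
    # arithmetic operates on a uniform "expression" representation.
    if isinstance(value, int):
        return ("IntImm", value)
    return value

class SymbolTable:
    def __init__(self, closure_vars):
        self.symbols = [{}]            # stack of scopes, innermost last
        self.closure_vars = closure_vars

    def lookup_symbol(self, name):
        # Search scopes innermost-first, as in the diff above.
        for symbols in reversed(self.symbols):
            if name in symbols:
                return symbols[name]
        # Fall back to closure variables, converting on the way out.
        if name in self.closure_vars:
            return convert(self.closure_vars[name])
        return None

table = SymbolTable({"offset": 1})
print(table.lookup_symbol("offset"))  # ('IntImm', 1)
```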
And it looks like this recommended change also causes some of the passing tests in https://gist.github.com/Lunderberg/dd38f82810e4e06c0834087d4a96bda9 to fail, such as using a meta-parameter as a loop iteration bound. But that's more than I can debug on a Friday afternoon.
I put together a gist with some meta-parameter tests. The convolution at the start is the sort of thing I'd like to be able to express. The later tests were me experimenting with things that felt like they should work, but not all of them do. The main categories of failure I ran into were:
Thanks @Lunderberg for bringing up such test cases and issues, which are valuable and need more effort. Unfortunately I don't have much bandwidth for this specific usecase, but I'm happy to review if you submit subsequent changes.
Thanks @Hzfengsy for sharing! Yeah, it's completely understandable that Siyuan doesn't have enough time lately to do another PR to iterate on this one. I would say the charm of an OSS community is that people enjoy the freedom to collaborate publicly without boundaries, getting the recognition, credit, and ownership they deserve as they contribute, no matter whether the original owner has enough bandwidth or not :-)

In fact, our group (@yelite @juda @cyx-6) is going to push for an ambitious complete refactoring of the TVMScript printer/parser for better extensibility (e.g. for Relax and other customized IRs) and full-featured metaprogramming (as opposed to this limited one), but it will require some non-trivial time (until the end of Q2). We mentioned this project a couple of times internally, but the full text for OSS is still under active construction - happy to share the preliminary/immature proposal.

The reason I pinged Siyuan to submit this quick patch, which supports only a certain degree of limited metaprogramming, is that I saw @masahi and @Lunderberg have some imminent product needs for this feature (e.g. tensorization), and I therefore believe that:
This PR demonstrates the conceptual idea of this preliminary syntax for metaprogramming, as Eric and Masa are already familiar with:

```python
def matmul_generator(M, N, K, dtype):
    @T.prim_func
    def matmul(a: T.handle, b: T.handle, c: T.handle) -> None:
        A = T.match_buffer(a, [M, K], dtype=dtype)
        B = T.match_buffer(b, [N, K], dtype=dtype)
        C = T.match_buffer(c, [M, N], dtype=dtype)
        for i, j, k in T.grid(M, N, K):
            with T.block():
                vi, vj, vk = T.axis.remap("SSR", [i, j, k])
                with T.init():
                    C[vi, vj] = T.float32(0)
                C[vi, vj] = C[vi, vj] + A[vi, vk] * B[vj, vk]
    return matmul

f_1 = matmul_generator(128, 128, 128, "float16")
f_2 = matmul_generator(256, 256, 256, "float32")
```

The action I propose to take:

Let me know what you guys think!
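The generator pattern above can be sketched in plain Python, independent of TVM (all names here are hypothetical): a factory closes over the shape/dtype parameters and returns a function specialized to them, just as `matmul_generator` returns a specialized `PrimFunc` for each shape/dtype combination.

```python
def saxpy_generator(n, dtype):
    # Close over n and dtype; each call produces a function specialized to them.
    def saxpy(a, x, y):
        assert len(x) == len(y) == n, "shape is fixed at generation time"
        return [dtype(a * xi + yi) for xi, yi in zip(x, y)]

    return saxpy

f_f32 = saxpy_generator(3, float)   # analogous to matmul_generator(128, 128, 128, "float32")
print(f_f32(2, [1, 2, 3], [4, 5, 6]))  # [6.0, 9.0, 12.0]
```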
What if we'd like to distinguish variables and other ASTs in the script from variables in the host language? For example, with a mustache syntax (like the HTML templates of Vue.js), the case above could be:

```python
def matmul_generator(M, N, K, dtype):
    @T.prim_func
    def matmul(a: T.handle, b: T.handle, c: T.handle) -> None:
        A = T.match_buffer(a, [{M}, {K}], dtype={dtype})
        B = T.match_buffer(b, [{N}, {K}], dtype={dtype})
        C = T.match_buffer(c, [{M}, {N}], dtype={dtype})
        for i, j, k in T.grid({M}, {N}, {K}):
            with T.block():
                vi, vj, vk = T.axis.remap("SSR", [i, j, k])
                with T.init():
                    C[vi, vj] = T.float32(0)
                C[vi, vj] = C[vi, vj] + A[vi, vk] * B[vj, vk]
    return matmul
```

And we could also write things like:

```python
T.evaluate({ [call my expr ib builder] })
```

or

```python
for i in range({N}):
    {
        [call my ib builder to create sequence of stmts]
    }
```
Yes, I believe the new capability suffices for my need, and since the required change is small, I think we should land it as is now.
This is a very valid point, but introducing a quotation syntax extension like that really gets us into the realm of full-blown multi-stage programming systems. Things can get complicated pretty fast if we want to discuss what exactly we can put inside the quote (e.g. do we allow arbitrary Python expressions? do we allow referring to TVMScript values inside a quote? etc.). I believe a thorough design discussion is needed before we formally introduce such syntax extensions. Good examples of prior art from PL include:
This is sufficient for my use, yes. My primary use case is in writing unit tests, with various shape parameters.
I think I'd lean against having a different syntax. With the current implementation, meta-parameters in TVMScript appear to a user as if they are nonlocal variables in a Python function, which becomes a very easy explanation to teach new users. If we need to make this distinction while parsing to TIR, that can be done by querying the context manager. If we need to make this distinction after generating TIR, I think that would be better done using
I agree, changing the title would help emphasize that this is one specific metaprogramming feature, and not the final state of metaprogramming for TVMScript.
I'm down for any of these options. I think it would make the most sense to have separate PRs, since this PR alone provides tremendous utility, and additional features can be added as bandwidth allows and needs come up.
Thank you @junrushao1994 for moderating and proposing the actionable items, and thank you all @Lunderberg @masahi @wrongtest for the very inspiring discussion so far! Learned quite a lot by just reading this thread :-)
Looks like we have reached consensus - excited to see it works for your usecases!
Thanks. I'm going to change the title to "[TVMScript] Support TVMScript template meta-programming over variables"
Yay! The CI is green now!
I agree with Eric that having separate PRs makes the most sense. No need to add me as a co-author if I didn't contribute, but I'm happy to review! Besides, while understanding that RFC drafting is time-consuming and painstaking, multiple active community members have already been showing great interest in TVMScript, specifically metaprogramming. To this end, I have a meta point: A5. Even if we don't have a final draft, @yelite @junrushao1994 would you like to share some preliminary motivating examples or design ideas with the community for broader discussion, also to avoid potential duplication of effort?
Thanks for the discussion so far! I think the mission of this PR is basically complete, so I'm going to approve.
I'm personally more than happy to open a discussion thread with motivating examples in metaprogramming, sharing immature POCs, and hearing from the community.
LGTM
Thanks for the discussion! I think all of us have reached consensus on A1-A4. @Lunderberg could you please take another look at it?
Certainly! Looking through, everything looks good to me!
apache#11097) This PR supports a simple meta-programming paradigm for TVMScript, which allows users to get access to var definitions in the Python environment.
This PR adds `te.extern_primfunc` which provides the interface around TE ExternOp that allows a TVMScript defined schedulable TIR PrimFunc to be inlined into a TE compute graph. The result is that TIR can be used for compute definitions in Relay OpStrategies and, paired with meta-scheduler support in relay as introduced in #10578, these compute definitions can be scheduled and tuned as demonstrated in the attached tests. Prior to this, compute definitions were limited to those definable in TE only. As a consequence of this patch and ongoing improvements to TVMScript meta-programming (#11097), TOPI can be extended to include compute and scheduling functions targeting schedulable TIR uniformly.
cc @junrushao1994 @Lunderberg @spectrometerHBH @wrongtest