[BYOC] Add example of Composite + Annotate for DNNL fused op #5272
Conversation
@tqchen I got a segfault in CI that is not related to this PR. Can you have a look?
Thanks @masahi for the contribution. I think the constant problem is tricky for the C source module, as it would generate a huge source file for compilation. I have some initial thoughts and code on how to solve the problem. I need to think about it a bit more and will file an RFC when I have some cycles in a few weeks.
Also added support for the conv2d + relu pattern. This shows how easy it is to add a new pattern and support it in the codegen.
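For illustration, such a conv2d + relu composite pattern can be sketched as a small Relay expression handed to MergeComposite. This is a minimal sketch, not the exact code added in this PR: the pattern name "dnnl.conv2d_relu" and the helper make_conv_relu_pattern are assumptions, and the MergeComposite pattern-table format has changed across TVM versions.

import tvm
from tvm import relay

def make_conv_relu_pattern():
    # Relay expression describing the subgraph to fuse: conv2d followed by relu.
    data = relay.var("data")
    weight = relay.var("weight")
    return relay.nn.relu(relay.nn.conv2d(data, weight))

# A tiny network that contains the pattern.
data = relay.var("data", shape=(1, 3, 224, 224))
weight = relay.var("weight", shape=(16, 3, 3, 3))
out = relay.nn.relu(relay.nn.conv2d(data, weight, kernel_size=(3, 3), channels=16))
mod = tvm.IRModule.from_expr(relay.Function([data, weight], out))

# Pattern table: (composite name, pattern expression) pairs consumed by MergeComposite.
pattern_table = [("dnnl.conv2d_relu", make_conv_relu_pattern())]
mod = relay.transform.MergeComposite(pattern_table)(mod)
print(mod)  # the conv2d + relu pair is now wrapped in a function tagged Composite="dnnl.conv2d_relu"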
@tqchen Can you have a look at the CI problem? https://ci.tvm.ai/blue/organizations/jenkins/tvm/detail/PR-5272/7/pipeline/152
Please try to retrigger #5285
Yes, I got that error too, but there is also a problem on the i386 CI causing a segfault from crt: https://ci.tvm.ai/blue/organizations/jenkins/tvm/detail/PR-5272/7/pipeline/152. PR #5280 also had the same segfault on i386.
Although I still feel this is a bit ad-hoc, it should be fine for now as we only use DNNL as an example to demonstrate external codegen.
In addition, it seems to me that we don't need this check anymore because DNNL codegen now supports a subgraph with multiple ops. @masahi @zhiics please confirm:
void GenDNNLFunc(const Function& func) {
CHECK(func.defined()) << "Input error: expect a Relay function.";
const auto* call = func->body.as<CallNode>();
CHECK(call) << "DNNL expects a single convolution or dense op"; // This check.
What is still ad-hoc? I think I explained the motivation clearly in the original PR #4741, and the current implementation using composite is the best we could do. Wanting fused-op support in external codegen is natural, and @alexbooth already finds it useful.
Sorry for being unclear. The motivation of using fused ops to demonstrate composite functions is definitely clear. What I meant was the general implementation of the DNNL codegen, as I've mentioned in #4741, so I'm good with this PR.
@masahi please rebase when you get a chance.
@masahi just a reminder, you probably didn't rebase against master; other changes are here as well.
Yes, I'm working on fixing it. Resolving the conflict turned out to be a bit complicated, so I wanted to start fresh.
Hey, sorry for that. Yes, that was a mistake. When I was rebasing onto master, the commit that decouples the DNNL and VM parts was reapplied, so #5288 was reverted as you pointed out. If possible, would you cherry-pick it back in this PR? Thanks!
@comaniac Done. Please take a look at the last commit.
Changes to multi-output support in dnnl/codegen.cc from #5277 were also reverted, but I've already integrated that change into this PR last night.
Yes, I've also checked that those diffs are in this PR as well. Thanks.
…5272)
* merge change from dev branch
* fix string issue
* bring comanic's change back
…m_data:master to master
* commit 'cd0d52daa6942bdafa9363ff6cfa3d25fcd5b8d6': (824 commits)
  - [Intrinsic] Add log1p, ldexp, atan2, hypot, nextafter, copysign (apache#5312)
  - [Rust][CI] Restore Rust CI (apache#5137)
  - Remove PrimExpr from String (apache#5311)
  - [Requantize] Cleanup and Optimize Lowering (apache#5286)
  - [IR][TRANSFORM] Enable CopyOnWrite for passes. (apache#5309)
  - [PYTORCH]Abs, Arange, Softplus ops (apache#5295)
  - [LLVM] Fix generation of LLVM intrinsics (apache#5282)
  - [BYOC] Add example of Composite + Annotate for DNNL fused op (apache#5272)
  - [Frontend][TensorFlow]Improve TensorFlow Static Shape Tensor Array (apache#5243)
  - [RUNTIME] Introduce RValue reference(move) support to TypedPackedFunc (apache#5271)
  - [RELAY][FRONTEND][CAFFE2] add Mul and ConvTranspose operator (apache#5302)
  - [BYOC] Refine AnnotateTarget and MergeCompilerRegion Passes (apache#5277)
  - [CI] Fix the hexagon string (apache#5304)
  - [Arith] linear system and equation solver (apache#5171)
  - [PYTORCH]Repeat, Reciprocal & Reshape Op support (apache#5280)
  - [FRONTEND][TENSORFLOW] Fix gather_nd indices (apache#5279)
  - Update device_annotation.cc (apache#5291)
  - [REFACTOR][IR] Move to runtime::String (apache#5276)
  - [NDArray] Set shape_ in NDArray::FromDLPack (apache#5301)
  - [RUNTIME] Initial implementation of Hexagon runtime support (apache#5252)
  - ...
This is a reimplementation of #4741 based on the new annotator support for composite added in #5261. This is the first real use case of composite in the code base. It is purely for demonstration purposes and is not intended for performance-critical scenarios.
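For context, the demonstration follows the standard BYOC pass pipeline. The following is a rough sketch continuing from the pattern-table example above (`mod` and `pattern_table` are the hypothetical objects defined there; the test in this PR may select or order the passes slightly differently):

from tvm import relay

# Continuing from the sketch above: `mod` is a Relay IRModule and `pattern_table`
# holds the DNNL composite patterns such as "dnnl.conv2d_relu".
mod = relay.transform.MergeComposite(pattern_table)(mod)  # wrap matched subgraphs into Composite functions
mod = relay.transform.AnnotateTarget("dnnl")(mod)         # mark ops/composites supported by the dnnl codegen
mod = relay.transform.MergeCompilerRegions()(mod)         # merge adjacent dnnl regions
mod = relay.transform.PartitionGraph()(mod)               # lift annotated regions into external functions

After partitioning, the functions marked with Compiler="dnnl" are handed to the external DNNL codegen when the module is built.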
Due to the manually inlined tensors in the C codegen, full MobileNet execution is disabled in the test. I tried, but it took more than 5 GB of RAM and compilation didn't finish in a reasonable time. The related issue is discussed at https://discuss.tvm.ai/t/external-codegen-constant-tensors-in-c-codegen/5890/