Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[relay][external codegen] outline and inline lifted functions for external codegen #4996

Merged
merged 3 commits into from
Mar 7, 2020

Conversation

zhiics
Copy link
Member

@zhiics zhiics commented Mar 5, 2020

This PR outlines the functions that will be handled by external codegen and then inline them back to avoid the optimizations from Relay. All function level passes are handled directly from the pass manager. A few module level passes needs to detect if a function uses default compiler or not. This is because these passes take the functions internally and they are invisible to the pass manager.

@zhiics
Copy link
Member Author

zhiics commented Mar 5, 2020

cc @comaniac @mbaret @soiferj @tqchen @icemelon9 @masahi

@zhiics zhiics force-pushed the inline_partitioned_functions branch from 58e5b5e to b97bbc6 Compare March 5, 2020 23:08
@zhiics zhiics force-pushed the inline_partitioned_functions branch from b97bbc6 to 6d6d332 Compare March 6, 2020 00:04
@masahi masahi self-assigned this Mar 6, 2020
Copy link
Contributor

@mbaret mbaret left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR! It will help resolve a major issue in the partitioning as well as introducing some infrastructure to enable codegen-specific optimisation passes.

src/relay/pass/partition_graph.cc Outdated Show resolved Hide resolved
src/relay/backend/vm/inline_primitives.cc Show resolved Hide resolved
@masahi
Copy link
Member

masahi commented Mar 6, 2020

How about adding batch norm to the test to make sure it is still there after opt passes?

Copy link
Contributor

@comaniac comaniac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

src/relay/pass/partition_graph.cc Show resolved Hide resolved
@zhiics zhiics force-pushed the inline_partitioned_functions branch from 4dcf21d to 8554928 Compare March 6, 2020 17:55
@zhiics
Copy link
Member Author

zhiics commented Mar 6, 2020

@masahi Good point. Done

Copy link
Member

@masahi masahi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great!

@masahi
Copy link
Member

masahi commented Mar 6, 2020

Will wait for @mbaret's approval.

Copy link
Contributor

@mbaret mbaret left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, lgtm.

@masahi
Copy link
Member

masahi commented Mar 6, 2020

Also need to wait for the github bug to be resolved

@masahi masahi merged commit 28ee806 into apache:master Mar 7, 2020
@masahi
Copy link
Member

masahi commented Mar 7, 2020

Thanks @zhiics @mbaret @comaniac

@zhiics zhiics deleted the inline_partitioned_functions branch March 7, 2020 16:41
@mbaret
Copy link
Contributor

mbaret commented Mar 9, 2020

@zhiics It looks like this makes it so we can't run any passes on external functions. Is this the intended behaviour, and if so are we intending to add a mechanism at some point to allow for passes on external functions?

@zhiics
Copy link
Member Author

zhiics commented Mar 9, 2020

Yes, this is intended. We plan to do it. For example, we may want to have a build pipeline that executes target independent passes first and then target dependent passes. We can partition after platform independent passes and rely on external compilers to take care of them.

You can do this right now manually. For example, you can run constant folding, CSE, etc, before partitioning.

@mbaret
Copy link
Contributor

mbaret commented Mar 9, 2020

Beforehand, we had been running some passes after partitioning (for instance a layout transform for ACL functions). It looks like we can't even force these to run now though as no function passes are allowed to touch external functions. We can't run the passes before partitioning because it will transform the entire graph, not just the bits for external partitioning.

@zhiics
Copy link
Member Author

zhiics commented Mar 9, 2020

I personally don't think TVM should be aware of accelerator specific transformations because they are very hardware specific. For example, we don't want this on TRT, dnnl, and many other accelerators. However, you can still do this even now. You can directly invoke layout transformation without using the pass infra.

@mbaret
Copy link
Contributor

mbaret commented Mar 9, 2020

It's possible we're doing something wrong, but the function LowerExternalFunctions in compile_engine produces modules just containing external functions. We want to call passes on these modules from our ACL-specific codegen (so they'll only affect our codegen, not all TVM functions). However, it looks like the passes are skipping all the functions in these modules because they don't use the default compiler. Is there another way we can manually run the passes?

@zhiics
Copy link
Member Author

zhiics commented Mar 9, 2020

Can you post the questions on discussion forum?

@mbaret
Copy link
Contributor

mbaret commented Mar 9, 2020

Yep, I'll try and give a better explanation there in the morning.

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Apr 16, 2020
…ernal codegen (apache#4996)

* outline and inline lifted functions for external codegen

* add batch_norm test

* test batch_norm inline
zhiics added a commit to neo-ai/tvm that referenced this pull request Apr 17, 2020
…ernal codegen (apache#4996)

* outline and inline lifted functions for external codegen

* add batch_norm test

* test batch_norm inline
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants