
[RUNTIME] Support module based interface runtime #5753

Merged
merged 29 commits into from
Jul 15, 2020

Conversation

FrozenGene
Member

@FrozenGene FrozenGene commented Jun 9, 2020

This PR implements #5038.

include/tvm/runtime/graph_runtime.h (review comments resolved)
include/tvm/runtime/graph_runtime_factory.h (review comments resolved)
@tqchen tqchen self-assigned this Jun 12, 2020
@tqchen
Member

tqchen commented Jun 12, 2020

@FrozenGene please let us know when it is ready for review

@FrozenGene
Member Author

> @FrozenGene please let us know when it is ready for review

Sure. I will update the status in this PR and notify you once it is complete.

@FrozenGene FrozenGene force-pushed the model_based_runtime branch from 4ca9e56 to 5563dea Compare June 15, 2020 12:43
@FrozenGene FrozenGene force-pushed the model_based_runtime branch 2 times, most recently from 9ecd15f to 3391c66 Compare June 23, 2020 11:57
@FrozenGene
Member Author

@tqchen I think you could start reviewing the core part now; I would love to hear your feedback. All the functionality of the graph runtime is done, except the new graph_runtime API signature and the debug graph runtime / VM. For example usage, see tests/python/unittest/test_module_runtime_interface.py

@FrozenGene
Member Author

Gentle ping @tqchen. I am still working on debug_graph_runtime / VM, but I think you could start reviewing the core part :-)

@tqchen
Member

tqchen commented Jun 29, 2020

I will spend some time reviewing the PR this week.

Contributor

@ANSHUMAN87 ANSHUMAN87 left a comment


@FrozenGene : Thanks for the PR! Great work 👍
I have done an initial round of review. Please find some high-level comments and queries. Hope it helps. Thanks!

python/tvm/contrib/graph_runtime.py (review comments resolved)
python/tvm/rpc/client.py (review comments resolved)
python/tvm/runtime/graph_runtime_factory.py (4 review threads, resolved)
python/tvm/runtime/module.py (5 review threads, resolved)
src/runtime/graph/graph_runtime.h (review comments resolved)
src/runtime/graph/graph_runtime_factory.cc (3 review threads, resolved)
src/runtime/graph/graph_runtime_factory.h (review comments resolved)
@FrozenGene FrozenGene force-pushed the model_based_runtime branch 2 times, most recently from bd8063b to 2f16dde Compare July 6, 2020 10:09
python/tvm/contrib/graph_runtime.py (review comments resolved)
python/tvm/rpc/client.py (review comments resolved)
python/tvm/runtime/graph_runtime_factory.py (2 review threads, resolved)
src/runtime/graph/graph_runtime_factory.h (2 review threads, resolved)
src/runtime/graph/graph_runtime.h (review comments resolved)
python/tvm/runtime/module.py (2 review threads, resolved)
@tqchen
Member

tqchen commented Jul 6, 2020

Thanks @FrozenGene, I made some initial comments. I would like to follow up on the general design direction. The PR as it stands implements the features we want. However, it is equally important to think about minimalism.

In particular, we want to implement the feature using a minimum set of concepts (APIs). The runtime-module-based interface is more of an interface convention than a common implementation that we use for packaging. We can imagine different kinds of implementations; GraphRuntimeFactory is one of them (for graph execution). We would also like as much de-coupling as possible.

So the key challenge is: how can we implement the features using as small an API surface as possible?

We can dissect the current API into two categories of functionality:

  • F0: Load the module in, execute.
  • F1: Hold the result of relay.build, keep backward compatibility, write the module out.

Notably, F0 and F1 do not have to use the same runtime.Module implementation.

Minimum Design for F0

If we focus on F0, we find that we only need one interface for the graph runtime on the C++ side (via the Module API) -- the creation function:

from tvm.contrib import graph_runtime
gmod = graph_runtime.GraphModule(mod['resnet18'](tvm.cpu(0)))

Notably, in the use cases of F0, we do not need the GraphRuntimeFactory wrapper (the wrapper itself exists primarily for backward compatibility reasons).
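
The F0 creation-function convention above can be illustrated with a plain-Python mock (no TVM required). FakeRuntimeModule, make_loaded_library, and the "cpu(0)" context string are hypothetical stand-ins, not real TVM API; the point is only the lookup-by-name-then-call pattern behind mod['resnet18'](tvm.cpu(0)):

```python
class FakeRuntimeModule:
    """Hypothetical stand-in for tvm.runtime.Module: members are
    looked up by name via __getitem__, mirroring mod['resnet18']."""

    def __init__(self, funcs):
        self._funcs = funcs

    def __getitem__(self, name):
        return self._funcs[name]


def make_loaded_library():
    # The "library" packages one creation function per compiled network.
    # Calling it with a device context returns an executable graph module.
    def resnet18(ctx):
        return FakeRuntimeModule({
            "run": lambda: "ran resnet18 on " + ctx,
        })

    return FakeRuntimeModule({"resnet18": resnet18})


mod = make_loaded_library()
gmod = mod["resnet18"]("cpu(0)")   # analogous to mod['resnet18'](tvm.cpu(0))
assert gmod["run"]() == "ran resnet18 on cpu(0)"
```

Because creation is just a named function on the module, the runtime needs no extra wrapper class on the C++ side; only the Python-side GraphModule convenience wrapper remains.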

Minimum Design for F1

If we do not need to support the additional features (e.g. disable package_params or get_params), then no additional API is needed.

We would certainly need the GraphFactoryModule wrapper to hold the return value of relay.build. However, note that the wrapper is needed for backward compatibility reasons only. As a result, we do not need to place GraphFactoryModule in the runtime folder; instead, we can place it under relay.backend, or close to graph_runtime.py for now. When we eventually deprecate the old runtime API, we can remove the Python wrapper.

Discussions

From the discussion above, we can see that the only truly necessary API is the factory creation function. We could certainly expose get_params so users can obtain the parameters.

The current way of implementing disable_params should be simplified. First of all, we prefer stateless classes as much as possible, so an API that switches a flag on and off is not a good idea.

One potential way to address the problem is to still use a compositional API:

mod = relay.build()
# return a new GraphFactoryModule with params removed
# mod_no_params does not need the GraphFactoryModule wrapping.
mod_no_params = mod["remove_params"]()
# no params will be exported
mod_no_params.export_library("xyz.so")
# params will be exported
mod.export_library("xyz.so")

We can discuss more API naming choices. Another parallel thread is how to create a debug runtime (if available); in that case, we could simply do mod["debug_create"]("default", ctx.cpu(0)).
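
The stateless, compositional remove_params pattern sketched above can also be mocked in plain Python. FakeFactoryModule, its graph_json/params fields, and the dict returned by export_library are hypothetical illustrations, not TVM API; the design point is that mod["remove_params"]() returns a fresh module rather than mutating a flag on the original:

```python
class FakeFactoryModule:
    """Hypothetical stand-in for the factory module returned by relay.build."""

    def __init__(self, graph_json, params):
        self.graph_json = graph_json
        self.params = dict(params)

    def __getitem__(self, name):
        if name == "remove_params":
            # Stateless style: build a new module with params stripped,
            # leaving the original module untouched.
            return lambda: FakeFactoryModule(self.graph_json, {})
        raise KeyError(name)

    def export_library(self, path):
        # Stand-in for Module.export_library: report what would be packaged.
        return {"path": path, "num_params": len(self.params)}


mod = FakeFactoryModule("graph.json", {"w0": [1, 2], "w1": [3]})
mod_no_params = mod["remove_params"]()
assert mod_no_params.export_library("xyz.so")["num_params"] == 0  # no params exported
assert mod.export_library("xyz.so")["num_params"] == 2            # original unchanged
```

Returning a new module keeps every module's behavior fixed at construction time, which avoids the ordering bugs that a toggleable disable_params flag could introduce between flag flips and export calls.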

@tqchen tqchen added the status: need update need update based on feedbacks label Jul 6, 2020
@FrozenGene FrozenGene marked this pull request as ready for review July 9, 2020 06:17
@FrozenGene FrozenGene changed the title [Draft] Support Module based interface runtime Support Module based interface runtime Jul 9, 2020
@FrozenGene FrozenGene changed the title Support Module based interface runtime Support module based interface runtime Jul 9, 2020
@FrozenGene
Member Author

FrozenGene commented Jul 11, 2020

Thanks @tqchen @zhiics for the review. I will update the code to address these comments tomorrow.

About tvm::Map: do we have a plan to move it into runtime/container like our tvm::String?

@FrozenGene
Member Author

@tqchen @zhiics could you help review it again?

python/tvm/relay/backend/graph_runtime_factory.py (2 review threads, resolved)
names.emplace_back(v.first);
arrays.emplace_back(const_cast<DLTensor*>(v.second.operator->()));
}
uint64_t sz = arrays.size();
Member


With the MetadataModule, we should be able to remove the serialization and deserialization of params for GraphRuntime and the factory. That may affect downstream users. I can take a stab at it later.

Member Author


Could you give me a link to this MetadataModule?

Member


It was introduced in #5770. We should not do it in this PR; this is just to make you aware of it.

@tqchen
Member

tqchen commented Jul 14, 2020

Member

@zhiics zhiics left a comment


LGTM

@ANSHUMAN87
Contributor

ANSHUMAN87 commented Jul 14, 2020

@FrozenGene : Sorry for pitching in late! With the latest change, I am not able to find the handling of multiple modules as was done previously. Can you please point me to it or give an example of how to do it? Thanks!

@FrozenGene
Member Author

FrozenGene commented Jul 14, 2020

> With the latest change, I am not able to find the handling of multiple modules as was done previously. Can you please point me to it or give an example of how to do it? Thanks!

@ANSHUMAN87 Ah...yes. The latest change of this PR doesn't contain that part. The main reason is that our compiler isn't ready for it. For example, imagine we have one resnet18 model and one resnet50 model for CPU: our compiler will not generate unique names for them, so both models will have the same function names, like fused_nn_contrib_conv2d_NCHWc_add. Once our compiler is ready, we can enable multi-model support; the current PR has this situation in mind, and we could add the support easily. The multi-model support you saw in the previous PR was one model for CPU and one for GPU, which merely works around and bypasses this issue.
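
The symbol clash described above can be made concrete with a small sketch (plain Python; compiled_symbols and its return values are hypothetical, chosen only to mimic how the compiler names fused kernels by operator pattern rather than by model):

```python
def compiled_symbols(model_name):
    # Both resnet18 and resnet50 contain the same conv2d+add fusion pattern,
    # and the compiler derives kernel names from the pattern, not the model,
    # so both models yield identical exported symbol names.
    return {"fused_nn_contrib_conv2d_NCHWc_add", "fused_nn_softmax"}


clashes = compiled_symbols("resnet18") & compiled_symbols("resnet50")
assert "fused_nn_contrib_conv2d_NCHWc_add" in clashes
```

Because linking both models into one shared library would produce duplicate symbol definitions, multi-model packaging has to wait until the compiler prefixes kernel names per model.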

@ANSHUMAN87
Contributor

@FrozenGene : Thanks a lot! I get it now. Maybe we should add this as a note in the original issue tracker for this feature, so that we can come back to it at a later point.
To me, the multiple-module support is the key attraction 🙂

Contributor

@ANSHUMAN87 ANSHUMAN87 left a comment


LGTM. Thanks @FrozenGene 👍

@FrozenGene
Member Author

> Thanks a lot! I got it now. Maybe we should add this as a note in the original issue tracker for this feature, so that we can come back to it at a later point. To me, the multiple-module support is the key attraction 🙂

I have listed it in the original RFC (#5038).

@tqchen tqchen changed the title Support module based interface runtime [RUNTIME] Support module based interface runtime Jul 15, 2020
@tqchen tqchen merged commit 9fcde21 into apache:master Jul 15, 2020
@tqchen
Member

tqchen commented Jul 15, 2020

Thanks @FrozenGene , this PR is now merged. Thanks @zhiics @ANSHUMAN87

@tqchen tqchen added status: accepted and removed status: WIP status: need update need update based on feedbacks labels Jul 15, 2020
@FrozenGene FrozenGene deleted the model_based_runtime branch July 16, 2020 02:39
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Aug 26, 2020
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Sep 2, 2020
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Sep 3, 2020