Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

µTVM CRT modifications for on-device RPC server #5921

Merged
merged 32 commits into from
Jul 12, 2020

Conversation

areusch
Copy link
Contributor

@areusch areusch commented Jun 25, 2020

Sending change to CI; will update with more details when it's ready to review and RFC is posted.

@areusch areusch force-pushed the utvm-crt-changes branch from 64abc40 to 4d738c5 Compare June 25, 2020 16:37
@liangfu liangfu self-assigned this Jun 25, 2020
@liangfu
Copy link
Member

liangfu commented Jun 25, 2020

@areusch Thanks for your valuable contribution. Since this is a rather big change that might affect many existing users, would you please send an RFC for discussion before taking actions on implementation?

@areusch
Copy link
Contributor Author

areusch commented Jun 25, 2020

hi @liangfu, I posted up an RFC about this here: https://discuss.tvm.ai/t/rfc-misra-c-changes-for-rpc-support/7098/2

@liangfu
Copy link
Member

liangfu commented Jun 26, 2020

Thanks for quick response and the comprehensive RFC. Let's discuss the the proposed features in the post.

areusch added 9 commits June 25, 2020 17:56
 * Create a make-based build in src/runtime/crt. This is intended to
   be built in build/standalone_crt (generated by running ninja
   standalone_crt in build/). Its job is to build CRT without
   depending on headers not explicitly allowed in CRT.
 * Create a "public-facing" CRT API targeted to firmware running
   alongside CRT in include/tvm/runtime/crt. Developers who are
   integrating the CRT are the target of this API.
 * Reorganize CRT internally into common/ and graph_runtime/
   pieces. Build each pieces as a separate statically-linked library.
 * Slim down TVMGraphRuntime public-facing API to just the functions
   that are used externally.
 * Updates to apps/bundle_deploy to make this work.
 * Also add error_codes.h, a file containing error codes returned by CRT.
 * NOTE: This changes the default API for functions exposed under the
   CRT by the TVMFuncCall API. `resource_handle` is now always given
   as a new 6th parameter.
 * `resource_handle` is NULL when invoked on a global function and a
   pointer to the module owning the function otherwise.
@areusch areusch force-pushed the utvm-crt-changes branch from bea63f8 to fa4f0e6 Compare June 26, 2020 16:34
@@ -0,0 +1,57 @@
# Licensed to the Apache Software Foundation (ASF) under one
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that we are going to have StandaloneCrt.cmake, can we use CMake to build this instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im not sure how to do this without dragging in extra tvm configuration from e.g. include_directories() calls. The other thing is that the make-based flow is a good example for firmware engineers not used to cmake. What do you think? Cc @tqchen

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will let you guys decide, but it might make sense to show a make flow if it is simple enough, because others can learn from it

static TVMFunctionHandle EncodeFunctionHandle(tvm_module_index_t module_index,
tvm_function_index_t function_index) {
return (TVMFunctionHandle)((uintptr_t)(
((module_index | 0x8000) << (sizeof(tvm_function_index_t) * 8)) | (function_index | 0x8000)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a define (maybe INT16_MSB_MASK) and use for all occurrences of 0x8000

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, why is the function index ORing a valid bit as well (as opposed to the RFC, where it was just the module index)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left this in so that all valid TVMFunctionHandle are non-zero

* Assumes a constant average function name length.
*/
static const size_t kTvmAverageFuncEntrySizeBytes =
kTvmAverageFunctionNameSizeBytes + 1 + sizeof(void*);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the + 1 here? if it's for the null terminator, should that be in the const above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/size/strlen/g to clarify

# specific language governing permissions and limitations
# under the License.

if(USE_STANDALONE_CRT)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think we should add this to the cmake config

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

TVMArgs TVMArgs_Create(TVMValue* values, uint32_t* tcodes, uint32_t values_count);

typedef struct TVMPackedFunc {
char name[200];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe make this smaller and defined in crt_config.h?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar thing--can't touch this in this PR as it's pretty big right now

Comment on lines +49 to +52
// operator type in string
char op_type[16];
// name of the op
char name[120];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

define consts in crt_config.h?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not going to touch these right now, they're orthogonal to this change. a follow-on PR can look at the graph runtime impl more thoroughly

// overfill `names`.
EXPECT_LT(kTvmAverageFuncEntrySizeBytes, strlen(function_name_chars) + 1);

for (unsigned int buf_size = 0; buf_size < 10 + 1 + sizeof(void*); buf_size++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should 10 + 1 + sizeof(void*) be replaced with kTvmAverageFuncEntrySizeBytes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good call, done.


for (unsigned int rem = 0; rem < kTvmAverageFuncEntrySizeBytes; rem++) {
// test_function name will be used to test overfilling.
char test_function_name[kTvmAverageFuncNameSizeBytes + 2 + rem];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why + 2 here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its testing attempted over-filling, so + 1 for the overfill char and +1 for \0.

@weberlo
Copy link
Contributor

weberlo commented Jul 1, 2020

@areusch Added some comments. Also curious if there are any special instructions for running the tests. I added set(USE_STANDALONE_CRT on) in config.cmake, but when I attempt to build, I get

CMake Error at cmake/modules/StandaloneCrt.cmake:92 (target_link_libraries):
  Cannot specify link libraries for target
  "host_standalone_crt_graph_runtime" which is not built by this project.
Call Stack (most recent call first):
  CMakeLists.txt:308 (include)

* TVM runtime PackedFunc API.
*/
/* class GraphRuntime : public ModuleNode { */
typedef struct TVMGraphRuntimeAPI {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to add the "API" suffix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah sorry, previous approach which I now don't use. API should be gone everywhere now

return (TVMModuleHandle)((uintptr_t)(module_index | 0x8000));
}

static int _TVMModCreateFromCModule(const TVMModule* mod, TVMModuleHandle* out_handle) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider remove the underscore prefix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

graph, lib, params = relay.build(
func, 'llvm --system-lib', params=params)
func, 'c', params=params)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why should we change this default behavior?

Copy link
Contributor Author

@areusch areusch Jul 8, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i changed this at first because llvm didn't work on my mac. then, I needed to keep it changed to handle --system-lib for now:

--system-lib is being handled using python/tvm/micro/function_registry.py. in a follow-on I should be able to fold that in to C++ (I.e. as something like --system-lib --runtime=crt). is it ok to leave as c for this PR?

Copy link
Member

@liangfu liangfu Jul 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest keeping llvm --system-lib, since we are targeting uTVM, not just mac :-)

@areusch
Copy link
Contributor Author

areusch commented Jul 9, 2020

@liangfu @weberlo looks like this is passing CI; please take another look and explicitly approve if you're okay with the current state

@liangfu
Copy link
Member

liangfu commented Jul 10, 2020

Thanks @areusch .

I tried to run make in apps/bundle_deploy, and failed with following error

g++ -shared -g -Wall -std=c++14 -O2 -fPIC -I/home/ubuntu/workspace/tvm/include -I/home/ubuntu/workspace/tvm/3rdparty/dmlc-core/include -I/home/ubuntu/workspace/tvm/3rdparty/dlpack/include -Icrt_config -fvisibility=hidden -o build/bundle.so  bundle.cc build/model.o build/func_registry.o build/crt/graph_runtime/libgraph_runtime.a build/crt/common/libcommon.a -pthread
/usr/bin/ld: build/crt/common/libcommon.a(crt_runtime_api.o): relocation R_X86_64_PC32 against symbol `TVMAPIErrorf' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
Makefile:103: recipe for target 'build/bundle.so' failed
make: *** [build/bundle.so] Error 1

In addition, the inference speed slowed down comparing to the llvm --system-lib approach, and there are lots of unused-variable warnings in build/model.c.

Did I make any wrong attempt in running the demo?

@tqchen tqchen added status: need review status: need update need update based on feedbacks labels Jul 10, 2020
@areusch
Copy link
Contributor Author

areusch commented Jul 10, 2020

@liangfu thanks for taking a look. I looked more into your error and it looks like I overlooked testing bundle_dynamic most recently. that is fixable, however...

I did some more investigation and there are a couple of problems:

  1. I believe the performance regression is related to switching to c compiler, based on some profiling.
  2. with the CRT changes proposed in this PR, we can't use --system-lib because the global CRT memory manager needs to be initialized before functions can be registered, and the current --system-lib approach registers functions in static initialization. especially for targeting embedded bare-metal systems, I think it would be wise to allow a way to do arbitrary initialization before requiring use of TVM components.
  3. I tried using just plain llvm to avoid/prove the performance regression; however, there isn't a way to create a statically-linked non-system-lib right now (i.e. see codegen init here). i'd like to fix this, but I don't want to make this PR any larger than it is right now.
  4. finally, all of this is conspiring to make the bundle_deploy make situation more complex. bundle_deploy produces 3 targets: C main statically-linked against CRT runtime, c++ main() dynamically loading pure-c CRT, and c++ main dynamically loading c++ Runtime. the last one requires --system-lib, while the other two can't have it. so we'll need to produce a model/test_model that targets the c++ runtime to keep the last target alive.

in a future PR, I plan to add an additional flag to the target string --runtime=<crt|c++>. this flag would allow us to produce a statically-linked llvm module compatible with the C runtime. would it be okay to do the following til then:

  1. use c module for all CRT bundle_deploy targets
  2. build 2 copies of test_model and model: one for CRT and one for c++ runtime
  3. for now, CRT targets will need to take a performance regression until we can re-enable the llvm module approach according to my plan above.

maybe @tqchen has more thoughts here?

also, it would be helpful if we could add this app to the CI somehow. any thoughts on that?

@tqchen
Copy link
Member

tqchen commented Jul 10, 2020

I think it is fine to just build for crt and not targeting c++ for now, as long as it works. The app was removed from the CI due to i386 compact issue of CRT, we can add it back if we confirmed that the issue has been resolved

Copy link
Member

@liangfu liangfu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update and the detailed explanation. This is a quite significant change. I think we can can have this merged, and expect followup PRs to add the additional argument in targeting llvm with crt/c++ runtime.

@liangfu liangfu merged commit d6ceba0 into apache:master Jul 12, 2020
@liangfu
Copy link
Member

liangfu commented Jul 12, 2020

Thanks @areusch for contributing, and thanks @tqchen and @weberlo for reviewing. This is now merged.

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Jul 14, 2020
* Reorganize CRT into parts, public API, and add standalone build.

 * Create a make-based build in src/runtime/crt. This is intended to
   be built in build/standalone_crt (generated by running ninja
   standalone_crt in build/). Its job is to build CRT without
   depending on headers not explicitly allowed in CRT.
 * Create a "public-facing" CRT API targeted to firmware running
   alongside CRT in include/tvm/runtime/crt. Developers who are
   integrating the CRT are the target of this API.
 * Reorganize CRT internally into common/ and graph_runtime/
   pieces. Build each pieces as a separate statically-linked library.
 * Slim down TVMGraphRuntime public-facing API to just the functions
   that are used externally.
 * Updates to apps/bundle_deploy to make this work.

* Add TVMFuncRegistry, CRT test infrastructure, and tests.

 * Also add error_codes.h, a file containing error codes returned by CRT.

* Add TVMErrorf()

* [API_CHANGE] Integrate func registry into CRT.

 * NOTE: This changes the default API for functions exposed under the
   CRT by the TVMFuncCall API. `resource_handle` is now always given
   as a new 6th parameter.
 * `resource_handle` is NULL when invoked on a global function and a
   pointer to the module owning the function otherwise.

* Generalize arena-based memory manager.

* lint

* Fix git-clang-format arg parsing

* add apache header

* add mutable func registry tests

* git-clang-format

* fix more lint

* Move memory_test to crttests.

* fix tests

* checkpoint

* checkpoint

* bundle_deploy demo_static works

* rm debug printf

* git-clang-format

* fix lint

* add asf header

* pylint

* update build configs for jenkins

* make regression compiler happy

* fix build errors in regression GCC

* address comments

* git-clang-format

* fix for 32-bit cpp regression

* fix incorrect use of memcpy and tests for 32-bit

* clang-format
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Jul 14, 2020
* Reorganize CRT into parts, public API, and add standalone build.

 * Create a make-based build in src/runtime/crt. This is intended to
   be built in build/standalone_crt (generated by running ninja
   standalone_crt in build/). Its job is to build CRT without
   depending on headers not explicitly allowed in CRT.
 * Create a "public-facing" CRT API targeted to firmware running
   alongside CRT in include/tvm/runtime/crt. Developers who are
   integrating the CRT are the target of this API.
 * Reorganize CRT internally into common/ and graph_runtime/
   pieces. Build each pieces as a separate statically-linked library.
 * Slim down TVMGraphRuntime public-facing API to just the functions
   that are used externally.
 * Updates to apps/bundle_deploy to make this work.

* Add TVMFuncRegistry, CRT test infrastructure, and tests.

 * Also add error_codes.h, a file containing error codes returned by CRT.

* Add TVMErrorf()

* [API_CHANGE] Integrate func registry into CRT.

 * NOTE: This changes the default API for functions exposed under the
   CRT by the TVMFuncCall API. `resource_handle` is now always given
   as a new 6th parameter.
 * `resource_handle` is NULL when invoked on a global function and a
   pointer to the module owning the function otherwise.

* Generalize arena-based memory manager.

* lint

* Fix git-clang-format arg parsing

* add apache header

* add mutable func registry tests

* git-clang-format

* fix more lint

* Move memory_test to crttests.

* fix tests

* checkpoint

* checkpoint

* bundle_deploy demo_static works

* rm debug printf

* git-clang-format

* fix lint

* add asf header

* pylint

* update build configs for jenkins

* make regression compiler happy

* fix build errors in regression GCC

* address comments

* git-clang-format

* fix for 32-bit cpp regression

* fix incorrect use of memcpy and tests for 32-bit

* clang-format
areusch added a commit to areusch/tvm that referenced this pull request Jan 25, 2021
 * In apache#5921, resource_handle was added as a parameter to
   TVMBackendPackedCFunc, which is the typedef for functions called by
   LibraryModule's function lookup.
 * It appears TVM_DLL_EXPORT_TYPED_FUNC was overlooked in that PR,
   although there don't seem to be any runtime affects known so
   far. However, making this definition proper to avoid any compiler
   warnings/debug tool problems.
 * See also https://discuss.tvm.apache.org/t/rfc-misra-c-changes-for-rpc-support/7098/5
tqchen pushed a commit that referenced this pull request Jan 26, 2021
* In #5921, resource_handle was added as a parameter to
   TVMBackendPackedCFunc, which is the typedef for functions called by
   LibraryModule's function lookup.
 * It appears TVM_DLL_EXPORT_TYPED_FUNC was overlooked in that PR,
   although there don't seem to be any runtime affects known so
   far. However, making this definition proper to avoid any compiler
   warnings/debug tool problems.
 * See also https://discuss.tvm.apache.org/t/rfc-misra-c-changes-for-rpc-support/7098/5
alexwong pushed a commit to alexwong/tvm that referenced this pull request Feb 11, 2021
* In apache#5921, resource_handle was added as a parameter to
   TVMBackendPackedCFunc, which is the typedef for functions called by
   LibraryModule's function lookup.
 * It appears TVM_DLL_EXPORT_TYPED_FUNC was overlooked in that PR,
   although there don't seem to be any runtime affects known so
   far. However, making this definition proper to avoid any compiler
   warnings/debug tool problems.
 * See also https://discuss.tvm.apache.org/t/rfc-misra-c-changes-for-rpc-support/7098/5
electriclilies pushed a commit to electriclilies/tvm that referenced this pull request Feb 18, 2021
* In apache#5921, resource_handle was added as a parameter to
   TVMBackendPackedCFunc, which is the typedef for functions called by
   LibraryModule's function lookup.
 * It appears TVM_DLL_EXPORT_TYPED_FUNC was overlooked in that PR,
   although there don't seem to be any runtime affects known so
   far. However, making this definition proper to avoid any compiler
   warnings/debug tool problems.
 * See also https://discuss.tvm.apache.org/t/rfc-misra-c-changes-for-rpc-support/7098/5
Lokiiiiii pushed a commit to Lokiiiiii/tvm that referenced this pull request Mar 2, 2021
* In apache#5921, resource_handle was added as a parameter to
   TVMBackendPackedCFunc, which is the typedef for functions called by
   LibraryModule's function lookup.
 * It appears TVM_DLL_EXPORT_TYPED_FUNC was overlooked in that PR,
   although there don't seem to be any runtime affects known so
   far. However, making this definition proper to avoid any compiler
   warnings/debug tool problems.
 * See also https://discuss.tvm.apache.org/t/rfc-misra-c-changes-for-rpc-support/7098/5
trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Mar 2, 2021
* In apache#5921, resource_handle was added as a parameter to
   TVMBackendPackedCFunc, which is the typedef for functions called by
   LibraryModule's function lookup.
 * It appears TVM_DLL_EXPORT_TYPED_FUNC was overlooked in that PR,
   although there don't seem to be any runtime affects known so
   far. However, making this definition proper to avoid any compiler
   warnings/debug tool problems.
 * See also https://discuss.tvm.apache.org/t/rfc-misra-c-changes-for-rpc-support/7098/5
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: need review status: need update need update based on feedbacks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants