-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RUNTIME] Initial implementation of Hexagon runtime support #3163
[RUNTIME] Initial implementation of Hexagon runtime support #3163
Conversation
Is this ready for review? I saw a lot of codes commented out |
The commented out parts are mostly debugging code. This is ready for review. |
I will be pushing some changes soon to pacify the cpplint checker. |
@@ -0,0 +1,366 @@ | |||
/*! | |||
* Copyright 2018-2019 Qualcomm Innovation Center, Inc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are in the process of migrating to ASF, and as a future convention we do not keep copyright messages because the code will be licensed to ASF, please check if it is OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will bring it to the attention of the right people.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may take a bit of time to figure out. If we keep the copyright notices, will it block this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likely it will be a blocker, but we can first keep reviewing and make sure things are in shape. The code can exist in your repo for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will it be acceptable if we remove this single line (that says "Copyright..."), and leave everything else unchanged? We need to have a better idea of what's acceptable before we discuss it with the legal teams.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is exactly what I asked for, see https://www.apache.org/legal/src-headers.html
Sorry for the confusion, I meant to remove the single line that says "Copyright.."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
License header should be
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does that mean exactly?
There is this, plus an extra paragraph in our license. You said you wanted us to remove the copyright line, and that was it. What is the intent of this comment in the light of what had been written before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correction: the first paragraph is different. Is this what you are referring to?
Some quick comments. Here is a summary:
The second things that need some elaboration(perhaps in the comment as well are the level of abstractions introduced.
Likely we could only introduce two levels of abstractions. Besides the highest level, we only need another one that is a common denominator of the two. We also can hide implementations. Specifically, the implementation of hexagon::Device's subclass does not have to appear in the header. Here is one possible proposal of organization:
Depending on the compilation flag, we only link sim or frpc. Both of them will implement hexagon::Device::Create which creates the hexagon::Device, or the corresponding appropriate subclass. The above comment is based on my first quick pass over the code, so it might be inaccurate, would love to continue discussion as well. At the same time, as in uTVM, we want to leave doors open to implement lazy execution. This will likely mean push the tasks as a linked list argument into the argument section, then ask the device side to execute everything in the list. This does not have to be implemented in this PR, but would be a good thing to keep in mind to make sure the design is future compatible |
}; | ||
typedef unsigned int tvm_hexagon_remote_handle_t; | ||
typedef uint64 tvm_hexagon_remote_scalar_t; | ||
__QAIC_HEADER_EXPORT int __QAIC_HEADER(tvm_hexagon_remote_map_buffer_to_dsp)( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we also upload tvm_hexagon_remote_map_buffer_to_dsp
definition? i.e. tvm_hexagon_remote_map_buffer_to_dsp_stub.c
/ tvm_hexagon_remote_map_buffer_to_dsp_skel.c
/ tvm_hexagon_remote_map_buffer_to_dsp.il`
#ifndef tvm_hexagon_remote_URI | ||
#define tvm_hexagon_remote_URI \ | ||
"file:///" \ | ||
"libtvm_hexagon_remote_skel.so?tvm_hexagon_remote_skel_handle_invoke&_" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
libtvm_hexagon_remote_skel.so
is not in HexagonSDK, we build it. Could we also upload it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll upload the sources next week. You could use them to build the skel.so and the stub.so.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a bit of delay. Hopefully I'll get that done next week.
One common comment: # ifdef USE_CDSP
if (domain_channel_handle == AEE_EUNKNOWN) {
if (remote_session_control) {
remote_rpc_control_unsigned_module data;
data.enable = 1;
data.domain = CDSP_DOMAIN_ID;
int rc = remote_session_control(DSPRPC_CONTROL_UNSIGNED_MODULE, &data,
sizeof(data));
LOG_TVM(ANDROID_LOG_DEBUG, "TVM",
"HexagonTarget %s: remote_session_control rc=%d for unsigned PD",
__func__, rc);
} else {
LOG_TVM(ANDROID_LOG_DEBUG, "TVM",
"HexagonTarget %s: No unsigned PD support available", __func__);
}
int rc = tvm_hexagon_remote_open(tvm_hexagon_remote_URI "&_dom=cdsp",
&domain_channel_handle);
if (rc != AEE_SUCCESS) {
LOG_TVM(ANDROID_LOG_ERROR, "TVM",
"HexagonTarget %s:Failed to open channel rc=0x%x", __func__, rc);
return nullptr;
}
}
#endif
void* mem = rpcmem_alloc(RPCMEM_HEAP, RPCMEM_DEFAULT_FLAGS, size);
if (mem == nullptr) {
LOG_TVM(ANDROID_LOG_ERROR, "TVM",
"HexagonTarget %sMem alloc failed for size=0x%x alignment=0x%x",
__func__, size, align);
}
#if USE_CDSP
int mem_fd = rpcmem_to_fd(mem);
uint64_t dsp_va = 0;
int rc = tvm_hexagon_remote_map_buffer_to_dsp(
domain_channel_handle, mem_fd, static_cast<int>(size),
reinterpret_cast<uint64*>(&dsp_va));
if (rc != AEE_SUCCESS) {
LOG_TVM(
ANDROID_LOG_ERROR, "TVM",
"HexagonTarget %s: buffer mapping to dsp failed for fd=0x%x rc=0x%x",
__func__, mem_fd, rc);
return nullptr;
}
// LOG_TVM(ANDROID_LOG_ERROR, "TVM",
// "Address from remote_map_buffer :0x%x", dsp_va);
dsp_address_mapping.insert(
std::make_pair(dsp_va, reinterpret_cast<uint64_t>(mem)));
dsp_size_mapping.insert(std::make_pair(dsp_va, static_cast<uint64_t>(size)));
// LOG_TVM(ANDROID_LOG_DEBUG, "TVM",
// "HexagonTarget %s: dsp_va:0x%x apps_va:0x%x mem_fd=0x%x",
// __func__, static_cast<uint64_t>(dsp_va),
// reinterpret_cast<uint64_t>(mem), mem_fd);
return reinterpret_cast<void*>(dsp_va);
#endif
return reinterpret_cast<void*>(mem)
} Could we consider supporting it? Secondly, in fact, in HexagonSDK, we have UbuntuARM, and we could also support Linux and the code is also very similar. Could we have plan to support it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some high-level comments:
- Avoid hard-coding symbols when possible. If we have to dlopen something, it might be a good idea to have a clear message asking users to properly set some path.
- Seeing some code that is commented out. Let's remove them if they are truly useless.
- As Tianqi already mentioned, try avoiding asserts and std::cerr.
// qidl copyright | ||
// qidl nested=false | ||
#include "AEEStdDef.h" | ||
#include "android_Release_aarch64/remote.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have preference to use <>
instead of quotes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that a question for me? I don't know. You tell me...
I hope I have fixed all issues related to coding conventions. I'd like to get that out of the way before making any other changes. |
Yes, it does. Additionally, what we need is Alloc/Free (for memory) and Load/Unload (for shared objects). With that we could switch to LowLevelDeviceAPI (except that I haven't found in the sources).
This approach requires that both sides (sim and frpc) define the same "factory" function. This means that they can never be built together. I'm not sure if that's good or bad, but we need to think about long term consequences of this choice. |
I think it is fine to assume that we will either build simulator or frpc. We can also have two separate factory functions that allow us to link both. As I mentioned in the last post. I think we should reduce the level of abstractions into two:
The reason is that we are essentially implementing a "driver" based on the low-level device interface. If the low-level interface is somewhat common between the simulator and real hardware, then we should only expose the LowLevelDeviceAPI. |
|
||
/*! \brief Number of bytes each allocation must align to in temporary allocation */ | ||
constexpr int kTempAllocaAlignment = 64; | ||
constexpr int kTempAllocaAlignment = 128; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about this? https://github.com/dmlc/tvm/blob/master/src/pass/ir_util.h#L174 I think we should handle it specially for Hexagon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Someone suggested having a target-specific function that gets the required alignment, instead of having global constants. I think that would be a better solution. This patch simply has what we've done about it.
I haven't looked into it yet, is there a precedent for implementing such a thing? Do you have any suggestions as to how it should be done?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also suggest we have a target-specific function that gets the required alignment. Then in GetTempAllocaAlignment
we could get this alignment and just simply return it for Hexagon (Need detect the target is Hexagon, for example we define the macro USE_HEXAGON
in CMakeLists.txt).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment. How about create target-specific function alignment?
The sim and device runtimes are now separated, and the header files for each are incorporated into the corresponding .cc files. |
Aside from the copyright issue, are there any other outstanding issues with the code? Have I missed anything? |
We are considering support for ADSP.
I'll get back to you about this one. As of now we haven't done anything with UbuntuARM. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some additional comments, overall look like we are converging
* permitted use with any of the "vta" and "verilog" subdirectories in the TVM | ||
* repo. | ||
*/ | ||
#ifndef TVM_RUNTIME_HEXAGON_DEVICE_FASTRPC_TVM_HEXAGON_REMOTE_H_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tvm_hexagon_remote.h ->hexagon_remote.h given TVM is implied
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is automatically generated by the IDL compiler. It comes from a library called tvm_hexagon_remote
, so its name becomes tvm_hexagon_remote.h
. Even though it's edited after coming out of the IDL compiler (adding license, formatting, renaming the header guard), I'd like to keep the name to indicate the connection.
|
||
#define WEAK __attribute__((weak)) | ||
|
||
namespace { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider use an explicit namespace anoymous namespace has problem udirng amalgmation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document why this shim file is needed at the beginning of the file.
|
||
namespace { | ||
|
||
void* LoadLibCdspRpc() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
consider make The CdspRPC a class(CDSPRPCSession, put all the fields into the class, and use Global singleton pattern). The redirection can get global singleton and run.
I just looked at the uTVM PR, and the commonality is only superficial. Hexagon is in no way a micro-device in the sense of that PR. |
Sure, I just think there will be common design patterns that could be shared across the drivers. Given that you are already reducing the abstraction it is fine. |
This is an update of the original PR. Since there has been a month of a delay, I upgraded the sources to the latest ones (the code in the original PR got quite a bit out of date). |
@FrozenGene @huajsj please help to do another round of review. I will also do a review in this week. Thanks @kparzysz-quic |
I will do the review next week. Thanks. |
This Contribution is being provided by Qualcomm Innovation Center, Inc. See NOTICE file for more details.
Rebased to resolve conflicts. |
I take a look at the new version of the license which contains the disclaimer text. We need to confirm again what is the implication especially after we migrate the repo to apache. I might wait until the repo migration before merging this. While I feel it is OK, I would act caution and treat this differently(as if it was a different license). In particular, it would be great to have a clear isolated folder to point to, e.g. every file within src/runtime/hexagon has the disclaimer, while the rest of the files do not, and the disclaimer text can explicitly point to the subfolder, allowing the user to cut-off the content if necessary NOTE: verilog folder has been removed and only vta folder will contain open source hardware design in the future. |
Is there a time frame for the repo migration? |
@@ -82,6 +82,7 @@ typedef enum { | |||
kDLSDAccel = 6, | |||
kOpenGL = 11, | |||
// AddExtraTVMType which is not in DLPack here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this comment after kDLHexagon = 13
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
|
||
/*! \brief Number of bytes each allocation must align to in temporary allocation */ | ||
constexpr int kTempAllocaAlignment = 64; | ||
constexpr int kTempAllocaAlignment = 128; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment. How about create target-specific function alignment?
* Qualcomm Technologies, Inc. and Qualcomm Innovation Center, Inc. retain | ||
* copyright of their respective Contributions. | ||
*/ | ||
#ifdef __ANDROID__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we try UbuntuARM64
? From my experience, the code should be very similar with Android here? If we could add UbuntuARM64
support in this patch, I think it is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have any experience with UbuntuARM64
or any devices that run it. We can consider it in the future, but it won't happen right away.
TVM_LOGD("CDSP subsystem present"); | ||
} else if (!stat("/dev/subsys_adsp", &sb)) { | ||
enable_domains_ = false; | ||
TVM_LOGD("ADSP subsystem present"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not review. Just clarify one information: So, ADSP is supported and work well, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, ADSP is now supported on systems that have HVX.
TVM_LOGD("ADSP subsystem present"); | ||
} | ||
|
||
constexpr auto domain_lib_name = "libtvm_hexagon_remote_stub.so"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we generate this .so? Is it contained in the hexagon.cmake
build process?
@kparzysz-quic After reading https://discuss.tvm.ai/t/introducing-hexagon-backend/2421/13 again, about IDL preprocess, I think we could add one option like Currently, this patch contains the output of IDL, like tvm_hexagon_remote.h, right? But we could not find how to generate without IDL file. |
@@ -175,6 +175,10 @@ def ext_dev(self, dev_id=0): | |||
"""Construct extension device.""" | |||
return self.context(12, dev_id) | |||
|
|||
def hexagon(self, dev_id=0): | |||
"""Construct extension device.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extension
be Hexagon
@kparzysz-quic I have used your PR to try to build runtime. I think we need Secondly, we need one procedure in build system or instruction to tell user to produce |
} | ||
|
||
void HexagonTarget::CopyDeviceToDevice(void* dst, const void* src, | ||
unsigned len) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When we execute model (such as NNVM frontend / Relay frontend), we will meet CopyDeviceToDevice, imagine we have module.get_output(0, tvm.nd.empty(output_shape, dtype="float32", ctx=ctx)).asnumpy()
. The implementation is not difficult (Using GetAppsAddr
to get src / dst app addr
and memcpy
). Leave this comment reminding us we should implement it and don't leave it empty. Otherwise, we will get 0 of output.
Replaced by #5252. |
No description provided.