Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RUNTIME] Initial implementation of Hexagon runtime support #3163

Closed
wants to merge 1 commit into from
Closed

[RUNTIME] Initial implementation of Hexagon runtime support #3163

wants to merge 1 commit into from

Conversation

kparzysz-quic
Copy link
Contributor

No description provided.

@tqchen tqchen changed the title Initial implementation of Hexagon support [RUNTIME] Initial implementation of Hexagon support May 9, 2019
@tqchen tqchen changed the title [RUNTIME] Initial implementation of Hexagon support [RUNTIME] Initial implementation of Hexagon runtime support May 9, 2019
@junrushao
Copy link
Member

Is this ready for review? I saw a lot of codes commented out

@kparzysz-quic
Copy link
Contributor Author

The commented out parts are mostly debugging code. This is ready for review.

@kparzysz-quic
Copy link
Contributor Author

I will be pushing some changes soon to pacify the cpplint checker.

CMakeLists.txt Show resolved Hide resolved
src/runtime/hexagon/hexagon_device_api.cc Outdated Show resolved Hide resolved
@@ -0,0 +1,366 @@
/*!
* Copyright 2018-2019 Qualcomm Innovation Center, Inc.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are in the process of migrating to ASF, and as a future convention we do not keep copyright messages because the code will be licensed to ASF, please check if it is OK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will bring it to the attention of the right people.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may take a bit of time to figure out. If we keep the copyright notices, will it block this PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Likely it will be a blocker, but we can first keep reviewing and make sure things are in shape. The code can exist in your repo for now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will it be acceptable if we remove this single line (that says "Copyright..."), and leave everything else unchanged? We need to have a better idea of what's acceptable before we discuss it with the legal teams.

Copy link
Member

@tqchen tqchen May 13, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is exactly what I asked for, see https://www.apache.org/legal/src-headers.html
Sorry for the confusion, I meant to remove the single line that says "Copyright.."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

License header should be

Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements.  See the NOTICE file
distributed with this work for additional information
regarding copyright ownership.  The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License.  You may obtain a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied.  See the License for the
specific language governing permissions and limitations
under the License.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does that mean exactly?

There is this, plus an extra paragraph in our license. You said you wanted us to remove the copyright line, and that was it. What is the intent of this comment in the light of what had been written before?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction: the first paragraph is different. Is this what you are referring to?

src/runtime/hexagon/hexagon_wrapper_dl.h Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_module.h Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_device_target.h Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_module.cc Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_device_sim.cc Outdated Show resolved Hide resolved
@tqchen
Copy link
Member

tqchen commented May 10, 2019

Some quick comments. Here is a summary:

  • Style issues(follow google c style)
    • Use CHECK_XX style instead of assert
    • Use CamelCase for functions
    • Add variable_with_underscore_ for class members.

The second things that need some elaboration(perhaps in the comment as well are the level of abstractions introduced.

  • At the highest level DeviceAPI and Module
  • hexagon::Device appears to be a low-level device abstraction for hexagon
  • There seems to be even a LowLevlAPI(I have not checked the code carefully) that contains

Likely we could only introduce two levels of abstractions. Besides the highest level, we only need another one that is a common denominator of the two.

We also can hide implementations. Specifically, the implementation of hexagon::Device's subclass does not have to appear in the header.

Here is one possible proposal of organization:

  • src/runtime/hexagon/sim/sim_device.cc
  • src/runtime/hexagon/frpc/frpc_device.cc
  • src/runtime/hexagon/device/device_lib.cc (device side library for driving the code.)

Depending on the compilation flag, we only link sim or frpc. Both of them will implement hexagon::Device::Create which creates the hexagon::Device, or the corresponding appropriate subclass.

The above comment is based on my first quick pass over the code, so it might be inaccurate, would love to continue discussion as well.

At the same time, as in uTVM, we want to leave doors open to implement lazy execution. This will likely mean push the tasks as a linked list argument into the argument section, then ask the device side to execute everything in the list. This does not have to be implemented in this PR, but would be a good thing to keep in mind to make sure the design is future compatible

@tqchen tqchen self-assigned this May 10, 2019
};
typedef unsigned int tvm_hexagon_remote_handle_t;
typedef uint64 tvm_hexagon_remote_scalar_t;
__QAIC_HEADER_EXPORT int __QAIC_HEADER(tvm_hexagon_remote_map_buffer_to_dsp)(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we also upload tvm_hexagon_remote_map_buffer_to_dsp definition? i.e. tvm_hexagon_remote_map_buffer_to_dsp_stub.c / tvm_hexagon_remote_map_buffer_to_dsp_skel.c / tvm_hexagon_remote_map_buffer_to_dsp.il`

#ifndef tvm_hexagon_remote_URI
#define tvm_hexagon_remote_URI \
"file:///" \
"libtvm_hexagon_remote_skel.so?tvm_hexagon_remote_skel_handle_invoke&_" \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

libtvm_hexagon_remote_skel.so is not in HexagonSDK, we build it. Could we also upload it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll upload the sources next week. You could use them to build the skel.so and the stub.so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a bit of delay. Hopefully I'll get that done next week.

CMakeLists.txt Show resolved Hide resolved
@FrozenGene
Copy link
Member

FrozenGene commented May 10, 2019

One common comment:
Currently, we only support CDSP. However, SnapDragon 820 is very popular in industry. 820 doesn't have CDSP and can not select. It only could go to ADSP. Currently, in this PR, we use domain_channel_handle to detect and set data.domain = CDSP_DOMAIN_ID . However, if we are in 820, we could just simply go to void* mem = rpcmem_alloc(RPCMEM_HEAP, RPCMEM_DEFAULT_FLAGS, size); For example:

# ifdef USE_CDSP
if (domain_channel_handle == AEE_EUNKNOWN) {
    if (remote_session_control) {
      remote_rpc_control_unsigned_module data;
      data.enable = 1;
      data.domain = CDSP_DOMAIN_ID;
      int rc = remote_session_control(DSPRPC_CONTROL_UNSIGNED_MODULE, &data,
                                      sizeof(data));
      LOG_TVM(ANDROID_LOG_DEBUG, "TVM",
              "HexagonTarget %s: remote_session_control rc=%d for unsigned PD",
              __func__, rc);
    } else {
      LOG_TVM(ANDROID_LOG_DEBUG, "TVM",
              "HexagonTarget %s: No unsigned PD support available", __func__);
    }
    int rc = tvm_hexagon_remote_open(tvm_hexagon_remote_URI "&_dom=cdsp",
                                     &domain_channel_handle);
    if (rc != AEE_SUCCESS) {
      LOG_TVM(ANDROID_LOG_ERROR, "TVM",
              "HexagonTarget %s:Failed to open channel rc=0x%x", __func__, rc);
      return nullptr;
    }
  }
#endif 
  void* mem = rpcmem_alloc(RPCMEM_HEAP, RPCMEM_DEFAULT_FLAGS, size);
  if (mem == nullptr) {
    LOG_TVM(ANDROID_LOG_ERROR, "TVM",
            "HexagonTarget %sMem alloc failed for size=0x%x alignment=0x%x",
            __func__, size, align);
  }
#if USE_CDSP
  int mem_fd = rpcmem_to_fd(mem);
  uint64_t dsp_va = 0;

  int rc = tvm_hexagon_remote_map_buffer_to_dsp(
      domain_channel_handle, mem_fd, static_cast<int>(size),
      reinterpret_cast<uint64*>(&dsp_va));
  if (rc != AEE_SUCCESS) {
    LOG_TVM(
        ANDROID_LOG_ERROR, "TVM",
        "HexagonTarget %s: buffer mapping to dsp failed for fd=0x%x rc=0x%x",
        __func__, mem_fd, rc);
    return nullptr;
  }

  // LOG_TVM(ANDROID_LOG_ERROR, "TVM",
  //  "Address from remote_map_buffer :0x%x", dsp_va);
  dsp_address_mapping.insert(
      std::make_pair(dsp_va, reinterpret_cast<uint64_t>(mem)));
  dsp_size_mapping.insert(std::make_pair(dsp_va, static_cast<uint64_t>(size)));
  // LOG_TVM(ANDROID_LOG_DEBUG, "TVM",
  //  "HexagonTarget %s: dsp_va:0x%x apps_va:0x%x mem_fd=0x%x",
  //  __func__, static_cast<uint64_t>(dsp_va),
  //  reinterpret_cast<uint64_t>(mem), mem_fd);
  return reinterpret_cast<void*>(dsp_va);
#endif
  return reinterpret_cast<void*>(mem)
}

Could we consider supporting it?

Secondly, in fact, in HexagonSDK, we have UbuntuARM, and we could also support Linux and the code is also very similar. Could we have plan to support it?

Copy link
Member

@junrushao junrushao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some high-level comments:

  1. Avoid hard-coding symbols when possible. If we have to dlopen something, it might be a good idea to have a clear message asking users to properly set some path.
  2. Seeing some code that is commented out. Let's remove them if they are truly useless.
  3. As Tianqi already mentioned, try avoiding asserts and std::cerr.

CMakeLists.txt Show resolved Hide resolved
// qidl copyright
// qidl nested=false
#include "AEEStdDef.h"
#include "android_Release_aarch64/remote.h"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have preference to use <> instead of quotes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that a question for me? I don't know. You tell me...

src/runtime/hexagon/hexagon_device_api.cc Show resolved Hide resolved
src/runtime/hexagon/hexagon_device_sim.cc Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_wrapper_dl.cc Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_module.cc Outdated Show resolved Hide resolved
@kparzysz-quic
Copy link
Contributor Author

I hope I have fixed all issues related to coding conventions. I'd like to get that out of the way before making any other changes.

@kparzysz-quic
Copy link
Contributor Author

The second things that need some elaboration(perhaps in the comment as well are the level of abstractions introduced.

  • At the highest level DeviceAPI and Module

  • hexagon::Device appears to be a low-level device abstraction for hexagon

  • There seems to be even a LowLevlAPI(I have not checked the code carefully) that contains

Yes, it does. Additionally, what we need is Alloc/Free (for memory) and Load/Unload (for shared objects). With that we could switch to LowLevelDeviceAPI (except that I haven't found in the sources).

Here is one possible proposal of organization:

  • src/runtime/hexagon/sim/sim_device.cc
  • src/runtime/hexagon/frpc/frpc_device.cc
  • src/runtime/hexagon/device/device_lib.cc (device side library for driving the code.)

Depending on the compilation flag, we only link sim or frpc. Both of them will implement hexagon::Device::Create which creates the hexagon::Device, or the corresponding appropriate subclass.

This approach requires that both sides (sim and frpc) define the same "factory" function. This means that they can never be built together. I'm not sure if that's good or bad, but we need to think about long term consequences of this choice.

@tqchen
Copy link
Member

tqchen commented May 11, 2019

I think it is fine to assume that we will either build simulator or frpc. We can also have two separate factory functions that allow us to link both.

As I mentioned in the last post. I think we should reduce the level of abstractions into two:

  • DeviceAPI, Module(tvm's original layer)
  • LowLevelDeviceAPI, (remove the intermediate hexagon::Device)

The reason is that we are essentially implementing a "driver" based on the low-level device interface. If the low-level interface is somewhat common between the simulator and real hardware, then we should only expose the LowLevelDeviceAPI.


/*! \brief Number of bytes each allocation must align to in temporary allocation */
constexpr int kTempAllocaAlignment = 64;
constexpr int kTempAllocaAlignment = 128;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this? https://github.com/dmlc/tvm/blob/master/src/pass/ir_util.h#L174 I think we should handle it specially for Hexagon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Someone suggested having a target-specific function that gets the required alignment, instead of having global constants. I think that would be a better solution. This patch simply has what we've done about it.
I haven't looked into it yet, is there a precedent for implementing such a thing? Do you have any suggestions as to how it should be done?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also suggest we have a target-specific function that gets the required alignment. Then in GetTempAllocaAlignment we could get this alignment and just simply return it for Hexagon (Need detect the target is Hexagon, for example we define the macro USE_HEXAGON in CMakeLists.txt).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment. How about create target-specific function alignment?

@kparzysz-quic
Copy link
Contributor Author

The sim and device runtimes are now separated, and the header files for each are incorporated into the corresponding .cc files.

@kparzysz-quic
Copy link
Contributor Author

Aside from the copyright issue, are there any other outstanding issues with the code? Have I missed anything?

@kparzysz-quic
Copy link
Contributor Author

One common comment:
Currently, we only support CDSP. However, SnapDragon 820 is very popular in industry. 820 doesn't have CDSP and can not select. It only could go to ADSP. [...]
Could we consider supporting it?

We are considering support for ADSP.

Secondly, in fact, in HexagonSDK, we have UbuntuARM, and we could also support Linux and the code is also very similar. Could we have plan to support it?

I'll get back to you about this one. As of now we haven't done anything with UbuntuARM.

Copy link
Member

@tqchen tqchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some additional comments, overall look like we are converging

* permitted use with any of the "vta" and "verilog" subdirectories in the TVM
* repo.
*/
#ifndef TVM_RUNTIME_HEXAGON_DEVICE_FASTRPC_TVM_HEXAGON_REMOTE_H_
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tvm_hexagon_remote.h ->hexagon_remote.h given TVM is implied

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is automatically generated by the IDL compiler. It comes from a library called tvm_hexagon_remote, so its name becomes tvm_hexagon_remote.h. Even though it's edited after coming out of the IDL compiler (adding license, formatting, renaming the header guard), I'd like to keep the name to indicate the connection.


#define WEAK __attribute__((weak))

namespace {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider use an explicit namespace anoymous namespace has problem udirng amalgmation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document why this shim file is needed at the beginning of the file.


namespace {

void* LoadLibCdspRpc() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider make The CdspRPC a class(CDSPRPCSession, put all the fields into the class, and use Global singleton pattern). The redirection can get global singleton and run.

src/runtime/hexagon/hexagon_module.cc Show resolved Hide resolved
src/runtime/hexagon/hexagon_module.h Show resolved Hide resolved
src/runtime/hexagon/hexagon_module.h Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_module.h Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_module.h Show resolved Hide resolved
src/runtime/hexagon/hexagon_module.h Show resolved Hide resolved
src/runtime/hexagon/hexagon_module.h Outdated Show resolved Hide resolved
src/runtime/hexagon/hexagon_module.h Outdated Show resolved Hide resolved
@kparzysz-quic
Copy link
Contributor Author

kparzysz-quic commented May 22, 2019

As I mentioned in the last post. I think we should reduce the level of abstractions into two:

  • DeviceAPI, Module(tvm's original layer)
  • LowLevelDeviceAPI, (remove the intermediate hexagon::Device)

The reason is that we are essentially implementing a "driver" based on the low-level device interface. If the low-level interface is somewhat common between the simulator and real hardware, then we should only expose the LowLevelDeviceAPI.

I just looked at the uTVM PR, and the commonality is only superficial. Hexagon is in no way a micro-device in the sense of that PR.

@tqchen
Copy link
Member

tqchen commented May 22, 2019

Sure, I just think there will be common design patterns that could be shared across the drivers. Given that you are already reducing the abstraction it is fine.

@kparzysz-quic
Copy link
Contributor Author

This is an update of the original PR. Since there has been a month of a delay, I upgraded the sources to the latest ones (the code in the original PR got quite a bit out of date).
Hopefully all concerns are now addressed. The reason for the delay was a need for another legal review regarding license headers and our additional notices. The Apache license header is now identical to the one from Apache website.

@tqchen
Copy link
Member

tqchen commented Jun 27, 2019

@FrozenGene @huajsj please help to do another round of review. I will also do a review in this week. Thanks @kparzysz-quic

@FrozenGene
Copy link
Member

I will do the review next week. Thanks.

This Contribution is being provided by Qualcomm Innovation Center, Inc.
See NOTICE file for more details.
@kparzysz-quic
Copy link
Contributor Author

Rebased to resolve conflicts.

@tqchen
Copy link
Member

tqchen commented Jun 30, 2019

I take a look at the new version of the license which contains the disclaimer text. We need to confirm again what is the implication especially after we migrate the repo to apache. I might wait until the repo migration before merging this.

While I feel it is OK, I would act caution and treat this differently(as if it was a different license). In particular, it would be great to have a clear isolated folder to point to, e.g. every file within src/runtime/hexagon has the disclaimer, while the rest of the files do not, and the disclaimer text can explicitly point to the subfolder, allowing the user to cut-off the content if necessary

NOTE: verilog folder has been removed and only vta folder will contain open source hardware design in the future.

@kparzysz-quic
Copy link
Contributor Author

Is there a time frame for the repo migration?

@@ -82,6 +82,7 @@ typedef enum {
kDLSDAccel = 6,
kOpenGL = 11,
// AddExtraTVMType which is not in DLPack here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this comment after kDLHexagon = 13

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.


/*! \brief Number of bytes each allocation must align to in temporary allocation */
constexpr int kTempAllocaAlignment = 64;
constexpr int kTempAllocaAlignment = 128;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment. How about create target-specific function alignment?

* Qualcomm Technologies, Inc. and Qualcomm Innovation Center, Inc. retain
* copyright of their respective Contributions.
*/
#ifdef __ANDROID__
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we try UbuntuARM64? From my experience, the code should be very similar with Android here? If we could add UbuntuARM64 support in this patch, I think it is better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't have any experience with UbuntuARM64 or any devices that run it. We can consider it in the future, but it won't happen right away.

TVM_LOGD("CDSP subsystem present");
} else if (!stat("/dev/subsys_adsp", &sb)) {
enable_domains_ = false;
TVM_LOGD("ADSP subsystem present");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not review. Just clarify one information: So, ADSP is supported and work well, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, ADSP is now supported on systems that have HVX.

TVM_LOGD("ADSP subsystem present");
}

constexpr auto domain_lib_name = "libtvm_hexagon_remote_stub.so";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we generate this .so? Is it contained in the hexagon.cmake build process?

@FrozenGene
Copy link
Member

@kparzysz-quic After reading https://discuss.tvm.ai/t/introducing-hexagon-backend/2421/13 again, about IDL preprocess, I think we could add one option like USE_HEXAGON_SDK. For example HEXAGON_IDL_PREPROCESS. when we set USE_HEXAGON is ON and HEXAGON_IDL_PREPROCESS is ON (default is false), we could run IDL compiler qaic. the IDL definition file could be existed in src/runtime/hexagon folder.

Currently, this patch contains the output of IDL, like tvm_hexagon_remote.h, right? But we could not find how to generate without IDL file.

@@ -175,6 +175,10 @@ def ext_dev(self, dev_id=0):
"""Construct extension device."""
return self.context(12, dev_id)

def hexagon(self, dev_id=0):
"""Construct extension device."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extension be Hexagon

@FrozenGene
Copy link
Member

FrozenGene commented Jul 26, 2019

@kparzysz-quic I have used your PR to try to build runtime. I think we need tvm_hexagon_remote.idl and tvm_hexagon_remote_nd.idl, which could be located in fast_rpc directory , which could let users do it by themself or not. Meanwhile, we lack of stub.c / skel.c generated by IDL. We can not get the libtvm_hexagon_remote_stub.so and libtvm_hexagon_remote_nd_stub.so and so on.

Secondly, we need one procedure in build system or instruction to tell user to produce libtvm_hexagon_remote_skel.so / libtvm_hexagon_remote_stub.so. Like Halide does: https://github.com/halide/Halide/blob/master/src/runtime/hexagon_remote/Makefile

}

void HexagonTarget::CopyDeviceToDevice(void* dst, const void* src,
unsigned len) {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When we execute model (such as NNVM frontend / Relay frontend), we will meet CopyDeviceToDevice, imagine we have module.get_output(0, tvm.nd.empty(output_shape, dtype="float32", ctx=ctx)).asnumpy(). The implementation is not difficult (Using GetAppsAddr to get src / dst app addr and memcpy). Leave this comment reminding us we should implement it and don't leave it empty. Otherwise, we will get 0 of output.

@kparzysz-quic
Copy link
Contributor Author

Replaced by #5252.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants