-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Hexagon] Generalize builtin for Nd memory alloc with storage scope and add lowering for VTCM / Hexagon #10558
Conversation
Breakage was caused by apache#9727, which didn't account for the new `builtin::mem_copy()` when computing the stack size in `StackSizeChecker`.
… hexagon-alloc-nd-lower-vtcm
@@ -163,6 +170,63 @@ TVM_REGISTER_GLOBAL("device_api.hexagon.mem_copy").set_body([](TVMArgs args, TVM | |||
*rv = static_cast<int32_t>(0); | |||
}); | |||
|
|||
std::map<void*, HexagonBuffer*> vtcmallocs; | |||
|
|||
TVM_REGISTER_GLOBAL("device_api.hexagon.AllocNd").set_body([](TVMArgs args, TVMRetValue* rv) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of using a lambda for the body, can we move this to a separate function? It tends to make debugging easier later on, and becomes a template for a generalized function we can propose adding to c_runtime_api.cc
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR creates a separate function called AllocVtcmWorkspace
in the Hexagon Device API that is called from the lambda and which mirrors AllocTextureWorkspace
in the OpenCL device API. These two APIs are the template for a more generalized Device API to "allocate a workspace with storage scope" but this should occur in a follow-up PR, in my opinion.
Note that the lambda must construct Device
and DLDataType
arguments for AllocVtcmWorkspace
from primitive types passed through the builtin before making the call to AllocVtcmWorkspace
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking great with few minor suggestions!
@kparzysz-quic please let me know if you have any feedback, this is very nearly ready to merge |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
My only comment is about naming conventions in |
Good point. If it's all right with you let's clean this up in the subsequent PR that will be needed to clean up the |
Thanks @adstraw! |
Forgot to add the usual message :) |
…nd add lowering for VTCM / Hexagon (apache#10558) * repurpose texture flatten for vtcm; TIR lowering correct * clean up remaining code in texture flatten pass * add Alloc and FreeTexture, but failing to run over rpc * test passing with malloc in the device api * cleanup * fails in very reliable way with memory corruption * working with non-HexagonBuffer vtcm alloc * cleanup * do not pass scope through mem_copy api * [Hexagon] Resolve breakage in test_hexagon/test_cache_read_write Breakage was caused by apache#9727, which didn't account for the new `builtin::mem_copy()` when computing the stack size in `StackSizeChecker`. * use HexagonBuffer in Alloc and Free packed funcs * Added comment indicating need for StackSizeChecker::MakeMemCopy. * add AllocVtcmWorkspace and FreeVtcmWorkspace * cleanup * Updated unittests to run all contrib/test_hexagon at CI. * create separate vtcm alloc lowering pass and transform * reset texture_flatten.cc * comments * CI bump * Fix lint formatting error. * Updated fix to remove StackSizeChecker entirely. * pass device and type to device api * Bugfix, verify the precheck's allocations, not own. * Bugfix, pass context information to the precheck. * pass order and shape to device api * working * fix up types and arg passing * pass scope to device api * common builtin for texture / vtcm * add scope to freend api * format and lint * fixed missed format error * restart ci * fix test random value issue + code review feedback * fix test hang * restructure lower vtcm pass per code review feedback (option a) * format error * global.vtcm + tvm_stack_make_shape Co-authored-by: Eric Lunderberg <[email protected]>
This PR modifies the
texture2d_alloca
builtin --- a specialized intrinsic for allocating texture memory --- to a more general form callednd_mem_alloc_with_scope
and provides a lowering for that builtin for both texture memory / OpenCL and VTCM / Hexagon.This change allows for allocations from
cache_read
andcache_write
schedule primitives with VTCM storage scope to be lowered to specialized interfaces in the Hexagon Device API to allocate and free VTCM:AllocVtcmWorkspace
andFreeVtcmWorkspace
. Where previously those allocations were lowered toAllocWorkspace
andFreeWorkspace
which allocate and free DDR.This PR makes possible but does not address a standard (rather than specialized)
AllocWorkspace
with storage scope in the Device API.