Only output bo needs to be synced from device after result is available #849
base: main
Changes from all commits
1228cf7
9003f1f
ae3024c
c47b1c6
1fd9113
336c485
@@ -333,11 +333,10 @@ static iree_status_t iree_hal_xrt_direct_command_buffer_dispatch(
   // Third argument is the number of LX6 instructions.
   run.set_arg(arg_index++, kernel_params.num_instr);
 
+  xrt::bo ofm_bo;
+
   // Copy descriptors from all sets to the end of the current segment for later
   // access.
-  // TODO(jornt): hack to ensure that the output buffer is synced by syncing all
-  // buffers after the run.
-  std::vector<xrt::bo> bos;
   // TODO(max): do we need multiple descriptor sets ever for AIE?
   uint32_t set = 0;
   IREE_RETURN_AND_END_ZONE_IF_ERROR(
@@ -348,8 +347,11 @@ static iree_status_t iree_hal_xrt_direct_command_buffer_dispatch(
         xrt::bo(*command_buffer->descriptor_sets[set].bindings[j],
                 command_buffer->descriptor_sets[set].lengths[j],
                 command_buffer->descriptor_sets[set].offsets[j]);
-    bos.push_back(arg_buffer);
     run.set_arg(arg_index + j, arg_buffer);
+    bool not_ofm = (bindings.values[j].buffer->memory_type & IREE_HAL_MEMORY_TYPE_HOST_VISIBLE) &&
+                   (bindings.values[j].buffer->allowed_usage & IREE_HAL_BUFFER_USAGE_MAPPING);
+    if (!not_ofm)
+      ofm_bo = arg_buffer;
   }
 
   run.start();
@@ -360,7 +362,7 @@ static iree_status_t iree_hal_xrt_direct_command_buffer_dispatch(
     return iree_make_status(IREE_STATUS_UNKNOWN, e.what());
   }
 
-  for (xrt::bo& bo : bos) bo.sync(XCL_BO_SYNC_BO_FROM_DEVICE);
+  ofm_bo.sync(XCL_BO_SYNC_BO_FROM_DEVICE);
Just because today we only have one output, doesn't mean tomorrow we will only have one output.

So far all NPU products only use one output buffer.

This has absolutely nothing to do with NPU and everything to do with the model.

Do we have a model running on NPU with multiple output bo?

This indeed doesn't have anything to do with NPU and whether we currently have a model like this or not. We shouldn't make the assumption that there is a single output buffer.
 
   IREE_TRACE_ZONE_END(z0);
   return iree_ok_status();
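The review thread on the `ofm_bo.sync(...)` line objects to assuming a single output buffer. One possible shape for a fix, sketched here under loud assumptions, is to sync every binding the kernel may have written rather than tracking one `ofm_bo`. Everything in this sketch is hypothetical and not part of the PR or of XRT: `FakeBo` stands in for `xrt::bo` so the example compiles without XRT, and `Binding`, `written_by_kernel`, and `sync_outputs` are illustrative names. How the driver would actually decide which bindings are outputs is not settled in this thread.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for xrt::bo so the sketch builds without XRT.
// It only counts device-to-host syncs instead of moving data.
struct FakeBo {
  int syncs_from_device = 0;
  void sync_from_device() { ++syncs_from_device; }
};

// Hypothetical per-binding record. The `written_by_kernel` flag is an
// assumption: the real driver would need to derive it from binding metadata.
struct Binding {
  FakeBo* bo;
  bool written_by_kernel;
};

// Syncs every binding the kernel may have written, however many there are,
// and skips read-only inputs. Returns the number of buffers synced.
std::size_t sync_outputs(const std::vector<Binding>& bindings) {
  std::size_t synced = 0;
  for (const Binding& b : bindings) {
    if (!b.written_by_kernel) continue;  // input only: nothing to copy back
    b.bo->sync_from_device();
    ++synced;
  }
  return synced;
}
```

With one input and two outputs, this would sync exactly the two output buffers, which keeps the saving the PR is after (no sync for inputs) without hard-coding the number of outputs.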
IREE_HAL_BUFFER_USAGE_MAPPING is also not the correct flag: https://github.com/iree-org/iree/blob/66342abbfaaee707e27ecc7d8151ad9e357ca0da/runtime/src/iree/hal/buffer.h#L389
Please refer to this comment:
#849 (comment)