[GPU] Use enqueue_fill_mem instead enqueue_memcpy in fill
Lyamin-Roman committed Oct 14, 2024
1 parent 9ddda80 commit ac96e15
Showing 2 changed files with 3 additions and 7 deletions.
1 change: 1 addition & 0 deletions src/plugins/intel_gpu/src/graph/impls/ocl/gemm.cpp
@@ -157,6 +157,7 @@ struct gemm_impl : multi_stage_primitive<gemm> {
         if (instance.get_input_layout(0).count() == 0 ||
             instance.get_input_layout(1).count() == 0) {
             stream& stream = instance.get_network().get_stream();
+            stream.enqueue_barrier();
             return instance.output_memory_ptr()->fill(stream);
         }

9 changes: 2 additions & 7 deletions src/plugins/intel_gpu/src/runtime/ocl/ocl_memory.cpp
@@ -517,14 +517,9 @@ event::ptr gpu_usm::fill(stream& stream, unsigned char pattern) {
     auto& cl_stream = downcast<ocl_stream>(stream);
     auto ev = stream.create_base_event();
     cl::Event& ev_ocl = downcast<ocl_event>(ev.get())->get();
-    // enqueueFillUsm call will never finish. Driver bug? Uncomment when fixed. Some older drivers doesn't support enqueueFillUsm call at all.
-    // cl_stream.get_usm_helper().enqueue_fill_mem<unsigned char>(cl_stream.get_cl_queue(), _buffer.get(), pattern, _bytes_count, nullptr, &ev_ocl)
-    // Workarounded with enqeue_memcopy. ToDo: Remove below code. Uncomment above.
-    std::vector<unsigned char> temp_buffer(_bytes_count, pattern);
-    // TODO: Do we really need blocking call here? Non-blocking one causes accuracy issues right now, but hopefully it can be fixed in more performant way.
-    const bool blocking = true;
     try {
-        cl_stream.get_usm_helper().enqueue_memcpy(cl_stream.get_cl_queue(), _buffer.get(), temp_buffer.data(), _bytes_count, blocking, nullptr, &ev_ocl);
+        cl_stream.get_usm_helper().enqueue_fill_mem(
+            cl_stream.get_cl_queue(), _buffer.get(), static_cast<const void*>(&pattern), sizeof(unsigned char), _bytes_count, nullptr, &ev_ocl);
     } catch (cl::Error const& err) {
         OPENVINO_THROW(OCL_ERR_MSG_FMT(err));
     }
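
Note on the change above: the removed path built a host-side staging vector of _bytes_count bytes and copied it to the device with a blocking call, while the new path passes only a pointer to the pattern, the pattern size, and the byte count. The following is a minimal host-only sketch of why the two are expected to leave the same bytes in the destination. It does not call OpenCL or the OpenVINO runtime; fill_pattern_model and fill_via_staging_copy are hypothetical names introduced here for illustration, and the "repeat the pattern across the region" behaviour is an assumption about how a pattern fill behaves, not a copy of enqueue_fill_mem.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-side model of a pattern fill (NOT the real enqueue_fill_mem):
// repeats `pattern_size` bytes from `pattern` across `size` bytes at `dst`.
// Assumes `size` is a multiple of `pattern_size`.
inline void fill_pattern_model(void* dst, const void* pattern,
                               std::size_t pattern_size, std::size_t size) {
    assert(pattern_size > 0 && size % pattern_size == 0);
    auto* out = static_cast<unsigned char*>(dst);
    for (std::size_t off = 0; off < size; off += pattern_size) {
        std::memcpy(out + off, pattern, pattern_size);
    }
}

// Hypothetical model of the removed workaround: allocate a staging buffer
// of `size` bytes pre-filled with the pattern, then copy it to `dst`.
inline void fill_via_staging_copy(void* dst, unsigned char pattern, std::size_t size) {
    std::vector<unsigned char> temp(size, pattern);  // extra host allocation of `size` bytes
    if (size != 0) {
        std::memcpy(dst, temp.data(), size);
    }
}
```

With a one-byte pattern, as in gpu_usm::fill, both helpers produce identical destination contents; the difference in this sketch is only that the staging variant allocates a temporary buffer as large as the destination.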