The spatial samplers currently all run as single-threaded function calls on the CPU. There may be scope for accelerating most of them on many-core architectures, given that they basically all loop over a set of points.
Immediate issues:
Memory transfer: the cell list is built for each data frame, so either the cell list or the individual "data_points" subsets found from it would need to be transferred to the device. This will be costly and hard to hide; there might be scope for using a CUDA MPI type approach.
The calls to filter() usually work on small subsets (< 50 points), so these would need to be bundled up and run in parallel to make the most of a GPU.
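To make the bundling idea concrete, here is a minimal sketch in plain Python of one possible layout: the variable-length subsets are packed into a single flat value buffer plus an offsets array, so they could be moved in one transfer and processed in one batched call. The names `pack_subsets` and `batched_mean_filter` are hypothetical, and the mean is only a stand-in for whatever filter() actually computes; none of this is the library's API.

```python
from array import array


def pack_subsets(subsets):
    """Flatten variable-length subsets into one value buffer plus offsets.

    Segment i occupies values[offsets[i]:offsets[i + 1]].
    """
    values = array("d")
    offsets = array("l", [0])
    for subset in subsets:
        values.extend(subset)
        offsets.append(len(values))
    return values, offsets


def batched_mean_filter(values, offsets):
    """Hypothetical stand-in for filter(): one output value per segment.

    On a GPU, each iteration of this loop would map to one thread or
    work-group; here it simply runs serially on the CPU.
    """
    out = array("d")
    for i in range(len(offsets) - 1):
        start, end = offsets[i], offsets[i + 1]
        n = end - start
        # An empty segment has no mean, so report NaN for it.
        out.append(sum(values[start:end]) / n if n else float("nan"))
    return out


# Three small subsets, as might be found from the cell list.
subsets = [[1.0, 2.0, 3.0], [10.0], [4.0, 6.0]]
values, offsets = pack_subsets(subsets)
results = batched_mean_filter(values, offsets)
```

With this layout only two contiguous buffers cross the host/device boundary per data frame, rather than one small buffer per filter() call.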
Ideally a solution should be hardware agnostic, so it should focus on either a low-level approach like OpenCL or a higher-level one like SYCL.
Work is currently underway, through a funded Intel oneAPI Centre of Excellence, to accelerate the parts of the library that rely on linear algebra (e.g. Radial Basis) using SYCL.