The search() function in StackSearch runs the GPU-based search over the entire cross product of starting pixels and search_list velocities. Given N x M images and V different velocities, it runs N * M * V searches and stores 8 * N * M * V results in a results vector. To avoid overloading Python, it provides a function to read the results back in batches.
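To make the scale concrete, here is a small sketch of the arithmetic above. The function name and the image and velocity sizes are hypothetical, chosen only for illustration; the factor of 8 is the one stated in this issue.

```python
def full_result_count(width, height, num_velocities, results_per_search=8):
    """Count the results stored when every search runs in a single pass.

    results_per_search defaults to the factor of 8 quoted in the issue text.
    """
    num_searches = width * height * num_velocities
    return results_per_search * num_searches


# Hypothetical sizes, not taken from any real data set.
count = full_result_count(width=1000, height=1000, num_velocities=100)
print(count)  # 800000000
```

Even before accounting for the size of each result record, the count grows with the product of all three dimensions.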
We could reduce memory usage by batching inside the search function instead. The search function would run a subset of the searches (maybe a range of starting pixels) and return the results directly. It would not need to store the full result set in memory on either the GPU (during computation) or the CPU (after computation). Individual batches could be filtered before the next set is run.
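A minimal Python sketch of the proposed control flow, assuming batches are ranges of starting-pixel rows. Everything here is hypothetical: `batched_search`, `score_fn` (a stand-in for the GPU search of one pixel and velocity) and `keep_fn` (a stand-in for the filter) are illustration names, not part of the existing code.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

Pixel = Tuple[int, int]
Velocity = Tuple[float, float]
Result = Tuple[Pixel, Velocity, float]


def batched_search(
    width: int,
    height: int,
    velocities: Sequence[Velocity],
    rows_per_batch: int,
    score_fn: Callable[[Pixel, Velocity], float],
    keep_fn: Callable[[Result], bool],
) -> Iterator[List[Result]]:
    """Yield filtered results one batch of starting-pixel rows at a time."""
    if rows_per_batch <= 0:
        raise ValueError("rows_per_batch must be positive")
    for y_start in range(0, height, rows_per_batch):
        y_end = min(y_start + rows_per_batch, height)
        # Only this batch's results exist in memory at any one time.
        batch = [
            ((x, y), v, score_fn((x, y), v))
            for y in range(y_start, y_end)
            for x in range(width)
            for v in velocities
        ]
        # Filter before the next batch is run.
        yield [r for r in batch if keep_fn(r)]
```

Because the function is a generator, the caller decides what to retain from each batch, and the unfiltered results of one batch are discarded before the next batch is computed.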
To do this batching efficiently, we need to do the precomputation (creating the psi_images and phi_images) once and copy the data to the GPU once. This would require new functions to prepare and clean up the GPU memory (both based on deviceSearchFilter).
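One way to picture the prepare and clean-up pair is as a context manager that holds the uploaded data for the lifetime of all batches. This is only a sketch of the lifecycle in Python: the class name `PreparedSearch`, its methods, and the tuple copy standing in for the GPU transfer are all assumptions, and no real GPU calls are made.

```python
class PreparedSearch:
    """Hold precomputed psi/phi data across many batch searches (hypothetical)."""

    def __init__(self, psi_images, phi_images):
        self._psi = psi_images
        self._phi = phi_images
        self._device_data = None  # stand-in for GPU-resident buffers
        self.upload_count = 0

    def __enter__(self):
        # Prepare step: stand-in for copying psi/phi data to the GPU once.
        self._device_data = (tuple(self._psi), tuple(self._phi))
        self.upload_count += 1
        return self

    def search_batch(self, start_row, end_row):
        # Reuses the prepared data; a real version would launch the GPU search.
        if self._device_data is None:
            raise RuntimeError("search_batch called outside prepare/cleanup")
        return list(range(start_row, end_row))  # placeholder results

    def __exit__(self, exc_type, exc, tb):
        # Clean-up step: stand-in for freeing the GPU memory.
        self._device_data = None
        return False


with PreparedSearch(psi_images=[1.0, 2.0], phi_images=[0.5, 0.5]) as search:
    for start in range(0, 6, 2):
        search.search_batch(start, start + 2)
```

The point of the structure is that the upload happens once regardless of how many batches run, and the clean-up runs even if a batch raises.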