Repeated calls to nearest with same y #116
For CPU, we are currently using

Also pinging @mrjel here who promised me some time ago to look into this ;)

I eventually found this project: https://github.com/lxxue/FRNN. The project is in a good state in that I was able to pull it straight off the shelf and use it to accelerate my training. I haven't spoken to the author, but I don't think the project is in a state where it will be well maintained. I don't have enough experience with open-source software or the PyTorch Geometric project to judge, but the obvious question from my perspective is: should torch_cluster have a

Thanks for the pointer. Pinging @mrjel here as well, as he always shared interest in adding this feature to

This issue had no activity for 6 months. It will be closed in 2 weeks unless there is some new activity. Is this issue already resolved?
In my work I find myself making frequent CUDA calls to `torch_cluster.nearest` of the form `nearest(different_every_time, same_every_time)`, without providing a `batch_x` or `batch_y`. `different_every_time` is on the order of, say, `(40000, 3)` and `same_every_time` is `(2000, 3)`. If this could be accelerated by an order of magnitude, that would have significant value to me.

Any suggestions? Does anyone else find themselves in a similar situation? Do they have a solution? Would a solution have significant value to the community?

I assume that the strategy would be to pre-compute some kind of tree data structure, and then provide that: `nearest_with_tree(different_every_time, precomputed_tree_structure)`.

On CPU I guess this would be a kd-tree, and it would be orders and orders of magnitude faster than computing the 40000 × 2000 pairwise distances. On CUDA, I think the current `torch_cluster.nearest` is computing all of those distances, but of course the parallelisation and memory access patterns on CUDA might change the game in terms of gains from a tree structure?
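To make the proposal concrete, here is a minimal NumPy/SciPy sketch of the two-phase CPU path described above. It is not torch_cluster API: `build_tree` and `nearest_with_tree` are hypothetical names taken from this thread, a SciPy `cKDTree` stands in for the precomputed structure, and the array sizes are scaled down from the issue's 40000 × 2000 so the brute-force cross-check stays small.

```python
import numpy as np
from scipy.spatial import cKDTree

def build_tree(y):
    # Phase 1: pay the tree-build cost once for the fixed point set.
    return cKDTree(y)

def nearest_with_tree(x, tree):
    # Phase 2: each query descends the tree instead of scanning all of y.
    _, idx = tree.query(x, k=1)
    return idx

rng = np.random.default_rng(0)
same_every_time = rng.random((200, 3))        # fixed set ((2000, 3) in the issue)
tree = build_tree(same_every_time)            # built once, reused across calls

different_every_time = rng.random((4000, 3))  # fresh queries every call
idx = nearest_with_tree(different_every_time, tree)

# Cross-check against the full pairwise-distance computation that
# torch_cluster.nearest effectively performs today.
d2 = ((different_every_time[:, None, :] - same_every_time[None, :, :]) ** 2).sum(-1)
assert (idx == d2.argmin(axis=1)).all()
```

The point of the split is that the build cost is amortised over every subsequent call with the same `same_every_time`, which is exactly the repeated-`y` pattern this issue is about.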