-
-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Linear and efficient neighbour finder #393
Conversation
could you just share your benchmark script? For completeness sake ;) |
Done! I updated the comment, it is hidden in the "See the full script" collapsible at the top of the benchmarks. |
Just a comment. In your ase timing, I think you are also timing the conversion of the sisl object to the ase object. Try and move this conversion out and see if this changes the picture. I get quite good timings with ase all across. |
Hmm I moved the conversion out of the timing and I get almost the same. Could you share which systems are you testing? When I began testing, I only tiled the |
As I said, I think the problem is that the |
I fucked it up ;) Nevertheless moving it outside is appropriate :) |
I incorporated the distance filtering inside the fortran loop (previously it was done a posteriori in python) and now it is even more efficient. I kept the old ones to compare for now (dashed lines in this plot): Now one can scale up the tiled fcc even more (compare with previous fcc plot which showed up to 1.6M atoms): In fact I could keep making it larger if it wasn't because it results in an integer overflow 😅 |
This is amazing!! I'll have a look asap, but I'll try and get #388 done first. |
Ok! I'm going to keep optimizing some things now that I know everything works fine :) |
I added (optional) periodic boundary conditions. For now, what is done is to return the supercell indices of the neighbors (in three extra columns). Also, I added a |
This is almost ready, the only two things that need to be done are:
|
Thanks, I still plan on getting to this very interesting and useful thing! :) I would however, probably change it to a Cython code, since fortran does not play nice with C-arrays (it may incur copies that are not necessary. So I first need to understand what you are doing, and then subsequently change it a bit. :) |
Isn't there a way to check if copies are made? I tried to do it and didn't get any warning, but probably it is just because I did it wrong 😅 |
pip3 install . --f2py-report-copy should suffice, however, I would still like to use C instead of fortran. Simply because of compatibility issues going forward. numpy distutils will be discontinued in 3.12, and then I need to change to |
I'm trying to move this to cython 👍 |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #393 +/- ##
==========================================
- Coverage 86.14% 85.78% -0.37%
==========================================
Files 364 367 +3
Lines 43511 43904 +393
==========================================
+ Hits 37483 37662 +179
- Misses 6028 6242 +214 ☔ View full report in Codecov by Sentry. |
I have a Cython question. Imagine I have the geometry's coordinates in an array called cdef np.ndarray[np.float_64, ndim=2] xyz
cdef np.ndarray[np.float_64] atom_xyz
atom_xyz[:] = xyz[3, :] Cython shows python interaction and it slows down the code. Is there any way of avoiding the loop: for i in range(3):
atom_xyz[i] = xyz[3, i] without hurting performance? |
You should generally avoid using Also, I guess you mean: atom_xyz[:] = xyz[ia, :] no? |
also, try and use |
Yes, I was already doing that! I got that there was python interaction in this part, but I didn't know that with a memoryview you could do this. I've changed it now and all good! |
Great job! I'll probably take over and clean things up, also so I know what's happening, I want to learn this one ;) |
Just let me know when I should take over! :0 |
Ok, you can look at it now, everything is implemented 👍 |
Ok some comments:
I am still working on it, so don't push anything ;) |
|
Do you have a plan for making |
Signed-off-by: Nick Papior <[email protected]>
future.annotations requires Cython>=3 Signed-off-by: Nick Papior <[email protected]>
changed class and argument names to make them simpler. Also changed default of overlap to False. The internal R is now an array, regardless, but it can have either ndim 0 or 1, depending on whether it is an array or not. Signed-off-by: Nick Papior <[email protected]>
This is not to diminish other contributors. Pol has made tremendous contributions. This is a delicate decision, and is in no way a discouragement of other contributions made. Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
In some cases (when the bin-size is big) the maximum pairs it would allow could be quite big. When the bins has many atoms, the error in the actual neighbors can be significant, leading to memory problems. Now the finder uses a maximum of 200MB of initial neighbor lists. This results in incremental re-allocations which is a bit more heavy, but it reduces the chances of running out of memory. One can change the default memory by changing the class variable: Neighbor.memory accordingly. Added it to the documentation. Signed-off-by: Nick Papior <[email protected]>
Yes, however I think it would be best to figure out the #553 problem in a better way. Perhaps |
Ok! Also, can we make |
Signed-off-by: Nick Papior <[email protected]>
Thank you! One less thing to worry about hehe |
Thanks for contribution! |
This follows #393 naming convention (US-style). Signed-off-by: Nick Papior <[email protected]>
This PR implements a linear scaling algorithm to find neighbours.
The algorithm
It is mostly inspired by the method described here. I don't know if it is the same way SIESTA does it.
Basically you divide the whole unit cell into 3D bins and you "store" each atom in the corresponding bin. Each bin is as big as the maximum distance that you have to look for neighbours. E.g. if the maximum radius is 1.5 Ang, then the bins need to be of size 3 Ang (6 Ang if you are looking for sphere overlaps.). That ensures that when you look for the neighbors of a given atom/position you only need to go one bin away in each direction. Therefore, you just need to explore
2**3 = 8
bins for each atom/position instead of all the atoms in the structure.The concept is the same as what ASE uses in their neighbour list (https://wiki.fysik.dtu.dk/ase/ase/neighborlist.html). The implementation is quite different though and you will see this in the benchmarks (below).
How to use it
Currently, you have to import the class
NeighFinder
:Then instantiate it with a geometry, which will do the organization of the bins. The parameters
R
andsphere_overlap
need to be provided here as well, since the size of the bins depends on them.Once instantiated, you will be able to query the neighbour finder as much as you want. Although the most efficient way to query it is all at once. There are two methods to find neighbours:
find_neighbours
andfind_all_unique_pairs
.The first one accepts
atoms
as an argument to find only the neighbours for a subset.You can also choose to return the neighbours as pairs if this is the way you need them, which is faster (splitting to get the above result is actually just an extra internal step):
Also, if you want just all unique pairs of neighbouring atoms you can call
find_all_unique_pairs
, which will save time and memory.Benchmarks
Here I benchmark the implementation against
ASE
and a trivialclose
search.The
quadratic
search is just a loop like:The
ASE
call looks like:and this implementation call is:
See the full script
First, I compare the calculation of neighbors for an
fcc
system, which I tile uniformly in all directionsAgainst quadratic implementation:
From as little as 40 atoms, this implementation is faster. After that, the quadratic implementation blows up. This is even more critical if the system has periodic boundary conditions. With
nsc=[3,3,3]
:Note that the implementation in this PR only needs to calculate some offsets if there are pbc and we are at the border of the cell, but still the search is only performed on 8 bins.
Note also that this implementation can scale to huge systems:
It works even better in sparse systems, where the number of candidate neighbors for each atom is much lower since some bins don't contain atoms:
Now, comparing with
ASE
, I have found that their algorithm scales pretty bad with the cell size:This is (I think), because they store the location of each atom in a dense array of shape
(nbins, max_atoms_per_bin)
. I followed the implementation described here. The usage of memory seems to be much more optimized. In fact, I have checked that this implementation almost keeps constant time if we increase the cell while keeping the number of atoms constant, whileASE
's scales linearly with it:Work to do
The current implementation only finds neighbors in the unit cell. As I mentioned before, one only needs to apply offsets if need it. I already compute the supercell indices of the bins that are searched, so it shouldn't be much of a problem.
However, I have some doubts:
nsc > 1
. However, the user may not know beforehand if there are inter-cell interactions for a given radius. Should we requirensc
to be 3 anyway?nsc
. So the user would need to set nsc themselves or receive the new values ofnsc
for them to use the indices properly.Also if the neighbor search spans more than 1 neighboring cell (
nsc > 3
), I will make it compute neighbors with the quadratic algorithm.Cheers!