-
Notifications
You must be signed in to change notification settings - Fork 663
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: add two point distance function #1512
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -208,6 +208,23 @@ def _check_lengths_match(*arrays): | |
raise ValueError("Input arrays must all be same shape" | ||
"Got {0}".format([a.shape for a in arrays])) | ||
|
||
|
||
def distance(a, b, box=None): | ||
"""calc distances between two points. Uses algorithm from Tuckerman | ||
|
||
""" | ||
if box is None: | ||
return np.linalg.norm(a - b) | ||
|
||
box = triclinic_vectors(box) | ||
ibox = np.linalg.inv(box) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You can explicitly calculate the inverse as in #1475. This is marginally faster in numpy but might be a lot faster in cython. import numpy as np
def make_box():
a = [np.random.random(), 0, 0]
b = np.zeros(3)
b[:2] = np.random.random(2)
c = np.random.random(3)
return 20 * np.array([a, b, c])
def ibox(box):
a,b,c = box
return np.array([[1/a[0], 0, 0], [-b[0]/(a[0]*b[1]), 1/b[1], 0], [(b[0]*c[1] - b[1]*c[0])/(a[0]*b[1]*c[2]), -c[1]/(b[1]*c[2]), 1/c[2]]]) Timing: box = make_box()
%timeit ibox(box)
# 7.33 µs ± 108 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit np.linalg.inv(box)
#9.98 µs ± 247 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each) Correctness In [42]: np.dot(box, ibox(box))
Out[42]:
array([[ 1.00000000e+00, 0.00000000e+00, 0.00000000e+00],
[ 1.90479818e-18, 1.00000000e+00, 0.00000000e+00],
[ 3.40613054e-17, -7.40127872e-17, 1.00000000e+00]]) (Closed form from sympy...). from sympy.interactive.printing import init_printing
init_printing(use_unicode=False, wrap_line=False, no_global=True)
a0, b0, b1, c0, c1, c2 = sym.symbols("a0, b0, b1, c0, c1, c2")
h = sym.Matrix([[a0, b0, c0], [0, b1, c1], [0, 0, c2]])
h.inv()
In [35]: h
Out[35]:
[a0 b0 c0]
[ ]
[0 b1 c1]
[ ]
[0 0 c2]
In [37]: h.inv()
Out[37]:
[1 -b0 b0*c1 - b1*c0]
[-- ----- -------------]
[a0 a0*b1 a0*b1*c2 ]
[ ]
[ 1 -c1 ]
[0 -- ----- ]
[ b1 b1*c2 ]
[ ]
[ 1 ]
[0 0 -- ]
[ c2 ] There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes with cython this can be speed up. But I have to rewrite to use less variables to get the full speedup and not keep calling python functions. from numba import jit
import numpy as np
@jit
def jit_ibox(box):
inv = np.zeros(box.shape)
inv[0, 0] = 1 / box[0, 0]
inv[1, 0] = -box[1,0] / (box[0,0]*box[1,1])
inv[1, 1] = 1 / box[1,1]
inv[2, 0] = (box[1,0]*box[2,1] - box[1,1]*box[2,0])/(box[0,0]*box[1,1]*box[2,2])
inv[2, 1] = -box[2,1]/(box[1,1]*box[2,2])
inv[2, 2] = 1 / box[2,2]
return inv note: I tested this with numba because it has similar performance but much shorter test cycles
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This looks promising; what is the performance of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sour don't what you report. It might not help much though because of the dot products that are used. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry autocorrect. My performance of Linalg.inv is comparable to yours. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thanks. And yes, you're right, of course, the matrix-vector multiplications are bad (although I don't know if There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. it does but the python function call as well as the input verification that numpy does is expensive. Maybe we can unpack that more using explicit formulas calculated with sympy. |
||
sa = np.dot(ibox, a) | ||
sb = np.dot(ibox, b) | ||
s = sa - sb | ||
s -= np.round(s) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. they are equivalent |
||
return np.linalg.norm(np.dot(box, s)) | ||
|
||
|
||
def distance_array(reference, configuration, box=None, result=None, backend="serial"): | ||
"""Calculate all distances between a reference set and another configuration. | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc string should state that this takes any PBC into account. I would also reference Tuckerman properly (and possibly even write out the math because we would want to use it as a test case for correctness and in this case, one cannot be explicit enough.)