Two unit tests on main are failing when run on GPUs. One of them should be easy to resolve via the usual strategy of changing a comparison tolerance, but the other one looks a bit more serious.
=================================== FAILURES ===================================
__________________________ test_minimize[CG-float32] ___________________________
dtype = <class 'jax.numpy.float32'>, method = 'CG'
@pytest.mark.parametrize("dtype", [snp.float32, snp.complex64])
@pytest.mark.parametrize("method", ["CG", "L-BFGS-B"])
def test_minimize(dtype, method):
from scipy.linalg import block_diag
B, M, N = (4, 3, 2)
# Models a 12x8 block-diagonal matrix with 4x3 blocks
A, key = random.randn((B, M, N), dtype=dtype)
x, key = random.randn((B, N), dtype=dtype)
y = snp.sum(A * x[:, None], axis=2) # contract along the N axis
# result by directly inverting the dense matrix
A_mat = block_diag(*A)
> expected = np.linalg.pinv(A_mat) @ y.ravel()
scico/test/test_solver.py:172:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
<__array_function__ internals>:5: in pinv
???
/miniconda3/envs/py39gpu/lib/python3.9/site-packages/numpy/linalg/linalg.py:2002: in pinv
u, s, vt = svd(a, full_matrices=False, hermitian=hermitian)
<__array_function__ internals>:5: in svd
???
/miniconda3/envs/py39gpu/lib/python3.9/site-packages/numpy/linalg/linalg.py:1660: in svd
u, s, vh = gufunc(a, signature=signature, extobj=extobj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
err = 'invalid value', flag = 8
def _raise_linalgerror_svd_nonconvergence(err, flag):
> raise LinAlgError("SVD did not converge")
E numpy.linalg.LinAlgError: SVD did not converge
/miniconda3/envs/py39gpu/lib/python3.9/site-packages/numpy/linalg/linalg.py:97: LinAlgError
_________________________ test_binary_op[float32-add] __________________________
testobj = <test_linop.LinearOperatorTestObj object at 0x7faa2049e160>
operator = <built-in function add>
@pytest.mark.parametrize("operator", [op.add, op.sub])
def test_binary_op(testobj, operator):
# Our AbsMatOp class does not override the __add__, etc
# so AbsMatOp + AbsMatOp -> LinearOperator
# So to verify results, we evaluate the new LinearOperator on a random input
comp_mat = operator(testobj.A, testobj.B) # composite matrix
comp_op = operator(testobj.Ao, testobj.Bo) # composite linop
assert isinstance(comp_op, linop.LinearOperator) # Ensure we don't get a Map
assert comp_op.input_dtype == testobj.A.dtype
> np.testing.assert_allclose(comp_mat @ testobj.x, comp_op @ testobj.x, rtol=5e-5)
E AssertionError:
E Not equal to tolerance rtol=5e-05, atol=0
E
E Mismatched elements: 1 / 8 (12.5%)
E Max absolute difference: 9.536743e-07
E Max relative difference: 7.0016424e-05
E x: array([-2.417045, 1.69006 , 2.91617 , 0.009365, 3.242247, 9.085916,
E 7.729687, -7.39012 ], dtype=float32)
E y: array([-2.417045, 1.69006 , 2.916171, 0.009364, 3.242248, 9.085917,
E 7.729687, -7.39012 ], dtype=float32)
scico/test/linop/test_linop.py:108: AssertionError
=========================== short test summary info ============================
FAILED scico/test/test_solver.py::test_minimize[CG-float32] - numpy.linalg.Li...
FAILED scico/test/linop/test_linop.py::test_binary_op[float32-add] - Assertio...
====== 2 failed, 3071 passed, 3 skipped, 11 xfailed in 742.76s (0:12:22) =======
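For the `test_binary_op` failure, the maximum absolute difference (9.536743e-07, which is 2^-20) equals one float32 spacing for values between 8 and 16. The single mismatched element appears to be the one near 0.0094, where `atol=0` leaves no slack for an error of that size. The sketch below reproduces this with values copied from the rounded printout above (so not the exact float32 values from the run) and shows one possible adjustment, a small absolute tolerance. The helper `is_close` and the value `atol=1e-5` are illustrative assumptions, not the fix adopted in the repository.

```python
import numpy as np

# Values copied from the rounded pytest printout (not the exact float32 bits).
x = np.array(
    [-2.417045, 1.69006, 2.91617, 0.009365, 3.242247, 9.085916, 7.729687, -7.39012],
    dtype=np.float32,
)
y = np.array(
    [-2.417045, 1.69006, 2.916171, 0.009364, 3.242248, 9.085917, 7.729687, -7.39012],
    dtype=np.float32,
)


def is_close(actual, desired, rtol, atol):
    """Hypothetical helper: True if assert_allclose passes, False otherwise."""
    try:
        np.testing.assert_allclose(actual, desired, rtol=rtol, atol=atol)
        return True
    except AssertionError:
        return False


# Same settings as the failing assertion: purely relative tolerance.
strict = is_close(x, y, rtol=5e-5, atol=0)

# Assumed alternative: add a small absolute tolerance so that elements
# close to zero are not judged by relative error alone.
relaxed = is_close(x, y, rtol=5e-5, atol=1e-5)

print("rtol only:", strict)
print("rtol + atol:", relaxed)
```

Adding `atol` rather than loosening `rtol` keeps the check tight for the large elements while tolerating rounding noise in the small one; whether that is appropriate for this test is a judgment call for the maintainers.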
I only get the second (easier to fix) failure when I run these, with jaxlib == 0.3.5+cuda11.cudnn82 and jax == 0.3.5. @bwohlberg can you help me reproduce this?
I was able to replicate the error with Python 3.9.12, jaxlib 0.3.5 (0.3.5+cuda11.cudnn82) and jax 0.36. It does not occur when running only pytest -x scico/test/test_solver.py, so the failure must depend on state left behind by tests that ran before it.
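Given the state dependency, one way to narrow this down is to check whether the input to `np.linalg.pinv` is already non-finite when the test fails. The `err = 'invalid value'` line in the traceback is consistent with NaN or inf values in `A_mat`, though the log does not confirm that. A minimal sketch, where `pinv_checked` is a hypothetical helper and not part of scico or NumPy:

```python
import numpy as np


def pinv_checked(a, name="matrix"):
    """Hypothetical helper: pseudo-inverse that fails early on NaN/inf input."""
    a = np.asarray(a)
    bad = ~np.isfinite(a)
    if bad.any():
        # Report the count and the first offending index as plain ints.
        first = tuple(int(i) for i in np.argwhere(bad)[0])
        raise ValueError(
            f"{name} has {int(bad.sum())} non-finite entries, first at index {first}"
        )
    return np.linalg.pinv(a)


# Finite input: behaves like np.linalg.pinv.
good = np.eye(3, dtype=np.float32)
result = pinv_checked(good, name="A_mat")

# Input with a NaN: raises a descriptive error before the SVD is attempted.
bad_input = good.copy()
bad_input[0, 1] = np.nan
try:
    pinv_checked(bad_input, name="A_mat")
    message = None
except ValueError as exc:
    message = str(exc)

print(message)
```

If a check like this fires only in the full test run, the non-finite values are coming from the random inputs or from state left by earlier tests rather than from the SVD itself.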