Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[2.5.0] Center of charge test failures on macOS runners #4211

Closed
IAlibay opened this issue Jul 26, 2023 · 30 comments · Fixed by #4220
Closed

[2.5.0] Center of charge test failures on macOS runners #4211

IAlibay opened this issue Jul 26, 2023 · 30 comments · Fixed by #4220

Comments

@IAlibay
Copy link
Member

IAlibay commented Jul 26, 2023

Expected behavior

All tests on our latest production release should pass across expected machine types (especially ubuntu, and macos).

Actual behavior

As reported by @orbeckst here (and reproduced in #4210), tests for CoC unwrapping are failing.

This isn't seen on develop.

Code to reproduce the behavior

Run tests on the 2.5.0 on an intel macOS machine

@orbeckst
Copy link
Member

Do you even have an idea what could cause the issue?

Is it specific to Intel Mac??

@IAlibay
Copy link
Member Author

IAlibay commented Jul 26, 2023

Yeah Intel Mac only. It's odd because the tests haven't changed, which makes me think that the base numpy version we use to cythonize might be to blame here. I want to check that once the new CI checks are in place.

@IAlibay
Copy link
Member Author

IAlibay commented Jul 29, 2023

Actually I don't know if it's specific to x86_64 intel - can anyone with a macos-arm64 confirm that this happens if they pull from conda-forge?

Note: this is only happening on conda-forge. I'm currently thinking it might be down to clang 15.

@IAlibay
Copy link
Member Author

IAlibay commented Jul 29, 2023

I will add here that the osx-arm64 builds on the feedstock aren't facing the issue, so maybe it is just x86_64: https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=753485&view=logs&j=9c5ef928-2cd6-52e5-dbe6-9d173a7d951b&t=c66c4975-bbef-551b-2ee6-85005a231b12

@IAlibay
Copy link
Member Author

IAlibay commented Jul 29, 2023

tagging @RMeli who has both an intel and arm macos machine

@IAlibay
Copy link
Member Author

IAlibay commented Jul 29, 2023

Where I'm at on all this:

  1. Conda-forge builds have the above failure, but not PyPi ones
  2. Building from PyPi source is also fine
  3. All other platforms but x86_64 macos seem fine
  4. The relevant code in 2.6.0 does not seem to have changed.
  5. The previous conda-forge builds of 2.5.0 also fail (so it's unlikely to be something related to building with Cython 3)
  6. Building with Cython 3 using PyPi does not seem to cause any issues
  7. Using clang 14 (as is used by default on gh actions) instead of clang 15 (conda-forge default) does not change the issue

@IAlibay
Copy link
Member Author

IAlibay commented Jul 29, 2023

Short of someone giving me access to a macos x86_64 machine, I'm rapidly running out of options (and weekend patience..) on this one.

@RMeli
Copy link
Member

RMeli commented Jul 29, 2023

I can confirm that I don't see this issue on osx-arm64:

mamba create -n mda-2.5.0-test MDAnalysis=2.5.0 MDAnalysisTests pytest pytest-xdist
mamba activate mda-2.5.0-test
pytest --disable-pytest-warnings --pyargs MDAnalysisTests --numprocesses 4
================================ 18559 passed, 403 skipped, 8 xfailed, 2 xpassed, 9533 warnings in 230.61s (0:03:50) =================================

I'll try x86_64 next and see what I can dig up.

@hmacdope
Copy link
Member

Where I'm at on all this:

  1. Conda-forge builds have the above failure, but not PyPi ones
  2. Building from PyPi source is also fine
  3. All other platforms but x86_64 macos seem fine
  4. The relevant code in 2.6.0 does not seem to have changed.
  5. The previous conda-forge builds of 2.5.0 also fail (so it's unlikely to be something related to building with Cython 3)
  6. Building with Cython 3 using PyPi does not seem to cause any issues
  7. Using clang 14 (as is used by default on gh actions) instead of clang 15 (conda-forge default) does not change the issue

This smells of compiler flags to me, I'll try and see if any extra optimisations on CF.

@IAlibay
Copy link
Member Author

IAlibay commented Jul 30, 2023

This smells of compiler flags to me, I'll try and see if any extra optimisations on CF.

Here are the flags apparently set by CF:

2023-07-29T12:06:29.9787810Z +CXXFLAGS=-march=core2 -mtune=haswell -mssse3 -ftree-vectorize -fPIC -fPIE -fstack-protector-strong -O2 -pipe -stdlib=libc++ -fvisibility-inlines-hidden -fmessage-length=0 -isystem $PREFIX/include -fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/mdanalysis-2.5.0 -fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix

@hmacdope
Copy link
Member

Strange that they use -mtune=haswell rather than generic but shouldn't effect the accuracy. Could we try drop -msse3 and see what happens? Could be an extra SIMD optimisation that is changing something.

@IAlibay
Copy link
Member Author

IAlibay commented Jul 30, 2023

:/ I think we're going to have to create a custom build.sh on conda-forge, that's a lot of work. Maybe @RMeli or anyone else with an Intel MacOS machine can first check locally if building mdanalysis with or without -msse3 does the trick?

@RMeli
Copy link
Member

RMeli commented Jul 31, 2023

I'm AFK, but I can certainly try ASAP.

@orbeckst
Copy link
Member

orbeckst commented Aug 1, 2023

If someone gives me instructions that even an idiot (I mean, PI) could follow then I'll be happy to do it. Sorry, a bit short on time to figure it out.

@hmacdope
Copy link
Member

hmacdope commented Aug 1, 2023

Should be enough to uncomment extra_cflags here and add -msse3 ,
Should look like

extra_cflags = -msse3 

then recompile on your Intel Mac and see if that breaks the tests

@orbeckst
Copy link
Member

orbeckst commented Aug 1, 2023

On x86_64 intel macOS 13.5 with Python 3.10 and the -msse3 flag I still get the same failures.

diff --git a/package/setup.cfg b/package/setup.cfg
index b4151d55f..18ca2f399 100644
--- a/package/setup.cfg
+++ b/package/setup.cfg
@@ -9,7 +9,7 @@
 ## Uncomment the below line to generate the annotated cython HTML output
 # annotate_cython=True
 ## any additional compiler flags for core C routines, excluding ENCORE
-#extra_cflags =
+extra_cflags = -msse3

 [build_sphinx]
 all-files = 1
$ clang --version
Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: x86_64-apple-darwin22.6.0
Thread model: posix
InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin

set up and testing

mamba create -c conda-forge -n mda310y python==3.10 pytest pytest-xdist
mamba activate mda310y
pip install -v -e package/
pip install -v -e testsuite/

Inspecting the output shows that

  • Cython 3.0 is used (and complains about deprecation of DEF)
  • -msse3 shows up in clang lines

run tests

pytest --disable-pytest-warnings --pyargs MDAnalysisTests -k Testcenter_of_charge

results

(mda310y) deathstar:mdanalysis oliver$ pytest --disable-pytest-warnings --pyargs MDAnalysisTests -k Testcenter_of_charge
======================================================= test session starts ========================================================
platform darwin -- Python 3.10.0, pytest-7.4.0, pluggy-1.2.0
rootdir: ~/MDAnalysis/mdanalysis
plugins: hypothesis-6.82.0, xdist-3.3.1
collected 19037 items / 19022 deselected / 7 skipped / 15 selected

testsuite/MDAnalysisTests/core/test_topologyattrs.py ..........FFFFF                                                         [100%]

============================================================= FAILURES =============================================================
___________________________________________ Testcenter_of_charge.test_coc_unwrap[group] ____________________________________________

self = <MDAnalysisTests.core.test_topologyattrs.Testcenter_of_charge object at 0x130740430>, u = <Universe with 4 atoms>
compound = 'group'

    @pytest.mark.parametrize('compound', compounds)
    def test_coc_unwrap(self, u, compound):
        u.atoms.wrap
        coc = u.atoms[:2].center_of_charge(compound=compound, unwrap=True)
>       assert_equal(coc.flatten(), [0, -0.5, 0])

~/MDAnalysis/mdanalysis/testsuite/MDAnalysisTests/core/test_topologyattrs.py:693:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

args = (<built-in function eq>, array([0.        , 0.49999997, 0.        ]), [0, -0.5, 0])
kwds = {'err_msg': '', 'header': 'Arrays are not equal', 'strict': False, 'verbose': True}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError:
E           Arrays are not equal
E
E           Mismatched elements: 1 / 3 (33.3%)
E           Max absolute difference: 0.99999997
E           Max relative difference: 1.99999994
E            x: array([0. , 0.5, 0. ])
E            y: array([ 0. , -0.5,  0. ])

~/tmp/micromamba-root/envs/mda310y/lib/python3.10/contextlib.py:79: AssertionError
__________________________________________ Testcenter_of_charge.test_coc_unwrap[segments] __________________________________________

self = <MDAnalysisTests.core.test_topologyattrs.Testcenter_of_charge object at 0x1307405b0>, u = <Universe with 4 atoms>
compound = 'segments'

    @pytest.mark.parametrize('compound', compounds)
    def test_coc_unwrap(self, u, compound):
        u.atoms.wrap
        coc = u.atoms[:2].center_of_charge(compound=compound, unwrap=True)
>       assert_equal(coc.flatten(), [0, -0.5, 0])

~/MDAnalysis/mdanalysis/testsuite/MDAnalysisTests/core/test_topologyattrs.py:693:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

args = (<built-in function eq>, array([0.        , 0.49999997, 0.        ]), [0, -0.5, 0])
kwds = {'err_msg': '', 'header': 'Arrays are not equal', 'strict': False, 'verbose': True}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError:
E           Arrays are not equal
E
E           Mismatched elements: 1 / 3 (33.3%)
E           Max absolute difference: 0.99999997
E           Max relative difference: 1.99999994
E            x: array([0. , 0.5, 0. ])
E            y: array([ 0. , -0.5,  0. ])

~/tmp/micromamba-root/envs/mda310y/lib/python3.10/contextlib.py:79: AssertionError
__________________________________________ Testcenter_of_charge.test_coc_unwrap[residues] __________________________________________

self = <MDAnalysisTests.core.test_topologyattrs.Testcenter_of_charge object at 0x130740640>, u = <Universe with 4 atoms>
compound = 'residues'

    @pytest.mark.parametrize('compound', compounds)
    def test_coc_unwrap(self, u, compound):
        u.atoms.wrap
        coc = u.atoms[:2].center_of_charge(compound=compound, unwrap=True)
>       assert_equal(coc.flatten(), [0, -0.5, 0])

~/MDAnalysis/mdanalysis/testsuite/MDAnalysisTests/core/test_topologyattrs.py:693:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

args = (<built-in function eq>, array([0.        , 0.49999997, 0.        ]), [0, -0.5, 0])
kwds = {'err_msg': '', 'header': 'Arrays are not equal', 'strict': False, 'verbose': True}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError:
E           Arrays are not equal
E
E           Mismatched elements: 1 / 3 (33.3%)
E           Max absolute difference: 0.99999997
E           Max relative difference: 1.99999994
E            x: array([0. , 0.5, 0. ])
E            y: array([ 0. , -0.5,  0. ])

~/tmp/micromamba-root/envs/mda310y/lib/python3.10/contextlib.py:79: AssertionError
_________________________________________ Testcenter_of_charge.test_coc_unwrap[molecules] __________________________________________

self = <MDAnalysisTests.core.test_topologyattrs.Testcenter_of_charge object at 0x1307406d0>, u = <Universe with 4 atoms>
compound = 'molecules'

    @pytest.mark.parametrize('compound', compounds)
    def test_coc_unwrap(self, u, compound):
        u.atoms.wrap
        coc = u.atoms[:2].center_of_charge(compound=compound, unwrap=True)
>       assert_equal(coc.flatten(), [0, -0.5, 0])

~/MDAnalysis/mdanalysis/testsuite/MDAnalysisTests/core/test_topologyattrs.py:693:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

args = (<built-in function eq>, array([0.        , 0.49999997, 0.        ]), [0, -0.5, 0])
kwds = {'err_msg': '', 'header': 'Arrays are not equal', 'strict': False, 'verbose': True}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError:
E           Arrays are not equal
E
E           Mismatched elements: 1 / 3 (33.3%)
E           Max absolute difference: 0.99999997
E           Max relative difference: 1.99999994
E            x: array([0. , 0.5, 0. ])
E            y: array([ 0. , -0.5,  0. ])

~/tmp/micromamba-root/envs/mda310y/lib/python3.10/contextlib.py:79: AssertionError
_________________________________________ Testcenter_of_charge.test_coc_unwrap[fragments] __________________________________________

self = <MDAnalysisTests.core.test_topologyattrs.Testcenter_of_charge object at 0x130740760>, u = <Universe with 4 atoms>
compound = 'fragments'

    @pytest.mark.parametrize('compound', compounds)
    def test_coc_unwrap(self, u, compound):
        u.atoms.wrap
        coc = u.atoms[:2].center_of_charge(compound=compound, unwrap=True)
>       assert_equal(coc.flatten(), [0, -0.5, 0])

~/MDAnalysis/mdanalysis/testsuite/MDAnalysisTests/core/test_topologyattrs.py:693:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

args = (<built-in function eq>, array([0.        , 0.49999997, 0.        ]), [0, -0.5, 0])
kwds = {'err_msg': '', 'header': 'Arrays are not equal', 'strict': False, 'verbose': True}

    @wraps(func)
    def inner(*args, **kwds):
        with self._recreate_cm():
>           return func(*args, **kwds)
E           AssertionError:
E           Arrays are not equal
E
E           Mismatched elements: 1 / 3 (33.3%)
E           Max absolute difference: 0.99999997
E           Max relative difference: 1.99999994
E            x: array([0. , 0.5, 0. ])
E            y: array([ 0. , -0.5,  0. ])

~/tmp/micromamba-root/envs/mda310y/lib/python3.10/contextlib.py:79: AssertionError
===================================================== short test summary info ======================================================
FAILED testsuite/MDAnalysisTests/core/test_topologyattrs.py::Testcenter_of_charge::test_coc_unwrap[group] - AssertionError:
FAILED testsuite/MDAnalysisTests/core/test_topologyattrs.py::Testcenter_of_charge::test_coc_unwrap[segments] - AssertionError:
FAILED testsuite/MDAnalysisTests/core/test_topologyattrs.py::Testcenter_of_charge::test_coc_unwrap[residues] - AssertionError:
FAILED testsuite/MDAnalysisTests/core/test_topologyattrs.py::Testcenter_of_charge::test_coc_unwrap[molecules] - AssertionError:
FAILED testsuite/MDAnalysisTests/core/test_topologyattrs.py::Testcenter_of_charge::test_coc_unwrap[fragments] - AssertionError:
============================== 5 failed, 10 passed, 7 skipped, 19022 deselected, 6 warnings in 27.97s ==============================

@hmacdope
Copy link
Member

hmacdope commented Aug 1, 2023

Sorry @orbeckst my theory was that the flag causes the issue. Do you get the issue without the flag ?

@orbeckst
Copy link
Member

orbeckst commented Aug 1, 2023

Ok, I'll try a build without first.

@orbeckst
Copy link
Member

orbeckst commented Aug 1, 2023

(FYI:

warning: MDAnalysis/lib/nsgrid.pyx:86:0: The 'DEF' statement is deprecated and will be removed in a future Cython version. Consider using global variables, constants, and in-place literals instead. See cython/cython#4310

from Cython...)

@orbeckst
Copy link
Member

orbeckst commented Aug 1, 2023

I built without -msse3 and the tests still fail in the same way.

@orbeckst
Copy link
Member

orbeckst commented Aug 1, 2023

I removed -ffast-math and suddenly the test passes (without -msse3).

We wanted to remove it anyway #3821 — can I submit a PR?

diff --git a/package/setup.py b/package/setup.py
index 898ab5b3b..f6467bc4b 100755
--- a/package/setup.py
+++ b/package/setup.py
@@ -268,7 +268,7 @@ def extensions(config):
     use_openmp = config.get('use_openmp', default=True)
     annotate_cython = config.get('annotate_cython', default=False)

-    extra_compile_args = ['-std=c99', '-ffast-math', '-O3', '-funroll-loops',
+    extra_compile_args = ['-std=c99', '-O3', '-funroll-loops',
                           '-fsigned-zeros'] # see #2722
     define_macros = []
     if config.get('debug_cflags', default=False):

result

(mda310y) deathstar:mdanalysis oliver$ pytest --disable-pytest-warnings --pyargs MDAnalysisTests -k Testcenter_of_charge
======================================================= test session starts ========================================================
platform darwin -- Python 3.10.0, pytest-7.4.0, pluggy-1.2.0
rootdir: ~/MDAnalysis/mdanalysis
plugins: hypothesis-6.82.0, xdist-3.3.1
collected 19037 items / 19022 deselected / 7 skipped / 15 selected

testsuite/MDAnalysisTests/core/test_topologyattrs.py ...............                                                         [100%]

=================================== 15 passed, 7 skipped, 19022 deselected, 6 warnings in 6.68s ====================================

@orbeckst
Copy link
Member

orbeckst commented Aug 1, 2023

@hmacdope I also tested with -msse3 and tests still pass, as long as we don't use fast-math.

diff --git a/package/setup.cfg b/package/setup.cfg
index b4151d55f..18ca2f399 100644
--- a/package/setup.cfg
+++ b/package/setup.cfg
@@ -9,7 +9,7 @@
 ## Uncomment the below line to generate the annotated cython HTML output
 # annotate_cython=True
 ## any additional compiler flags for core C routines, excluding ENCORE
-#extra_cflags =
+extra_cflags = -msse3

 [build_sphinx]
 all-files = 1
diff --git a/package/setup.py b/package/setup.py
index 898ab5b3b..f6467bc4b 100755
--- a/package/setup.py
+++ b/package/setup.py
@@ -268,7 +268,7 @@ def extensions(config):
     use_openmp = config.get('use_openmp', default=True)
     annotate_cython = config.get('annotate_cython', default=False)

-    extra_compile_args = ['-std=c99', '-ffast-math', '-O3', '-funroll-loops',
+    extra_compile_args = ['-std=c99', '-O3', '-funroll-loops',
                           '-fsigned-zeros'] # see #2722
     define_macros = []
     if config.get('debug_cflags', default=False):

orbeckst added a commit that referenced this issue Aug 1, 2023
- fix #4211
- fix #3821
- Without -ffast-math, the AtomGroup.center_of_charge(..., unwrap=True) pass again
  on Intel x86_64 macOS.
- updated CHANGELOG
@orbeckst
Copy link
Member

orbeckst commented Aug 1, 2023

@hmacdope is the output of the tests on intel mac actually wrong or just showing up on the other side of the box, i.e., still correct under PBC?

@hmacdope
Copy link
Member

hmacdope commented Aug 1, 2023

Playing with it on paper it seems that way. I will spare you my shoddily drawn diagrams. I swear we agreed to remove --fast-math a while ago. Thanks for investigating.

@RMeli
Copy link
Member

RMeli commented Aug 1, 2023

Thanks @orbeckst for stepping in, I was on holidays the past couple of days.

Great investigation, thanks for all the details! I can reproduce your investigation, so #4220 LGTM (since we wanted to remove the flag anyways).

@IAlibay
Copy link
Member Author

IAlibay commented Aug 2, 2023

Probably something to add here, I think the reason that our PyPi cron checks weren't failing but the conda-forge ones are is that the former is defaulting to gcc whilst the latter is using clang.

We probably need to rethink what our build matrix's default compilers are at some point (I started doing that at some point and it became super messy).

@hmacdope
Copy link
Member

hmacdope commented Aug 2, 2023

I would advocate for clang if possible.

@IAlibay
Copy link
Member Author

IAlibay commented Aug 2, 2023

🤦🏽 I think I know what I did wrong. This is a me forgetting what I did in the past issue.

@orbeckst
Copy link
Member

orbeckst commented Aug 2, 2023

@hmacdope just to clarify your comment #4211 (comment) : I read it as that you're confirming that the failing tests here are still showing scientifically correct results but that they are producing alternative results (i.e., it's more of a reproducibility issue). Is that a correct interpretation?

(I am asking because that seems to determine how we will handle PR #4220 .)

@hmacdope
Copy link
Member

hmacdope commented Aug 2, 2023

@hmacdope just to clarify your comment #4211 (comment) : I read it as that you're confirming that the failing tests here are still showing scientifically correct results but that they are producing alternative results (i.e., it's more of a reproducibility issue). Is that a correct interpretation?

(I am asking because that seems to determine how we will handle PR #4220 .)

Yep that is correct. It's basically a sign bit issue due to some denormal to 0 squashing. Results are scientifically correct.

RMeli pushed a commit that referenced this issue Aug 2, 2023
* Fix AtomGroup.center_of_charge(..., unwrap=True) giving
    inconsistent (but scientifically correct) results on Intel macOS
    (Issue #4211)

  * Removed `-ffast-math` compiler flag to avoid potentially incorrect
    floating point results in MDAnalysis *and* other codes when
    MDAnalysis shared libraries are loaded --- see 
    https://moyix.blogspot.com/2022/09/someones-been-messing-with-my-subnormals.html
    (Issues #3821, #4211)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants