Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tests fails on Mac ARM and packages from macports #162

Closed
nilason opened this issue Feb 2, 2024 · 26 comments
Closed

Tests fails on Mac ARM and packages from macports #162

nilason opened this issue Feb 2, 2024 · 26 comments

Comments

@nilason
Copy link

nilason commented Feb 2, 2024

Running the tests in a MacPorts setting, on an Apple ARM machine, the architecture test fails:

Executing:  cd "/opt/local/var/macports/build/_Users_nilason_ports_python_py-threadpoolctl/py311-threadpoolctl/work/threadpoolctl-3.2.0" && /opt/local/Library/Frameworks/Python.framework/Versions/3.11/bin/python3.11 -m installer /opt/local/var/macports/build/_Users_nilason_ports_python_py-threadpoolctl/py311-threadpoolctl/work/threadpoolctl-3.2.0-py3-none-any.whl --destdir /opt/local/var/macports/build/_Users_nilason_ports_python_py-threadpoolctl/py311-threadpoolctl/work/destroot 
--->  Testing py311-threadpoolctl
Executing:  cd "/opt/local/var/macports/build/_Users_nilason_ports_python_py-threadpoolctl/py311-threadpoolctl/work/threadpoolctl-3.2.0" && py.test-3.11 -o addopts='' 
========================================================================= test session starts ==========================================================================
platform darwin -- Python 3.11.7, pytest-7.4.3, pluggy-1.3.0
rootdir: /opt/local/var/macports/build/_Users_nilason_ports_python_py-threadpoolctl/py311-threadpoolctl/work/threadpoolctl-3.2.0
plugins: env-1.1.3, cov-4.1.0, xdist-3.3.1, anyio-4.2.0
collected 63 items

tests/test_threadpoolctl.py ....s.sss.ssssssss.ssssss..s..s.......sssssss.sssss..sss...F..s                                                                      [100%]

=============================================================================== FAILURES ===============================================================================
__________________________________________________________________________ test_architecture ___________________________________________________________________________

    def test_architecture():
        expected_openblas_architectures = (
            # XXX: add more as needed by CI or developer laptops
            "armv8",
            "Haswell",
            "Prescott",  # see: https://github.com/xianyi/OpenBLAS/pull/3485
            "SkylakeX",
            "Sandybridge",
            "VORTEX",
            "Zen",
        )
        expected_blis_architectures = (
            # XXX: add more as needed by CI or developer laptops
            "skx",
            "haswell",
        )
        for lib_info in threadpool_info():
            if lib_info["internal_api"] == "openblas":
>               assert lib_info["architecture"] in expected_openblas_architectures
E               AssertionError: assert 'ARMV8' in ('armv8', 'Haswell', 'Prescott', 'SkylakeX', 'Sandybridge', 'VORTEX', ...)

tests/test_threadpoolctl.py:617: AssertionError
======================================================================= short test summary info ========================================================================
FAILED tests/test_threadpoolctl.py::test_architecture - AssertionError: assert 'ARMV8' in ('armv8', 'Haswell', 'Prescott', 'SkylakeX', 'Sandybridge', 'VORTEX', ...)
=============================================================== 1 failed, 26 passed, 36 skipped in 0.44s ===============================================================
Command failed:  cd "/opt/local/var/macports/build/_Users_nilason_ports_python_py-threadpoolctl/py311-threadpoolctl/work/threadpoolctl-3.2.0" && py.test-3.11 -o addopts='' 
Exit code: 1
Error: Failed to test py311-threadpoolctl: command execution failed

The following patch make the tests succeed:

--- tests/test_threadpoolctl.py.orig
+++ tests/test_threadpoolctl.py
@@ -600,6 +600,7 @@
     expected_openblas_architectures = (
         # XXX: add more as needed by CI or developer laptops
         "armv8",
+        "ARMV8",
         "Haswell",
         "Prescott",  # see: https://github.com/xianyi/OpenBLAS/pull/3485
         "SkylakeX",

Would that be a proper fix?

@ogrisel
Copy link
Contributor

ogrisel commented Feb 6, 2024

Interesting, I cannot reproduce on my M1 mac. Out of curiosity, how did you install numpy / openblas?

  • when installing numpy from pip
$ python -m threadpoolctl -i numpy
[
  {
    "user_api": "blas",
    "internal_api": "openblas",
    "num_threads": 8,
    "prefix": "libopenblas",
    "filepath": "/Users/ogrisel/miniforge3/envs/tmp/lib/python3.11/site-packages/numpy/.dylibs/libopenblas64_.0.dylib",
    "version": "0.3.23.dev",
    "threading_layer": "pthreads",
    "architecture": "armv8"
  }
]
  • when installing numpy from conda-forge:
$ python -m threadpoolctl -i numpy
[
  {
    "user_api": "blas",
    "internal_api": "openblas",
    "num_threads": 8,
    "prefix": "libopenblas",
    "filepath": "/Users/ogrisel/miniforge3/envs/tmp/lib/libopenblas.0.dylib",
    "version": "0.3.26",
    "threading_layer": "openmp",
    "architecture": "VORTEX"
  },
  {
    "user_api": "openmp",
    "internal_api": "openmp",
    "num_threads": 8,
    "prefix": "libomp",
    "filepath": "/Users/ogrisel/miniforge3/envs/tmp/lib/libomp.dylib",
    "version": null
  }
]

@ogrisel
Copy link
Contributor

ogrisel commented Feb 6, 2024

Irrespective of this, please feel free to open a PR with your suggested fix.

@ogrisel
Copy link
Contributor

ogrisel commented Feb 6, 2024

From your initial report, I suppose you used macports to install numpy and openblas. It's weird that it introduces casing variability but so be it. We might want to make the assertion in the test case-insensitive.

threadpoolctl should probably still report the name with the original casing of the underlying library though, just in case it matters for debugging low-level runtime/packaging problems.

@nilason
Copy link
Author

nilason commented Feb 7, 2024

Yes, I'm working with MacPorts, where I also co-maintain a couple of GIS related ports (MP lingo for packages).

I turns out the command gives the following result:

 % /opt/local/bin/python3.12 -m threadpoolctl -i numpy
[]

I also tried programmatically using examples in the README with the same negative result. What could be the reason to this?

@nilason
Copy link
Author

nilason commented Feb 7, 2024

For what it is worth, MPs OpenBLAS in not built with OpenMP support:

cat /opt/local/lib/pkgconfig/openblas.pc
libdir=/opt/local/lib
libsuffix=
includedir=/opt/local/include/openblas

openblas_config=USE_64BITINT= NO_CBLAS= NO_LAPACK= NO_LAPACKE= DYNAMIC_ARCH=OFF DYNAMIC_OLDER=OFF NO_AFFINITY=1 USE_OPENMP= ARMV8 MAX_THREADS=56 
Name: OpenBLAS
Description: OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version
Version: 0.3.25
URL: https://github.com/OpenMathLib/OpenBLAS
Libs:  -L${libdir} -lopenblas${libsuffix} 
Cflags: -I${includedir}

According to Conda recipe, the build target for Apple ARM is VORTEX, whereas MP use ARMV8.

@jeremiedbb
Copy link
Collaborator

We might want to make the assertion in the test case-insensitive.

I think so and opened #165

@jeremiedbb
Copy link
Collaborator

jeremiedbb commented Feb 7, 2024

I also tried programmatically using examples in the README with the same negative result. What could be the reason to this?

Hard to tell. Maybe the name of the shared library file doesn't match what we expect or we're not able to find the expected symbols in it. I'd apply this patch before running the command line again to check to check if we're missing the library somehow or if it's just not loaded in the first place.

diff --git a/threadpoolctl.py b/threadpoolctl.py
index cfebcb5..d913bc8 100644
--- a/threadpoolctl.py
+++ b/threadpoolctl.py
@@ -986,6 +986,7 @@ class ThreadpoolController:
         for i in range(n_dyld):
             filepath = ctypes.string_at(libc._dyld_get_image_name(i))
             filepath = filepath.decode("utf-8")
+            print(filepath)
 
             # Store the library controller if it is supported and selected
             self._make_controller_from_path(filepath)
@@ -1070,6 +1071,7 @@ class ThreadpoolController:
             # library. move to next library.
             if prefix is None:
                 continue
+            print("detected prefix", prefix)
 
             # workaround for BLAS libraries packaged by conda-forge on windows, which
             # are all renamed "libblas.dll". We thus have to check to which BLAS
@@ -1102,6 +1104,8 @@ class ThreadpoolController:
                 for func in controller_class.check_symbols
             ):
                 self.lib_controllers.append(lib_controller)
+            else:
+                print("no required symbols")
 
     def _check_prefix(self, library_basename, filename_prefixes):
         """Return the prefix library_basename starts with

@nilason
Copy link
Author

nilason commented Feb 7, 2024

The patch results in a number of lines with "detected prefix None".

@jeremiedbb
Copy link
Collaborator

Right, I updated the patch to only show the relevant ones to avoid the noise

@nilason
Copy link
Author

nilason commented Feb 7, 2024

Now we have tree lines of "detected prefix libblas".

@jeremiedbb
Copy link
Collaborator

jeremiedbb commented Feb 7, 2024

hum... We explicitely rule out shared libs named libblas (except on windows) because they might not include all the symbols that threadpoolctl needs. We had an issue where openblas came with a libopenblas.so containing all the required symbols and a libblas.so containing only a subset of symbols making threadpoolctl crash.

It may be time to reconsider supporting libblas on all platforms but I don't know how to avoid the issue I described above.

@jeremiedbb
Copy link
Collaborator

And the fact that you detect 3 libblas and not just one is a hint that there's something fishy. I'm not sure we can reliably handle libblas

@nilason
Copy link
Author

nilason commented Feb 7, 2024

Where is this "libblas.dylib" supposed to be? I don't have it in MacPorts directory.

@jeremiedbb
Copy link
Collaborator

jeremiedbb commented Feb 7, 2024

you should have the path just above the "detected prefix libblas" lines. Could you paste the full output here ?

@nilason
Copy link
Author

nilason commented Feb 7, 2024

Where is this "libblas.dylib" supposed to be? I don't have it in MacPorts directory.

Or where (in code) is this detected?

@nilason
Copy link
Author

nilason commented Feb 7, 2024

Aha, is comes from system:
/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib

@jeremiedbb
Copy link
Collaborator

note that this is not openblas but macos' veclib which we don't support because it doesn't expose an api to control the number of threads which is the purpose of threadpoolctl

@nilason
Copy link
Author

nilason commented Feb 7, 2024

Is this something loaded from threadpoolctl or numpy?

@jeremiedbb
Copy link
Collaborator

the shared lib is first loaded by numpy, then threadpoolctl inspect all loaded libs but doesn't load them itself

@nilason
Copy link
Author

nilason commented Feb 7, 2024

Seem numpy isn't involved in that part. I run with a non-existing module:

python3.12 -m threadpoolctl -i xyz     
WARNING: could not import xyz
detected prefix libblas libblas.dylib /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
detected prefix libblas libblas.dylib /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
detected prefix libblas libblas.dylib /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
[]

@nilason
Copy link
Author

nilason commented Feb 7, 2024

Or do you mean all loaded libs at all?

@jeremiedbb
Copy link
Collaborator

jeremiedbb commented Feb 7, 2024

Then it means that python itself is linked against veclib (which is kind if odd). threadpoolctl just inspect the list of shared libs loaded by a python program (and importing a package like numpy can trigger the loading of new shared libs like blas) but won't try to load a lib that is not already loaded by the program.

@ogrisel
Copy link
Contributor

ogrisel commented Feb 9, 2024

note that this is not openblas but macos' veclib which we don't support because it doesn't expose an api to control the number of threads which is the purpose of threadpoolctl

Note that we will support it partially via flexiblas: you won't be able to inspect / set the number of threads but you should be able to see that it's the active backend of flexiblas. See also the related #135.

@ogrisel ogrisel changed the title Tests fails on Mac ARM Tests fails on Mac ARM and packages from macports Feb 9, 2024
@ogrisel
Copy link
Contributor

ogrisel commented Feb 13, 2024

Since #165 was merged, let's close this issue and a follow-up in #135 for the matter of detecting accelerate when installed with macports:

#162 (comment)

@nilason
Copy link
Author

nilason commented Feb 13, 2024

Thanks. The MP's Numpy seems to be faulty, I filed a ticket for that in https://trac.macports.org/ticket/69326.

@chetooo40
Copy link

otool -L /Users/ogrisel/mambaforge-disabled/envs/dev/lib/libblas.3.dylib
/Users/ogrisel/mambaforge-disabled/envs/dev/lib/libblas.3.dylib:
@rpath/libvecLibFort-ng.dylib (compatibility version 0.0.0, current version 0.0.0)
/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib (compatibility version 1.0.0, current version 1.0.0, reexport)
@rpath/liblapack-netlib.3.9.0.dylib (compatibility version 0.0.0, current version 0.0.0, reexport)
@rpath/liblapacke-netlib.3.9.0.dylib (compatibility version 0.0.0, current version 0.0.0, reexport)
/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1292.0.0)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants