Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[openblas/clapack] FindLapack/FindBLAS was not working. #6786

Merged
merged 21 commits into from
Jul 18, 2019

Conversation

Neumann-A
Copy link
Contributor

due to extra ascii symbol "%3b" in linker dependencies which is ";"
(Lists in IMPORTED_LOCATION do not work as expected)
This commit is based on CMakes FindLAPACK and adds f2c to be checked

due to extra ascii symbol "%3b" in linker dependencies which is ";"
(Lists in IMPORTED_LOCATION do not work as expected)
This commit is based on CMakes FindLAPACK and adds f2c to be checked
@cenit
Copy link
Contributor

cenit commented Jun 5, 2019

could you please tell me if it works better for you what's in #6730?
Because FindLAPACK and FindBLAS integrated in CMake are broken for vcpkg tbh... since they do not realise that LAPACK does not export BLAS and BLAS does not export LAPACK.
In that PR I am trying to fix a lot of problems for downstream projects, and in fact i had to rework a little bit the FindLAPACK.cmake that is in vcpkg atm

@cenit
Copy link
Contributor

cenit commented Jun 5, 2019

if it works, please let's work together there. We are wasting so much effort duplicating work for common broken things

@Neumann-A
Copy link
Contributor Author

@cenit :

since they do not realise that LAPACK does not export BLAS and BLAS does not export LAPACK.

I think you have something wrong here. CMakes LAPACK does export BLAS! From FindLAPACK.cmake:
set(${LIBRARIES} ${${LIBRARIES}} ${_blas} ${_threads})
Your are probably fooled by a port/package supplied FindLAPACK.cmake!

My diff to CMakes supplied module:
LAPACK_diff.txt

@cenit
Copy link
Contributor

cenit commented Jun 6, 2019

(I wrote the bugged FindLAPACK that you are trying to solve, sorry for the trouble)

I think you almost catched the problem. It works both way: also FindBLAS sets up LAPACK_LIBRARIES, pointing to the same BLAS libraries! Because the two libraries are really entangled! Just as an example, clapack contains a blas reference implementation and openblas contains a lapack reference implementation!!

But the problem here is that somehow it was decided to split them here instead of using one to do both (mostly due to the absence of a fortran reference implementation on vcpkg).

OpenBLAS builds only BLAS functions
CLapack builds only LAPACK functions

Technically LAPACK depends on BLAS behind the doors, while the other way is not true.

So if you trigger a default FindLAPACK.cmake or FindBLAS.cmake you find all libraries, while in reality half of them do not work here on vcpkg (on systems different than windows). The cmake cache mechanism also works against you, because if you search both libraries, the second one does nothing because cache is already filled and so you still get half of the cmake variables already filled with libraries that do not export the symbols that you need. So the mess, where in linux no port at the moment using lapack/blas really works. Note that they build (due to the static linking which does not check proper symbol resolution), but in the end they do not work! It was a nightmare at the beginning, but in that PR the lapack/blas mess is solved (and in the meantime I also improved the FindLAPACK that you see so that it should work also with VS generators).

@Neumann-A
Copy link
Contributor Author

@cenit

It works both way: also FindBLAS sets up LAPACK_LIBRARIES,

I don't see that in CMakes FindBLAS module. Maybe another port supplied FindBLAS ? So at least in CMake modules there is no cycle dependency between the find modules

@Neumann-A
Copy link
Contributor Author

@cenit Concerning BLAS/LAPACK libraries.
Do you have any knowledge why check_function_exists (sgemm_) in cmakes FindBLAS module do not work on Linux? (Same for cheev_ in lapack if forced to use pkgconfig for BLAS) (I only tested it in WSL so it might be a WSL issue; Have you done any tests on a normal linux?).
I have the feeling that the current BLAS and LAPACK libraries might be broken on Linux since the check fails. Maybe the packages assume a FORTRAN compiler on non WIN machines and thus the symbols do not work correctly or are undefined. (Currently compiling OpenBLAS with FORTRAN and LAPACK on WSL Linux from vcpkg)

@cenit
Copy link
Contributor

cenit commented Jun 6, 2019

Which kind of implementation are you trying? OpenBLAS and CLapack?
OpenBLAS is configured to export symbols with a trailing underscore in Windows but not on Linux and so it fails the CMake official module. Clapack expects them with a trailing underscore also on Linux and so it fails as a code, not related to any build toolchain. There’s a lot of literature about this trailing underscore and the linking of FORTRAN symbols in a C function...
As I told you, please let’s not duplicate efforts. In that pr it is already fixed. And it works... there’s so much work to do!

@cenit
Copy link
Contributor

cenit commented Jun 6, 2019

You had a small problem about a module I have written and never tested with vs generators. Official CMake modules fail for clapack and OpenBLAS, especially in the way vcpkg admins decided to use them. It is necessary to use custom module for lapack and self-configured config file for BLAS in order to be able to use them. IMHO no point at all in forcing a broken module by CMake with some additional (but still not sufficient) fixes!

@Neumann-A
Copy link
Contributor Author

@cenit

Clapack expects them with a trailing underscore

before i wrote my last comment I already changed the search name in FindBLAS to search without the trailing _ but the function check_function_exists also failed for sgemm. So today I investigated a bit deeper to find the real issue and found that FindBLAS is simply not adding a threads library to the call which is required for OpenBLAS if it detects more than 2 cores! (add NUM_THREADS=1 to configure step to disable the usage of threads ) If I add pthreads FindBLAS finds the libraries. Also the trailing _ should be added to openblas by simply adding "-DBU=_" to the configure step. (From the looks of it it is a bug in OpenBLAS because it sets BU itself but way to late in the CMakeLists file. )

TL;DR; OpenBLAS has bugs (setting BU too late?); FindBLAS also has bugs (missing thread library)

@Neumann-A
Copy link
Contributor Author

if (BLA_VENDOR STREQUAL "OpenBLAS" OR BLA_VENDOR STREQUAL "All")
  if(NOT BLAS_LIBRARIES)
    # OpenBLAS (http://www.openblas.net)
    check_fortran_libraries(
      BLAS_LIBRARIES
      BLAS
      sgemm
      ""
      "openblas"
      ""
      )
  endif()
endif ()

should be

if (BLA_VENDOR STREQUAL "OpenBLAS" OR BLA_VENDOR STREQUAL "All")
  if(NOT BLAS_LIBRARIES)
    # OpenBLAS (http://www.openblas.net)
    check_fortran_libraries(
      BLAS_LIBRARIES
      BLAS
      sgemm
      ""
      "openblas"
      "${CMAKE_THREAD_LIBS_INIT}"
      )
  endif()
endif ()

(probably true for all checked BLAS libraries; I'll think I will make a MR to CMake for this)

@Neumann-A
Copy link
Contributor Author

@ras0219-msft
Copy link
Contributor

Given the above, it seems like the right fix is then to:

  1. Always have trailing underscores via passing -DBU=_
  2. In the portfile for openblas, we download @Neumann-A's fixed FindBLAS.cmake file from [1] and deploy it into share/openblas/
  3. We add a share/blas/vcpkg-cmake-wrapper.cmake which checks CMAKE_VERSION and pulls in the fixed find script if the current version is version_less than (say) 3.16. We'll need to guess at what version the fixed script will be included into. The reason to check the cmake version is so that in the future, after CMake has merged the fix, if additional fixes are added to cmake (say to support new implementations for BLAS), we won't prevent users from taking advantage of those.

It looks like this will also have impacts on #6730.

[1] https://gitlab.kitware.com/cmake/cmake/raw/210dac76c9af7679d52d565731966bfdc084443e/Modules/FindBLAS.cmake

@Neumann-A
Copy link
Contributor Author

Neumann-A commented Jun 8, 2019

After thinking about it and reading the discussion in #6786 I thing the linux error was to use clapack to begin with.... a) it is outdated (see http://nicolas.limare.net/pro/notes/2014/10/31_cblas_clapack_lapacke/); b) it is no longer maintained. So the solution should be:
All current changes to LAPACK\OpenBLAS in PR #6730 should be isolated and made into their own PR

on windows: use openblas + clapack

on linux: only use openblas + builtin lapack. And make ports depending on clapack using !linux so that clapack will not be used under linux (@seanwarren)

(and maybe change openblas from cmake to configure scripts because i have the feeling that the cmakelists are not that well defined)

Other solution would be:
Build Ninja from source and hack the version nummer to include dyndep-1 (see ninja-build/ninja#1521 (comment)) to get FORTRAN support for Ninja in CMake. Make the installation of a Fortran compiler a requirement and then build either openblas + lapack or lapack as its own package/port.

Althoug I solved the FindBLAS problem I still have not solved the FindLAPACK problem:
current errors:

[ 50%] Linking CXX executable lapack_test
../vcpkg/installed/x64-linux/lib/liblapack.a(cheev.c.o): In function `cheev_':
cheev.c:(.text+0xe9): undefined reference to `xerbla_'
../vcpkg/installed/x64-linux/lib/liblapack.a(chetrd.c.o): In function `chetrd_':
chetrd.c:(.text+0xd1): undefined reference to `xerbla_'
../vcpkg/installed/x64-linux/lib/liblapack.a(clascl.c.o): In function `clascl_':
clascl.c:(.text+0xb5): undefined reference to `xerbla_'
../vcpkg/installed/x64-linux/lib/liblapack.a(csteqr.c.o): In function `csteqr_':
csteqr.c:(.text+0x60f): undefined reference to `xerbla_'
../vcpkg/installed/x64-linux/lib/liblapack.a(cungtr.c.o): In function `cungtr_':
cungtr.c:(.text+0xc8): undefined reference to `xerbla_'
../vcpkg/installed/x64-linux/lib/liblapack.a(slascl.c.o):slascl.c:(.text+0xc2): more undefined references to `xerbla_' follow

so have to check that symbol in openblas. It is there but missing the trailing _:
NM tells me:

xerbla.c.o:
0000000000000000 T __xerbla
0000000000000000 W xerbla

I'll try to figure that last one out so that we have a workaround for linux.
Fixed

@Neumann-A Neumann-A changed the title [clapack] FindLapack was not working for VS Generators [openblas/clapack] FindLapack was not working for VS Generators Jun 9, 2019
@Neumann-A Neumann-A changed the title [openblas/clapack] FindLapack was not working for VS Generators [openblas/clapack] FindLapack/FindBLAS was not working. Jun 9, 2019
@Neumann-A
Copy link
Contributor Author

@cbezault Since you merged #6730 without breaking it up into single PRs i will leave this one to you :P

@cenit
Copy link
Contributor

cenit commented Jun 21, 2019

Ok I restored your PR and merged with master. There were many nice improvements that can still apply! Let's just see regressions for now, then we can better decide

@dan-shaw dan-shaw merged commit db807ad into microsoft:master Jul 18, 2019
@dan-shaw
Copy link
Contributor

Thanks for the PR!

@Neumann-A Neumann-A deleted the improve_find_lapack branch August 7, 2019 07:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants