WIP: meson on windows (clang-cl.exe) #246

Closed
wants to merge 56 commits into from

Conversation

h-vetinari
Member

@h-vetinari h-vetinari commented Sep 3, 2023

I want to say "it's finally happening", but it'll probably still be a decent chunk of work to get this off the ground...

At least we got conda-forge/flang-feedstock#28 in now. However, the binary is still named flang-new, not sure if meson can deal with this yet (upstream naming discussion in LLVM has more details; last blocker for rename has an RFC already...).

Then there's the question of mixing msvc, clang & flang, etc. etc...

@conda-forge-webservices

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@h-vetinari
Member Author

However, the binary is still named flang-new, not sure if meson can deal with this yet (upstream naming discussion in LLVM has more details; last blocker for rename has an RFC already...).

Yup, it seems that meson doesn't know flang-new:

..\meson.build:82:0: ERROR: Unknown compiler(s): [['ifort'], ['gfortran'], ['flang'], ['pgfortran'], ['g95']]
The following exception(s) were encountered:
Running `ifort --version` gave "[WinError 2] The system cannot find the file specified"
Running `ifort -V` gave "[WinError 2] The system cannot find the file specified"
Running `gfortran --version` gave "[WinError 2] The system cannot find the file specified"
Running `gfortran -V` gave "[WinError 2] The system cannot find the file specified"
Running `flang --version` gave "[WinError 2] The system cannot find the file specified"
Running `flang -V` gave "[WinError 2] The system cannot find the file specified"
[...]

@isuruf wanted to keep the name as-is for now. Given the indications that upstream flang will do the rename to flang soon-ish, I'm not sure that meson will want to wire up flang-new? Is there a way to hack around that @eli-schwartz? Perhaps we could do this on the meson feedstock (rather than upstream)?

Otherwise, I'd be fine with renaming flang-new -> flang within conda-forge. It's not like the "classic" flang and llvm-based flang could ever be installed into the same environment, so I don't see a big problem there.

@eli-schwartz

FC=flang-new meson setup builddir will override the default iterative probing of various possible compiler binaries, most of which you don't want. This works exactly like $CC or $CXX.

This is necessary if you have multiple compilers installed and you need to use a specific one and ensure the wrong one isn't used. ;) In this case it will just let you detect a compiler you might not otherwise have found at all.

This does also assume that meson knows how to identify it based on the --version output, so you'll need to try it and see.

I would prefer to avoid adding probes for an experimental-only name, but if it still doesn't work after overriding the compiler name then I'd be happy to help get that working.
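
The override behaviour described above can be sketched roughly as follows. This is not meson's actual implementation: `pick_compiler`, `DEFAULT_FORTRAN_CANDIDATES`, and the injectable `which` callable are hypothetical names used only for illustration, and the candidate list is simply copied from the error message earlier in this thread.

```python
import shutil

# Candidate names copied from the "Unknown compiler(s)" error above;
# the real probing order inside meson may differ (assumption).
DEFAULT_FORTRAN_CANDIDATES = ["ifort", "gfortran", "flang", "pgfortran", "g95"]


def pick_compiler(env, candidates=DEFAULT_FORTRAN_CANDIDATES, which=shutil.which):
    """Return the first compiler name that can be found, or None.

    If FC is set in `env`, only that name is tried and the default
    candidate list is skipped entirely.
    """
    override = env.get("FC")
    names = [override] if override else list(candidates)
    for name in names:
        # `which` returns None when the program is not on PATH.
        if which(name) is not None:
            return name
    return None
```

With `FC=flang-new` in the environment, `flang-new` is the only name looked up, so a compiler that is absent from the default list can still be found. Whether the compiler is then correctly identified from its `--version` output is a separate step that this sketch does not cover.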

@h-vetinari
Member Author

Thanks for the super quick response! 🙏
I'll try this tomorrow.

@h-vetinari
Member Author

This is gonna be a whack-a-mole for a while I think 🙈

..\meson.build:82:0: ERROR: Unable to detect linker for compiler `flang-new -Wl,--version`
stdout: LINK : warning LNK4044: unrecognized option '/-version'; ignored
LINK : warning LNK4001: no object files specified; libraries used
LINK : warning LNK4068: /MACHINE not specified; defaulting to X64
LINK : error LNK2001: unresolved external symbol mainCRTStartup
a.exe : fatal error LNK1120: 1 unresolved externals

I have to admit that the interplay between MSVC, clang & flang here is something that still confuses me.

  • clang has a msvc-compatible mode (and then uses MSVC's linker?)
  • flang is built on the clang-driver (so presumably could do the same?)
  • we're already mixing clang and MSVC here:

REM set compilers to clang-cl
set "CC=clang-cl"
set "CXX=clang-cl"

AFAICT that means we're doing compilation with clang, but linkage with MSVC's linker? Not sure how I'd be able to tell meson to check that, much less in a syntax that flang understands.

Sidenote: If it helps, I have no problem switching the linker over to lld; that should at least be the same stack as clang/flang.

@h-vetinari
Member Author

@eli-schwartz, could you maybe share your thoughts on how best to proceed here from your POV? :)

@h-vetinari
Member Author

h-vetinari commented Sep 15, 2023

OK, using clang [1] as the main compiler on windows allows us to get past the linker check and start compiling. However, we hit a problem pretty immediately:

[24/1628] Linking target scipy/_lib/_ccallback_c.cp39-win_amd64.pyd
FAILED: scipy/_lib/_ccallback_c.cp39-win_amd64.pyd 
"clang.exe"  -Wl,/OUT:scipy/_lib/_ccallback_c.cp39-win_amd64.pyd scipy/_lib/_ccallback_c.cp39-win_amd64.pyd.p/meson-generated__ccallback_c.c.obj "-Wl,/nologo" "-Wl,/OPT:REF" "-Wl,/DLL" "-Wl,/IMPLIB:scipy\_lib\_ccallback_c.cp39-win_amd64.lib" "-nostdlib" "-Xclang" "--dependent-lib=msvcrt" "-fuse-ld=lld" "-Wl,-defaultlib:%BUILD_PREFIX%/lib/clang/17.0.0/lib/windows/clang_rt.builtins-x86_64.lib" "-march=nocona" "-mtune=haswell" "-ftree-vectorize" "-fstack-protector-strong" "-O2" "-ffunction-sections" "-pipe" "-D_CRT_SECURE_NO_WARNINGS" "-D_MT" "-D_DLL" "-nostdlib" "-Xclang" "--dependent-lib=msvcrt" "-fuse-ld=lld" "-fno-aligned-allocation" "-Wl,--version-script=C:/bld/scipy-split_1694757427011/work/scipy/_build_utils/link-version-pyinit.map" "%PREFIX%\libs\python39.lib" "-lkernel32" "-luser32" "-lgdi32" "-lwinspool" "-lshell32" "-lole32" "-loleaut32" "-luuid" "-lcomdlg32" "-ladvapi32"
lld-link: warning: ignoring unknown argument '--version-script=C:/bld/scipy-split_1694757427011/work/scipy/_build_utils/link-version-pyinit.map'

lld-link: error: undefined symbol: __declspec(dllimport) strdup

strdup should be in the ucrt, and it's present in the host & build env, so I can only imagine that clang is not looking in the right place? Looking at the content, it only contains DLLs and no import libraries - I'm guessing it's possible that those would usually come with the compiler? But the azure image should still contain those... 🤔

Looking at the last successful build with MSVC for windows for that lib, we see that it's linking to:

   INFO (scipy,Lib/site-packages/scipy/_lib/_ccallback_c.cp39-win_amd64.pyd): Needed DSO python39.dll found in conda-forge::python-3.9.18-h4de0772_0_cpython
   INFO (scipy,Lib/site-packages/scipy/_lib/_ccallback_c.cp39-win_amd64.pyd): Needed DSO C:/Windows/System32/KERNEL32.dll found in $SYSROOT
   INFO (scipy,Lib/site-packages/scipy/_lib/_ccallback_c.cp39-win_amd64.pyd): Needed DSO C:/Windows/System32/VCRUNTIME140.dll found in $SYSROOT
   INFO (scipy,Lib/site-packages/scipy/_lib/_ccallback_c.cp39-win_amd64.pyd): Needed DSO C:/Windows/System32/downlevel/api-ms-win-crt-heap-l1-1-0.dll found in $SYSROOT
   INFO (scipy,Lib/site-packages/scipy/_lib/_ccallback_c.cp39-win_amd64.pyd): Needed DSO C:/Windows/System32/downlevel/api-ms-win-crt-runtime-l1-1-0.dll found in $SYSROOT
   INFO (scipy,Lib/site-packages/scipy/_lib/_ccallback_c.cp39-win_amd64.pyd): Needed DSO C:/Windows/System32/downlevel/api-ms-win-crt-string-l1-1-0.dll found in $SYSROOT
   INFO (scipy,Lib/site-packages/scipy/_lib/_ccallback_c.cp39-win_amd64.pyd): Needed DSO C:/Windows/System32/downlevel/api-ms-win-crt-math-l1-1-0.dll found in $SYSROOT

So it looks like something about the "sysroot" isn't working for our clang-on-win... @conda-forge/clang-win-activation

Footnotes

  1. homogenized the version with flang in https://github.com/conda-forge/clang-win-activation-feedstock/pull/23

@h-vetinari h-vetinari force-pushed the win_meson branch 2 times, most recently from 7d4bc48 to 8ba8310 Compare September 15, 2023 07:34
@h-vetinari
Member Author

h-vetinari commented Sep 16, 2023

Beyond the setup questions about clang_win-64, I wanted to try building scipy with clang+flang just to see how far flang would get at all. But I realized that we don't have clang_linux-64 (presumably it's too risky that feedstocks start compiling against it, with a different ABI & stdlib than all our other linux builds).

So I guess I'll retry building flang for osx-64, there at least we'd be able to test the clang+flang combo. And beyond that, I can see what happens with GCC+flang on linux here...

@h-vetinari h-vetinari force-pushed the win_meson branch 2 times, most recently from 05d338d to 5e5df36 Compare September 16, 2023 07:04
@h-vetinari
Member Author

Not sure what's happening, but I doubt the GCC+flang combination is well-supported, or even supported at all:

[10/1628] Compiling C object scipy/_lib/_test_ccallback.cpython-310-x86_64-linux-gnu.so.p/src__test_ccallback.c.o
x86_64-conda-linux-gnu-cc: warning: $PREFIX/include: linker input file unused because linking not done
x86_64-conda-linux-gnu-cc: warning: $PREFIX/include: linker input file unused because linking not done
x86_64-conda-linux-gnu-cc: warning: $PREFIX/include: linker input file unused because linking not done
x86_64-conda-linux-gnu-cc: warning: $PREFIX/include: linker input file unused because linking not done
[11/1628] Linking target scipy/_lib/_test_ccallback.cpython-310-x86_64-linux-gnu.so
FAILED: scipy/_lib/_test_ccallback.cpython-310-x86_64-linux-gnu.so 
$BUILD_PREFIX/bin/x86_64-conda-linux-gnu-cc  -o scipy/_lib/_test_ccallback.cpython-310-x86_64-linux-gnu.so scipy/_lib/_test_ccallback.cpython-310-x86_64-linux-gnu.so.p/src__test_ccallback.c.o -L$PREFIX/lib -Wl,--as-needed -Wl,--allow-shlib-undefined -Wl,-O1 -shared -fPIC -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,--allow-shlib-undefined -Wl,-rpath,$PREFIX/lib -Wl,-rpath-link,$PREFIX/lib -Wl,-O2 -Wl,--sort-common -Wl,--as-needed -Wl,-z,relro -Wl,-z,now -Wl,--disable-new-dtags -Wl,--gc-sections -Wl,--allow-shlib-undefined -Wl,-rpath,$PREFIX/lib -Wl,-rpath-link,$PREFIX/lib -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe $PREFIX/include -fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/scipy-split-1.11.2 -fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections $PREFIX/include -fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/scipy-split-1.11.2 -fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix -O2 $PREFIX/include -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 $PREFIX/include
$BUILD_PREFIX/bin/../lib/gcc/x86_64-conda-linux-gnu/12.3.0/../../../../x86_64-conda-linux-gnu/bin/ld: error: $PREFIX/include: read: Is a directory
collect2: error: ld returned 1 exit status

@eli-schwartz

eli-schwartz commented Sep 18, 2023

That looks suspiciously like something was supposed to normalize paths but is actually chopping off the -I part, and now the compiler thinks that an include headers path is actually an object file input.

otherwise, we're calling `shutil.which()` on the literal string
"%PREFIX%\Library\bin\lld-link.exe", which unsurprisingly fails.
@h-vetinari
Member Author

linker detection doesn't work correctly, currently needs FC_LD

I think this might be the root of the issues here. Even when I try to work around the non-functional linker detection for flang on windows by setting FC_LD=lld-link, I still end up with

Fortran compiler for the host machine: flang-new (flang 17.0.0 "flang-new version 17.0.0")
Fortran linker for the host machine: flang-new ld.lld 17.0.0
                                               ^^^^^^
                                               wrong!

This probably explains why the syntax for the linker is not the right one, and all the problems that follow from that.

I haven't managed to fully dig through the list of class inheritance to understand where the linker detection happens for flang, but if I understand the code for guess_win_linker correctly, it specifically looks for a string LLD in the compiler version output, but neither clang-cl nor flang-new have that:

(%BUILD_PREFIX%) %SRC_DIR%>clang-cl.exe --version 
clang version 17.0.0
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: %BUILD_PREFIX%\Library\bin

(%BUILD_PREFIX%) %SRC_DIR%>flang-new.exe --version 
flang-new version 17.0.0
Target: x86_64-pc-windows-msvc
Thread model: posix
InstalledDir: %BUILD_PREFIX%\Library\bin

Somehow, things work for clang (and lld-link is picked up); I'm not sure where the respective code is. If I set a full path like FC_LD=%FULL_PATH%\lld-link.exe, the code falls through (AFAICT) to ClangCompiler.use_linker_args and then GnuLikeCompiler.use_linker_args, where it fails with:

..\meson.build:82:0: ERROR: Unsupported linker, only bfd, gold, and lld are supported, not %PREFIX%\Library\bin\lld-link.exe.

I'll try overriding use_linker_args in FlangCompiler, but that doesn't seem to be where the unix-y flavour of lld is being picked up...?

@eli-schwartz
Copy link

See also the invoked_directly flag. On Windows, there are two types of compiler/linker setups:

  • the MSVC style, where you run the compiler program to produce objects, and then you run the linker program to combine objects into libraries/executables
  • the mingw style, where you run the compiler program to produce objects, and then you run the compiler program in "driver mode" to combine objects into libraries/executables

On Unix, only the latter exists. You can technically invoke /usr/bin/ld.bfd <args> to link programs together, but it is very arcane and requires a bunch of arguments the compiler driver usually handles correctly for you.

I haven't managed to fully dig through the list of class inheritance to understand where the linker detection happens for flang, but if I understand the code for guess_win_linker correctly, it specifically looks for a string LLD in the compiler version output, but neither clang-cl nor flang-new have that:

It looks for it first in the compiler version output, and in such a case handles the linker as llvm/clang-cl and expects the compiler itself to act as the driver.

If that fails, and guess_win_linker is invoked with invoked_directly, it looks up the linker value as a program binary, then uses that for further probes -- specifically, the first thing it does is go ahead and execute the linker directly this time, and check for 'LLD' in the output.
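
As a rough illustration of the two-stage probe described above (not meson's real code): `guess_windows_linker` and the `run_linker` callable are hypothetical names, and the sample linker output in the usage note is an assumed example rather than something taken from this thread.

```python
def guess_windows_linker(compiler_version_output, run_linker=None):
    """Hypothetical two-stage linker probe.

    Stage 1: look for 'LLD' in what the compiler driver printed. If it is
             there, the compiler itself is expected to act as the driver.
    Stage 2: only in the "invoked directly" case (modelled here by passing
             a `run_linker` callable), execute the linker binary itself
             and look for 'LLD' in its output.
    """
    if "LLD" in compiler_version_output:
        return "lld-link (via compiler driver)"
    if run_linker is not None and "LLD" in run_linker():
        return "lld-link (invoked directly)"
    return None
```

Feeding it the `flang-new --version` text quoted earlier (which contains no `LLD`) returns `None` unless a `run_linker` callable is supplied, for example `lambda: "LLD 17.0.0"`. That mirrors why a code path without the second stage cannot find lld-link.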

@h-vetinari
Member Author

See also the invoked_directly flag.

I think I see the issue now. detect_fortran_compiler only does guess_nix_linker for flang. That also explains how the C/C++ compilers pick up lld-link: because of the try-except with invoked_directly=False on windows.

@h-vetinari
Member Author

Woohoo 🥳

Fortran compiler for the host machine: flang-new (flang 17.0.0 "flang-new version 17.0.0")
Fortran linker for the host machine: flang-new lld-link 17.0.0

@h-vetinari
Member Author

Ah, the joy was short-lived. Now we have the rsp issue for flang as well:

[353/1628] Linking target scipy/linalg/_interpolative.cp311-win_amd64.pyd
FAILED: scipy/linalg/_interpolative.cp311-win_amd64.pyd 
"flang-new" @scipy/linalg/_interpolative.cp311-win_amd64.pyd.rsp
flang-new: error: no such file or directory: 'scipylib_fortranobject.a'
flang-new: error: no such file or directory: 'C:bldscipy-split_1695940085251_h_envlibspython311.lib'
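
The mangled paths in the errors above are consistent with the response file being tokenized under POSIX-style quoting rules, where a backslash escapes the character that follows it. That is an assumption about the cause, not something confirmed in this thread. The snippet below only reproduces the symptom with Python's `shlex`; the helper names and the shortened sample paths are hypothetical.

```python
import shlex


def posix_tokenize(line):
    """Split a response-file line the way a POSIX shell would."""
    return shlex.split(line, posix=True)


def with_forward_slashes(line):
    """Replace backslashes so POSIX-style tokenizing leaves paths intact."""
    return line.replace("\\", "/")


# Shortened stand-ins for the real arguments (hypothetical paths).
rsp_line = r"scipy\lib_fortranobject.a C:\bld\work\libs\python311.lib"

# Each backslash is consumed as an escape character, so the separators vanish.
mangled = posix_tokenize(rsp_line)

# With forward slashes there is nothing to consume.
intact = posix_tokenize(with_forward_slashes(rsp_line))

print(mangled)
print(intact)
```

If this is indeed what happens, writing forward slashes into the response file or avoiding the response file altogether would both sidestep the problem.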

@h-vetinari
Member Author

h-vetinari commented Sep 28, 2023

Aaaaand marking rsp as unsupported for lld-link still triggers the same:

[352/1628] Linking target scipy/linalg/_interpolative.cp311-win_amd64.pyd
FAILED: scipy/linalg/_interpolative.cp311-win_amd64.pyd 
"flang-new" @scipy/linalg/_interpolative.cp311-win_amd64.pyd.rsp
flang-new: error: no such file or directory: 'scipylib_fortranobject.a'

flang-new: error: no such file or directory: 'C:bldscipy-split_1695946340147_h_envlibspython311.lib'

😑

@h-vetinari
Member Author

@isuruf: You have to use the options as done in https://github.com/conda-forge/clang-win-activation-feedstock/blob/2f9e6d3774c9a702bc9b48ccc2ad83d0e7ea64b1/recipe/activate-clang_win-64.sh#L116

Particularly -nostdlib -Xclang --dependent-lib=msvcrt

From the meson logs:

clang-cl: warning: unknown argument ignored in clang-cl: '-nostdlib' [-Wunknown-argument]
clang-cl: warning: unknown argument ignored in clang-cl: '-fno-aligned-allocation' [-Wunknown-argument]
lld-link: warning: ignoring unknown argument '-Wl,-defaultlib:%PREFIX%\Library/lib/clang/17.0.0/lib/windows/clang_rt.builtins-x86_64.lib'

It looks like this affects regular C compilation as well:

[14/1628] Linking target scipy/_lib/_fpumode.cp311-win_amd64.pyd
lld-link: warning: ignoring unknown argument '--target=x86_64-pc-windows-msvc'
lld-link: warning: ignoring unknown argument '-nostdlib'
lld-link: warning: ignoring unknown argument '-Xclang'
lld-link: warning: ignoring unknown argument '--dependent-lib=msvcrt'
lld-link: warning: ignoring unknown argument '-fuse-ld=lld'
lld-link: warning: ignoring unknown argument '-Wl,-defaultlib:%PREFIX%\Library/lib/clang/17.0.0/lib/windows/clang_rt.builtins-x86_64.lib'

Could it be that we have to adapt the activation for windows to use clang-cl (& respective syntax) rather than clang?

Similarly, flang ignores --dependent-lib:

flang-new: warning: -Wl,-defaultlib:%PREFIX%\Library/lib/clang/17.0.0/lib/windows/clang_rt.builtins-x86_64.lib: 'linker' input unused
flang-new: warning: argument unused during compilation: '-Xclang --dependent-lib=msvcrt'

resp. bails when it's done through -Xflang:

error: unknown argument: '--dependent-lib=msvcrt'

What are your thoughts on dealing with this? Can we get away without using --dependent-lib for flang (probably means we'd be linking to compiler-rt, but then, we're already encouraging that in LDFLAGS anyway...)?

PS. I raised this a while ago upstream

@h-vetinari
Member Author

h-vetinari commented Sep 29, 2023

MESON_RSP_THRESHOLD not working as expected

OK, it does seem to work with a ridiculously high threshold (used 320'000 instead of 32'000).

That means we passed the first build & install! 🥳
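
For readers unfamiliar with the threshold workaround, here is a minimal sketch of the decision it influences. How meson actually measures the command length, and what its built-in default is, are assumptions here; `rsp_threshold` and `needs_response_file` are hypothetical helpers, and the two numbers are the ones mentioned in the comment above.

```python
def rsp_threshold(env, default):
    """Read MESON_RSP_THRESHOLD from `env`, falling back to `default`."""
    raw = env.get("MESON_RSP_THRESHOLD")
    return int(raw) if raw is not None else default


def needs_response_file(command, env, default=32000):
    """Decide whether a command is long enough to be moved into an .rsp file.

    The length is approximated as the space-joined argument string; the
    real measurement may differ (assumption).
    """
    return len(" ".join(command)) > rsp_threshold(env, default)


# A link line with many object files, comparable in spirit to the scipy build.
command = ["flang-new"] + ["obj%d.o" % i for i in range(5000)]

print(needs_response_file(command, {}))
print(needs_response_file(command, {"MESON_RSP_THRESHOLD": "320000"}))
```

Raising the threshold far enough keeps long link lines on the command line itself, which avoids the response-file parsing that flang-new tripped over.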

@h-vetinari
Member Author

IT'S GREEN!!! 🤩 🥳 🚀


@rgommers
Contributor

🚀 🎉
