macOS linker warnings in macOS ventura #97524
Comments
CC @ambv @ned-deily |
Unfortunately this warning also propagates to all extension modules built with the given Python. |
Linking with libpython is also problematic because on macOS shared libraries are referenced by absolute path, which means linking with libpython ties the extension to a specific interpreter (e.g. it becomes impossible to build an extension using the Python.org installer and load it using a homebrew install). For the user this will result in seemingly random crashes (due to libpython being loaded twice while only one copy is initialised). I haven't spent time looking into this particular issue yet; with some luck there's another option beyond disabling chained lookups. |
FYI.. https://issues.guix.gnu.org/issue/57849 Looks like using -Wl,-w will help. |
That doesn't really help, it just suppresses the warning. |
You can pass |
It likely won't, because chained fixups are the new linker information for the dynamic symbol tables. So we will just be delaying the inevitable. |
Another workaround is to target an older deployment target (before 11.0), but that's also not a long term solution. |
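A minimal sketch of how a build script might apply the deployment-target workaround from the comment above. The helper names (`parse_target`, `targets_before_11`, `env_with_target`) are hypothetical, and the 11.0 threshold is taken from that comment rather than verified independently. `MACOSX_DEPLOYMENT_TARGET` is the environment variable Apple's toolchain reads for the deployment target.

```python
import os


def parse_target(value):
    """Turn a deployment target such as "10.15" into a (major, minor) tuple."""
    parts = [int(part) for part in value.split(".")]
    # Pad so that "11" compares equal to "11.0".
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts[:2])


def targets_before_11(value):
    """True if the target predates 11.0, the threshold quoted in this thread."""
    return parse_target(value) < (11, 0)


def env_with_target(target, base_env=None):
    """Return a copy of the environment with MACOSX_DEPLOYMENT_TARGET set."""
    env = dict(os.environ if base_env is None else base_env)
    env["MACOSX_DEPLOYMENT_TARGET"] = target
    return env
```

The returned mapping could be passed as `env=` to `subprocess.run` when invoking the compiler. As noted in the thread, this is a stopgap rather than a long-term fix.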
For static builds using This doesn't work with dynamic builds though, even when using Note that just linking extensions with libpython isn't a realistic option because that results in C extensions that target a specific python prefix and hence would lose interoperability for wheels built with our installer and homebrew. |
What does work for a dynamic build (again with a tweaked Something similar should also work for framework builds (but using a different reexport flag), but I haven't tested this yet. Mostly because I'm not happy with the link flags I've used for a shared build. All of this was tested by manual patching of the Makefile, I haven't looked into tweaking the configure script and makefile template yet. |
Hi — this issue has also been raised in the pybind11 and CMake projects, including a fix similar to what was suggested above by @ronaldoussoren. Links:
All of this raises a few questions (pybind/pybind11#4301 (comment)) that would be good to figure out together with somebody who is familiar with the intricacies of Darwin's dynamic linker. |
Hi @wjakob, We are currently waiting for some clarifications from Apple. Will follow up with anything we find. |
Anyone with Apple clout could push this PR into their attention sphere: apple-opensource/ld64#1 It basically says "here's a library; make any symbol that is resolved here |
Correct me if I'm wrong, but it seems to me like this PR targets a different use case: it enables While the issue discussed here is that the very mechanism that drives |
One more data point in case it is useful: explicitly enumerating all relevant Python symbols using the
* lisp/emacs-lisp/comp.el (native-comp-driver-options): Add "-Wl,-w" on Darwin systems. * etc/NEWS: Describe change.
See comment in body for details on why and how it can be replaced. The python community has been running into the same issues because cpython extensions operate very similarly to ruby. They got an explanation from Apple on some of how the changes work which can be found here and gives some additional useful context for understanding this change: python/cpython#97524 (comment)
…ace” error and ld: warning: -undefined dynamic_lookup may not work with chained fixups. python/cpython#97524 (comment) Ronald Oussoren > Another workaround is to target an older deployment target (before 11.0), but that's also not a long term solution. This works and seems needed on my ARM M1 MacOS 12.7 21G816, when compiling with either Apple clang-1400.0.29.202 or Homebrew clang version 16.0.5. The discussion there suggests that the Clang in Xcode 14.3 might fix this, but that won’t run on my MacOS.
* Example for Mac with the new pass manager and a real file. * Troubleshooting suggestion for Mac dynamic_lookup/ chained fixups linker warning and subsequent “symbol not found in flat namespace” crash. Thanks to Ronald Oussoren (python/cpython#97524 (comment)). This works and seems needed on my ARM M1 MacOS 12.7 21G816, when compiling with either Apple clang-1400.0.29.202 or Homebrew clang version 16.0.5. The discussion there suggests that the Clang in Xcode 14.3 might fix this, but that won’t run on my MacOS. I’m not submitting my change to the CMake file (FlashSheridan@7c642778), since it presumably isn’t affecting most users and might be a performance hit. Feel free to pull it if you disagree.
Recent versions of MacOS use a version of ld where `-fixup_chains` is on by default. This is incompatible with our usage of `-undefined dynamic_lookup`. Therefore we explicitly disable `fixup-chains` by passing `-no_fixup_chains` to the linker on darwin. This results in a warning of the form: ld: warning: -undefined dynamic_lookup may not work with chained fixups The manual explains the incompatible nature of these two flags: -undefined treatment Specifies how undefined symbols are to be treated. Options are: error, warning, suppress, or dynamic_lookup. The default is error. Note: dynamic_lookup that depends on lazy binding will not work with chained fixups. A relevant ticket is #22429 Here are also a few other links which are relevant to the issue: Official comment: https://developer.apple.com/forums/thread/719961 More relevant links: https://openradar.appspot.com/radar?id=5536824084660224 python/cpython#97524 Note in release notes: https://developer.apple.com/documentation/xcode-release-notes/xcode-13-release-notes (cherry picked from commit 8c0ea25)
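As a rough illustration of the two options discussed in this thread, the sketch below picks extra linker arguments for a build script. Both flags are quoted from comments here; the function name `darwin_link_args` and its parameters are hypothetical, and whether a given toolchain accepts `-no_fixup_chains` is not checked.

```python
import sys

# Both flags are quoted from this thread. Which one is appropriate depends on
# the toolchain, so treat this as a sketch rather than a recommendation.
NO_FIXUP_CHAINS = "-Wl,-no_fixup_chains"  # asks the linker to disable chained fixups
SUPPRESS_WARNINGS = "-Wl,-w"              # only hides linker warnings


def darwin_link_args(platform=None, disable_chained_fixups=True):
    """Return extra link arguments for macOS, or an empty list elsewhere."""
    platform = sys.platform if platform is None else platform
    if platform != "darwin":
        return []
    if disable_chained_fixups:
        return [NO_FIXUP_CHAINS]
    return [SUPPRESS_WARNINGS]
```

In a setuptools-based project the result could be passed as `extra_link_args` to an `Extension`. Note that the `-Wl,-w` variant leaves the underlying incompatibility in place, as pointed out earlier in the thread.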
With some luck most build systems would pick up flags from #103306 contains a related discussion about using |
Generating the file should be easy enough, e.g.:
libpython.sym: $(LDLIBRARY)
	nm $(LDLIBRARY) | grep ' T ' | awk '{ nm = substr($$3, 2); print("-U ", nm) }' > libpython.sym
I have not yet tried to actually use this though, let alone thought about the correct way to integrate this into the installation and build system. This will list more symbols than are technically part of the API, but keeps us closer to the current linking behaviour w.r.t. using private but exported symbols. That's not ideal, but does avoid bug reports about link errors on macOS that don't happen on other unix-y platforms. |
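For readers who find the `nm | grep | awk` pipeline hard to follow, here is a sketch of the same filtering in Python, operating on text that has already been captured from `nm`. The function name `undefined_symbol_flags` and the sample input are illustrative assumptions. Like the original rule, this is untested against a real link step, and whether the linker wants the name with or without the leading underscore is not verified here.

```python
def undefined_symbol_flags(nm_output):
    """Build one "-U name" entry per exported text symbol in nm output."""
    entries = []
    for line in nm_output.splitlines():
        # Same filter as grep ' T ': keep global symbols in the text section.
        if " T " not in line:
            continue
        fields = line.split()
        if len(fields) < 3:
            continue
        # substr($3, 2) in the awk script drops the first character of the
        # third field, i.e. the leading underscore of a Mach-O C symbol.
        name = fields[2][1:]
        # awk's print with a comma adds an extra space; one space is used here.
        entries.append("-U " + name)
    return entries


SAMPLE = """\
0000000000001000 T _PyObject_Call
0000000000002000 t _local_helper
                 U _malloc
0000000000003000 T _Py_Initialize
"""

print("\n".join(undefined_symbol_flags(SAMPLE)))
```

Only the two `T` entries survive the filter; the local (`t`) and undefined (`U`) symbols are skipped.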
I submitted an upstream patch to GNU libtool, to add This would mean that packages wanting to use chained fixups would need to override this new libtool default. |
Sorry, I meant to include that link: |
This PR fixes two subtle, related issues that are blocking updates from going through downstream in the Kasmer project. At a high level, the issues are:

- Flat namespace linking on macOS produces incorrect symbol lookups in dynamic libraries.
- #1097 misses a subtle edge case related to tail-call optimisation.

The actual code changes required are small, but warrant some detailed explanation.

## Flat Namespaces

For a long time, macOS has implemented a system known as _two-level_ namespaces, whereby undefined symbol names in a dynamic library are prefixed with the name of the library in which the loader expects to be able to find them at run-time. This is a conservative behaviour; even if a symbol with the same name exists in a different library, it won't be selected. For example, the dynamic libraries built by `llvm-kompile` in `c` mode link against `libgmp`. Two-level namespaces produce dynamic symbol tables that look like:

```console
$ dyld_info test/c/Output/flat-namespace.kore.tmp.dir/libtest.so -symbolic_fixups | grep gmpz_clear
    +0x2B28 bind pointer libgmp.10.dylib/___gmpz_clear
```

This behaviour is different to Linux, which does not have a notion of two-level namespaces. For legacy compatibility purposes, Apple supply a linker flag `-flat_namespace` that behaves more similarly to Linux behaviour. Its use is discouraged in new code, but we had enabled it to work around an issue in the Python bindings (python/cpython#97524) that should be fixed in a future CPython / macOS combination.[^1] When enabled, the symbol table looks something like this for the same example:

```console
$ dyld_info test/c/Output/flat-namespace.kore.tmp.dir/libtest.so -symbolic_fixups | grep gmpz_clear
    +0x2EE8 bind pointer flat-namespace/___gmpz_clear
```

As a consequence of this, if the symbol `___gmpz_clear` exists in multiple dynamic libraries loaded by the same process, then the order in which they will be selected by the dynamic loader is not clearly well-defined,[^2] and when it's referenced we could end up loading either the correct or the incorrect symbol. This caused the initial bug observed as follows:[^3]

- The Haskell backend statically links the `kore-rpc-booster` executable against `libgmp`, meaning that some GMP symbols appear in that binary.
- The backend compiles shared libraries that dynamically link against `libgmp`.
- `kore-rpc-booster` dynamically loads one of these libraries, and when resolving symbols to load, the flat namespace environment selects the static version for some and the dynamic version for others.
- A call to `__gmpz_clear` from a backend hook ends up referencing the statically linked symbol, rather than the dynamically linked version. Generally, I think this situation is harmless - GMP is very stable and it's plausible that doing this for most symbols is not observable.
- However, the dynamically-linked GMP library has been set up to use the KORE memory management functions. When the static version is called, it tries to `free()` a pointer allocated by the backend's GC, and crashes.

The fix for this issue is to drop our usage of `-flat_namespace` for C shared libraries compiled by the backend. This breaks a few places we were relying on the old (incorrect) behaviour in the presence of C++ RTTI; having multiple instances of identically-named typeinfo symbols in a process is known to be broken there:

- `libunwind` is actually implicitly linked via the macOS system library; if we explicitly link it as well, then code that handles exceptions will break.
- The `k-rule-apply` tool linked two copies of the KORE AST library, causing `dynamic_cast` to break. #1110 addresses this.

## Tail-Call Optimisation

In #1097, we made some changes that explicitly mark K functions as `musttail` when we know they're tail recursive. In doing so, we removed the need to use the `-tailcallopt` flag in most cases. However, the change in that PR missed that as well as IR-level transformations, `-tailcallopt` sets a lower-level flag in the backend[^4] code generator that guarantees tail-call code generation. For large programs, this meant I could observe stack overflows when traversing large terms.

The fix is just to enforce that this internal option gets set properly; doing so is just a restoration of the behaviour we got from `-tailcallopt` before.

[^1]: But isn't yet fixed, unfortunately - the underlying bug is still present on my system. Should be revisited in the future, ideally!
[^2]: It might be defined somewhere, but the initial manifestation of this bug appeared in an apparently unrelated commit, so I think we were just getting lucky previously. The fix in this PR is morally correct whether or not things worked accidentally beforehand.
[^3]: I intend to write this up fully later in a separate issue.
[^4]: As in the X86 or arm backend of LLVM itself.
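The two `dyld_info` lines quoted in the description above differ only in the prefix before the slash, which suggests a simple way to audit a library for flat-namespace bindings. The sketch below works on captured output text. The function names are hypothetical, and the line format is inferred solely from those two sample lines, so other `dyld_info` output shapes may need different handling.

```python
def split_bind_target(line):
    """Split a bind line into (library, symbol) using the last field."""
    target = line.split()[-1]
    library, _, symbol = target.partition("/")
    return library, symbol


def flat_namespace_symbols(output):
    """Return symbols whose bind target uses the flat-namespace prefix."""
    symbols = []
    for line in output.splitlines():
        if " bind " not in line:
            continue
        library, symbol = split_bind_target(line)
        if library == "flat-namespace":
            symbols.append(symbol)
    return symbols


# The two lines below are copied from the description above.
TWO_LEVEL = "+0x2B28 bind pointer libgmp.10.dylib/___gmpz_clear"
FLAT = "+0x2EE8 bind pointer flat-namespace/___gmpz_clear"

print(split_bind_target(TWO_LEVEL))
print(flat_namespace_symbols(TWO_LEVEL + "\n" + FLAT))
```

A symbol that appears in the second list is one the loader may resolve from any image in the process, which is the situation that led to the GMP crash described above.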
- don't use a bundle loader (esp. for installed GAP)
- don't use flat namespace, instead use dynamic lookup
- use `-Wl,-no_fixup_chains` to avoid warnings (and resolve potential issues) with certain Xcode versions, see <python/cpython#97524>
Turns out `ld` shipped with Xcode 14+ emits a warning when using `-undefined dynamic_lookup`.
Some investigation reveals that in fact `-undefined dynamic_lookup` doesn't work when:
This is because `-undefined dynamic_lookup` uses lazy bindings, and they won't be bound while dyld fixes up by traversing chained-fixup info.
However, we build extension modules as bundles (`-bundle`) and they are loaded only through `dlopen`, so it's safe to use `-undefined dynamic_lookup` in theory. So the warning produced by ld64 is likely to be a false positive.
`ld64` also provides the `-bundle_loader <executable>` option, which allows resolving symbols defined in the executable symbol table while linking. It behaves almost the same as `-undefined dynamic_lookup`, but it makes the following changes:
`-undefined dynamic_lookup` lookups all symbol tables as flat namespace)
See the "New Features" subsection under the "Linking" section for chained fixups in developer.apple.com/documentation/xcode-release-notes/xcode-13-release-notes for more information.
Also, check the same problem in ruby where most of this information is taken from.
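To see whether a given interpreter is affected, one option is to inspect the link command CPython records for building extension modules. The sketch below is an assumption-laden check, not part of CPython: `uses_dynamic_lookup` and `interpreter_uses_dynamic_lookup` are hypothetical helper names, while `sysconfig.get_config_var("LDSHARED")` is the standard-library call used to read the recorded command.

```python
import sysconfig


def uses_dynamic_lookup(link_command):
    """True if the command asks for `-undefined dynamic_lookup`."""
    # Form passed through the compiler driver as a single -Wl argument.
    if "-undefined,dynamic_lookup" in link_command:
        return True
    # Form passed as two adjacent arguments.
    args = link_command.split()
    return any(
        first == "-undefined" and second == "dynamic_lookup"
        for first, second in zip(args, args[1:])
    )


def interpreter_uses_dynamic_lookup():
    """Check the running interpreter's recorded extension link command."""
    # LDSHARED can be None on platforms that do not define it.
    return uses_dynamic_lookup(sysconfig.get_config_var("LDSHARED") or "")
```

On a typical Linux build this is expected to return `False`; on macOS builds that link extensions the way this issue describes, it should return `True`.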