Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
This PR fixes two subtle, related issues that are blocking updates from going through downstream in the Kasmer project. At a high level, the issues are: - Flat namespace linking on macOS produces incorrect symbol lookups in dynamic libraries. - #1097 misses a subtle edge case related to tail-call optimisation. The actual code changes required are small, but warrant some detailed explanation. ## Flat Namespaces For a long time, macOS has implemented a system known as _two-level_ namespaces, whereby undefined symbol names in a dynamic library are prefixed with the name of the library in which the loader expects to be able to find them at run-time. This is a conservative behaviour; even if a symbol with the same name exists in a different library, it won't be selected. For example, the dynamic libraries built by `llvm-kompile` in `c` mode link against `libgmp`. Two-level namespaces produce dynamic symbol tables that look like: ```console $ dyld_info test/c/Output/flat-namespace.kore.tmp.dir/libtest.so -symbolic_fixups | grep gmpz_clear +0x2B28 bind pointer libgmp.10.dylib/___gmpz_clear ``` This behaviour is different to Linux, which does not have a notion of two-level namespaces. For legacy compatibility purposes, Apple supply a linker flag `-flat_namespace` that behaves more similarly to Linux behaviour. Its use is discouraged in new code, but we had enabled it to work around an issue in the Python bindings (python/cpython#97524) that should be fixed in a future CPython / macOS combination.[^1] When enabled, the symbol table looks something like this for the same example: ```console $ dyld_info test/c/Output/flat-namespace.kore.tmp.dir/libtest.so -symbolic_fixups | grep gmpz_clear +0x2EE8 bind pointer flat-namespace/___gmpz_clear ``` As a consequence of this, if the symbol `___gmpz_clear` exists in multiple dynamic libraries loaded by the same process, then the order in which they will be selected by the dynamic loader is not clearly well-defined,[^2] and when it's referenced we could end up loading either the correct or the incorrect symbol. This caused the initial bug observed as follows:[^3] - The Haskell backend statically links the `kore-rpc-booster` executable against `libgmp`, meaning that some GMP symbols appear in that binary. - The backend compiles shared libraries that dynamically link against `libgmp`. - `kore-rpc-booster` dynamically loads one of these libraries, and when resolving symbols to load, the flat namespace environment selects the static version for some and the dynamic version for others. - A call to `__gmpz_clear` from a backend hook ends up referencing the statically linked symbol, rather than the dynamically linked version. Generally, I think this situation is harmless - GMP is very stable and it's plausible that doing this for most symbols is not observable. - However, the dynamically-linked GMP library has been set up to use the KORE memory management functions. When the static version is called, it tries to `free()` a pointer allocated by the backend's GC, and crashes. The fix for this issue is to drop our usage of `-flat_namespace` for C shared libraries compiled by the backend. This breaks a few places we were relying on the old (incorrect) behaviour in the presence of C++ RTTI; having multiple instances of identically-named typeinfo symbols in a process is known to be broken there: - `libunwind` is actually implicitly linked via the macOS system library; if we explicitly link it as well, then code that handles exceptions will break. - The `k-rule-apply` tool linked two copies of the KORE AST library, causing `dynamic_cast` to break. #1110 addresses this. ## Tail-Call Optimisation In #1097, we made some changes that explicitly mark K functions as `musttail` when we know they're tail recursive. In doing so, we removed the need to use the `-tailcallopt` flag in most cases. However, the change in that PR missed that as well as IR-level transformations, `-tailcallopt` sets a lower-level flag in the backend[^4] code generator that guarantees tail-call code generation. For large programs, this meant I could observe stack overflows when traversing large terms. The fix is just to enforce that this internal option gets set properly; doing so is just a restoration of the behaviour we got from `-tailcallopt` before. [^1]: But isn't yet fixed, unfortunately - the underlying bug is still present on my system. Should be revisited in the future, ideally! [^2]: It might be defined somewhere, but the initial manifestation of this bug appeared in an apparently unrelated commit, so I think we were just getting lucky previously. The fix in this PR is morally correct whether or not things worked accidentally beforehand. [^3]: I intend to write this up fully later in a separate issue. [^4]: As in the X86 or arm backend of LLVM itself.
- Loading branch information