Skip to content

Commit

Permalink
Drop flat namespace option (#1109)
Browse files Browse the repository at this point in the history
This PR fixes two subtle, related issues that are blocking updates from
going through downstream in the Kasmer project. At a high level, the
issues are:
- Flat namespace linking on macOS produces incorrect symbol lookups in
dynamic libraries.
- #1097 misses a
subtle edge case related to tail-call optimisation.

The actual code changes required are small, but warrant some detailed
explanation.

## Flat Namespaces

For a long time, macOS has implemented a system known as _two-level_
namespaces, whereby undefined symbol names in a dynamic library are
prefixed with the name of the library in which the loader expects to be
able to find them at run-time. This is a conservative behaviour; even if
a symbol with the same name exists in a different library, it won't be
selected. For example, the dynamic libraries built by `llvm-kompile` in
`c` mode link against `libgmp`. Two-level namespaces produce dynamic
symbol tables that look like:
```console
$ dyld_info test/c/Output/flat-namespace.kore.tmp.dir/libtest.so -symbolic_fixups | grep gmpz_clear
           +0x2B28      bind pointer   libgmp.10.dylib/___gmpz_clear
```

This behaviour is different to Linux, which does not have a notion of
two-level namespaces. For legacy compatibility purposes, Apple supply a
linker flag `-flat_namespace` that behaves more similarly to Linux
behaviour. Its use is discouraged in new code, but we had enabled it to
work around an issue in the Python bindings
(python/cpython#97524) that should be fixed in
a future CPython / macOS combination.[^1] When enabled, the symbol table
looks something like this for the same example:
```console
$ dyld_info test/c/Output/flat-namespace.kore.tmp.dir/libtest.so -symbolic_fixups | grep gmpz_clear
           +0x2EE8      bind pointer   flat-namespace/___gmpz_clear
```

As a consequence of this, if the symbol `___gmpz_clear` exists in
multiple dynamic libraries loaded by the same process, then the order in
which they will be selected by the dynamic loader is not clearly
well-defined,[^2] and when it's referenced we could end up loading
either the correct or the incorrect symbol. This caused the initial bug
observed as follows:[^3]
- The Haskell backend statically links the `kore-rpc-booster` executable
against `libgmp`, meaning that some GMP symbols appear in that binary.
- The backend compiles shared libraries that dynamically link against
`libgmp`.
- `kore-rpc-booster` dynamically loads one of these libraries, and when
resolving symbols to load, the flat namespace environment selects the
static version for some and the dynamic version for others.
- A call to `__gmpz_clear` from a backend hook ends up referencing the
statically linked symbol, rather than the dynamically linked version.
Generally, I think this situation is harmless - GMP is very stable and
it's plausible that doing this for most symbols is not observable.
- However, the dynamically-linked GMP library has been set up to use the
KORE memory management functions. When the static version is called, it
tries to `free()` a pointer allocated by the backend's GC, and crashes.

The fix for this issue is to drop our usage of `-flat_namespace` for C
shared libraries compiled by the backend. This breaks a few places we
were relying on the old (incorrect) behaviour in the presence of C++
RTTI; having multiple instances of identically-named typeinfo symbols in
a process is known to be broken there:
- `libunwind` is actually implicitly linked via the macOS system
library; if we explicitly link it as well, then code that handles
exceptions will break.
- The `k-rule-apply` tool linked two copies of the KORE AST library,
causing `dynamic_cast` to break. #1110 addresses this.

## Tail-Call Optimisation

In #1097, we made some changes that explicitly mark K functions as
`musttail` when we know they're tail recursive. In doing so, we removed
the need to use the `-tailcallopt` flag in most cases. However, the
change in that PR missed that as well as IR-level transformations,
`-tailcallopt` sets a lower-level flag in the backend[^4] code generator
that guarantees tail-call code generation. For large programs, this
meant I could observe stack overflows when traversing large terms.

The fix is just to enforce that this internal option gets set properly;
doing so is just a restoration of the behaviour we got from
`-tailcallopt` before.

[^1]: But isn't yet fixed, unfortunately - the underlying bug is still
present on my system. Should be revisited in the future, ideally!
[^2]: It might be defined somewhere, but the initial manifestation of
this bug appeared in an apparently unrelated commit, so I think we were
just getting lucky previously. The fix in this PR is morally correct
whether or not things worked accidentally beforehand.
[^3]: I intend to write this up fully later in a separate issue.
[^4]: As in the X86 or arm backend of LLVM itself.
  • Loading branch information
Baltoli authored Jul 19, 2024
1 parent feb291b commit 0399026
Show file tree
Hide file tree
Showing 6 changed files with 6,602 additions and 6 deletions.
26 changes: 21 additions & 5 deletions bin/llvm-kompile-clang
Original file line number Diff line number Diff line change
Expand Up @@ -136,7 +136,6 @@ if [[ "$OSTYPE" == "darwin"* ]]; then
flags=(
"-L@BREW_PREFIX@/opt/libffi/lib"
"-L@BREW_PREFIX@/lib"
"-L@LLVM_LIBRARY_DIR@"
"-Wl,-u,_table_getArgumentSortsForTag"
"-I" "@BREW_PREFIX@/include"
)
Expand All @@ -158,6 +157,19 @@ else
set_visibility_hidden="$LIBDIR/libSetVisibilityHidden.so"
fi

# On macOS, we get libunwind supplied as part of the developer tools in the OS,
# and so don't need to link it directly. If we instead try to explictly link
# against the libunwind that's part of Homebrew-supplied LLVM, it's easy to end
# up in a situation where exceptions thrown in a shared library are not
# compatible with the unwinding machinery in the main application binary. This
# then manifests as BAD_ACCESS errors (or similar) when an exception is thrown,
# _even if the exception should in principle be caught_.
if [[ "$OSTYPE" == "darwin"* ]]; then
libunwind=""
else
libunwind="-lunwind"
fi

# When building the Python AST module, there's no runtime and no main file, so
# we skip this entire step. The library code is just C++, so we can skip
# straight to invoking the C++ compiler.
Expand Down Expand Up @@ -207,7 +219,7 @@ if [[ "$OSTYPE" == "darwin"* ]]; then
start_whole_archive="-force_load"
end_whole_archive=""

flags+=("-Wl,-flat_namespace" "-Wl,-undefined" "-Wl,dynamic_lookup")
flags+=("-Wl,-undefined" "-Wl,dynamic_lookup")
else
start_whole_archive="-Wl,--whole-archive"
end_whole_archive="-Wl,--no-whole-archive"
Expand All @@ -218,9 +230,13 @@ if [ "$main" = "static" ]; then
elif [[ "$main" =~ "python" ]]; then
# Don't link jemalloc when building a python library; it clashes with the
# pymalloc implementation that Python expects you to use.
all_libraries=("${libraries[@]}" "-lgmp" "-lgmpxx" "-lmpfr" "-lpthread" "-ldl" "-lffi" "-lunwind")
all_libraries=("${libraries[@]}" "-lgmp" "-lgmpxx" "-lmpfr" "-lpthread" "-ldl" "-lffi" "$libunwind")
flags+=("-fPIC" "-shared" "-I${INCDIR}" "-fvisibility=hidden")

if [[ "$OSTYPE" == "darwin"* ]]; then
flags+=("-Wl,-flat_namespace")
fi

read -r -a python_include_flags <<< "$("${python_cmd}" -m pybind11 --includes)"
flags+=("${python_include_flags[@]}")

Expand All @@ -240,11 +256,11 @@ elif [ "$main" = "c" ]; then

# Avoid jemalloc for similar reasons as Python; we don't know who is loading
# this library so don't want to impose it.
all_libraries=("${libraries[@]}" "-lgmp" "-lgmpxx" "-lmpfr" "-lpthread" "-ldl" "-lffi" "-lunwind")
all_libraries=("${libraries[@]}" "-lgmp" "-lgmpxx" "-lmpfr" "-lpthread" "-ldl" "-lffi" "$libunwind")
flags+=("-fPIC" "-shared" "$start_whole_archive" "$LIBDIR/libkllvmcruntime.a" "$end_whole_archive")
clangpp_args+=("-o" "${output_file}")
else
all_libraries=("${libraries[@]}" "-lgmp" "-lgmpxx" "-lmpfr" "-lpthread" "-ldl" "-lffi" "-ljemalloc" "-lunwind")
all_libraries=("${libraries[@]}" "-lgmp" "-lgmpxx" "-lmpfr" "-lpthread" "-ldl" "-lffi" "-ljemalloc" "$libunwind")
fi

if $link; then
Expand Down
1 change: 1 addition & 0 deletions lib/codegen/ApplyPasses.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -123,6 +123,7 @@ void generate_object_file(llvm::Module &mod, llvm::raw_ostream &os) {

auto features_string = features.getString();
auto options = TargetOptions{};
options.GuaranteedTailCallOpt = true;

#if LLVM_VERSION_MAJOR >= 16
std::optional<CodeModel::Model> model = std::nullopt;
Expand Down
2 changes: 1 addition & 1 deletion nix/llvm-backend.nix
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ stdenv.mkDerivation {
--replace '"-liconv"' '"-L${libiconv}/lib" "-liconv"' \
--replace '"-lncurses"' '"-L${ncurses}/lib" "-lncurses"' \
--replace '"-ltinfo"' '"-L${ncurses}/lib" "-ltinfo"' \
--replace '"-lunwind"' '"-L${libunwind}/lib" "-lunwind"' \
--replace '"$libunwind"' '"-L${libunwind}/lib" "-lunwind"' \
--replace '"-L@BREW_PREFIX@/opt/libffi/lib"' ' ' \
--replace '"-L@LLVM_LIBRARY_DIR@"' ' ' \
--replace '-L@BREW_PREFIX@/lib' '-L${libcxx}/lib' \
Expand Down
40 changes: 40 additions & 0 deletions test/c/Inputs/flat-namespace.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,40 @@
#include "api.h"

#include <stdio.h>
#include <stdlib.h>

/**
* The K program corresponding to this test will evaluate the `Int2Bytes` hooked
* function when it is run; this hook ends up allocating and freeing a temporary
* MPZ integer locally. If we're in a situation where `-flat_namespace` has been
* enabled on macOS, it's possible for the hook to end up resolving
* `__gmpz_clear` to a symbol defined in the host binary (rather than libgmp's
* dynamic library!).
*
* This originally manifested when the HB booster loaded a C bindings library
* (the booster statically links libgmp and so contains a symbol with this
* name); this test is a minimised reproduction of that issue.
*/
void __gmpz_clear(void *p) {
abort();
}

int main(int argc, char **argv) {
if (argc <= 1) {
return 1;
}

struct kllvm_c_api api = load_c_api(argv[1]);

api.kllvm_init();

kore_sort *sort_foo = api.kore_composite_sort_new("SortFoo");
kore_pattern *pat = api.kore_composite_pattern_new("Lblfoo");

kore_pattern *input = api.kore_pattern_make_interpreter_input(pat, sort_foo);

block *term = api.kore_pattern_construct(input);
block *after = api.take_steps(-1, term);

printf("%s", api.kore_block_dump(after));
}
Loading

0 comments on commit 0399026

Please sign in to comment.