postgresql11: preliminary JIT support
This adds preliminary support for JIT'ing SQL queries via LLVM to PostgreSQL, enabled on versions 11+, on Linux.

The default stdenv of PostgreSQL itself and all exposed sub-packages are overridden to use a particular major version of llvmPackages.stdenv so that 'clang' is chosen as the default compiler. Currently, we use llvmPackages_6. To do this, we expose another component from postgresqlXXPackages -- a new .stdenv attribute, which exposes which stdenv PostgreSQL was built with.

PostgreSQL JIT support works by compiling all of the source code for the system into LLVM bitcode, and shipping this as part of the binary distribution (under the .lib output, in our case, $lib/lib/bitcode/). At runtime, Postgres uses the LLVM API in order to load this bitcode and JIT queries directly against the source code. This allows inlining database code into queries, in particular inlining operator and expression definitions for custom types, in any extensions. This feature is enabled via the '--with-llvm' flag during configurePhase.

This feature is integrated with 'PGXS', the Makefile infrastructure for writing and distributing Postgres extensions. If an extension is being compiled against a version of PostgreSQL that has LLVM JIT support, then the extension code is *also* compiled to bitcode and distributed transparently in the binary output, alongside the extension shared object.

In order to ensure consistent bitcode is emitted, when Postgres is compiled with LLVM support, it *must* be compiled with Clang (hence the stdenv choice), and it also *must* embed the copy/path of Clang into the resulting binaries, so PGXS can use the exact same compiler later on during 3rd party extension builds. This design decision is likely because on systems such as Debian, multiple versions of Clang can exist, so any compiled Postgres code/extensions must use a consistent compiler for all future builds.
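As an illustration only, the gating described above (versions 11+, Linux only, opt-in) can be sketched in Python. The function name and its parameters are hypothetical and are not part of the actual Nix expression; only the '--with-llvm' flag comes from the text above.

```python
def jit_configure_flags(major_version, is_linux, enable_jit=False):
    """Return the extra configure flags for a JIT-enabled build.

    Hypothetical helper: JIT is only requested when it is explicitly
    enabled, the platform is Linux, and the PostgreSQL major version
    is 11 or newer.
    """
    if enable_jit and is_linux and major_version >= 11:
        return ["--with-llvm"]
    return []
```

With these assumptions, a PostgreSQL 10 build, a non-Linux build, or a build that does not opt in gets no extra flags.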
However, this decision has extremely negative consequences for Nix-based packages, because it inserts LLVM and Clang into the closure of the PostgreSQL derivations as a hard runtime dependency. This bloats the closure size by over a gigabyte (~140MB -> 1.4GB), which is fairly unwieldy and unlikely to be permissible by default. Currently this bloat only applies to the .out (binary) outputs.

But this is made worse by the fact that the .lib output is also bloated by having a hard runtime dependency on llvmPackages.llvm.lib. This is because postgresql.lib now ships libllvmjit.so which talks to libLLVM.so, but postgresql.lib also contains client libraries like libpq.so. This effectively bloats every libpq client expression as well by about 200MB.

In addition, because of the way PGXS's default installation logic works, it wants to install binary artifacts into the postgresql lib/ directory, which obviously isn't possible in the Nix store as it's read-only. Hence, we create environments composed of all extension outputs and patch postgresql to load that. But this means the installPhase for every extension is currently a custom hand-written script, and *that* means every extension must now contain logic to install LLVM .bc files on top of .sql and .so files. Instead, we should probably patch PGXS to install to a proper external directory so its install logic can take over and we can remove custom installPhase scripts for most extensions.

Postgres' LLVM support logic in its configure script and PGXS code will need to be patched to remove hard-coded references to clang-wrapper, since we always control the exact version of clang used and can remove it as a run-time dependency.

Finally, bitcode should probably be moved to separate .bitcode derivations for all server versions and extension outputs, so that libpq clients aren't bloated by indirect dependencies on libLLVM.so (by way of libllvmjit.so).
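To make the installPhase burden described above concrete, here is a rough Python sketch of the decision each hand-written script has to make for .so, .sql and .bc files. The function name, the destination table and the share/postgresql/extension path are assumptions for illustration; only the lib/bitcode/ location is taken from this message, and the real scripts are shell, not Python.

```python
from pathlib import PurePosixPath

# Illustrative destinations, relative to the extension's output directory.
# Only lib/bitcode is named in the commit message; the rest are assumed.
DESTINATIONS = {
    ".so": "lib",
    ".sql": "share/postgresql/extension",
    ".bc": "lib/bitcode",
}


def plan_install(out_dir, artifacts, jit_enabled):
    """Map build artifacts to install paths, skipping bitcode without JIT."""
    plan = []
    for name in artifacts:
        path = PurePosixPath(name)
        if path.suffix == ".bc" and not jit_enabled:
            continue  # bitcode is only useful to a JIT-enabled server
        dest = DESTINATIONS.get(path.suffix)
        if dest is None:
            continue  # not an artifact this sketch knows how to install
        plan.append((name, str(PurePosixPath(out_dir) / dest / path.name)))
    return plan
```

Patching PGXS to install into an external directory would make this kind of per-extension mapping unnecessary, which is the motivation for the follow-up work listed above.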
Oh, and PostGIS fails to build with JIT support/clang as the compiler, for some reason.

As a result of these significant complications, this support is disabled by default, and should only be considered supported for vanilla PostgreSQL with no third-party extensions. Darwin may also be supported in the future; it may even build, but can't be tested.

Signed-off-by: Austin Seipp <[email protected]>