Statically Linked version of PK CLI #84

Open
CMCDragonkai opened this issue Dec 21, 2023 · 4 comments
Labels
development Standard development r&d:polykey:supporting activity Supporting core activity

Comments


CMCDragonkai commented Dec 21, 2023

Specification

Inspired by Golang and other products like airplane.dev, I can see that statically linked CLI executables are far easier to distribute. For example, from https://github.com/airplanedev/cli/releases/tag/v0.3.201 you can just download the executable and immediately run it on NixOS without problems:

»» ~/Projects/airplane
 ♖ file airplane
airplane: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, Go BuildID=Y7wWH3mifiqz5aem5Cxt/7sovkpNEjt7IJRP97P-1/CWBYjxSmrDYD9oEeD2kS/4L0HFVWRnfaSbHYuxre4, stripped

»» ~/Projects/airplane
 ♖ ldd ./airplane
	not a dynamic executable

»» ~/Projects/airplane
 ♜ ./airplane
Airplane CLI

One immediate benefit is that we would be able to run Polykey on NixOS and other Linux distributions far more easily.

Right now, when we produce the Linux release of the CLI executable, we have to use steam-run to run it on NixOS. This is because it uses the precompiled Node.js from https://github.com/yao-pkg/pkg-fetch and https://github.com/yao-pkg/pkg, which is a dynamically linked Node.js that refers to the standard Linux FHS location for the link loader, and NixOS does not put the loader at that location.

We also saw recently that the latest release of rocksdb didn't work on an older version of Ubuntu, because that Ubuntu version lacked the newer C++ libraries it required.

We can sidestep all of this by building a statically linked binary. In fact, pkg does support this. However, it comes with a caveat:

Note that fully static Node binaries are not capable of loading native bindings, so you may not use Node bindings with linuxstatic.

And our native libraries, like js-db, js-quic, etc., are all loaded through the dynamic link loader, via require or process.dlopen.

So we need to figure out how to produce statically linkable object files from js-db (for example), and have Polykey-CLI consume them to statically compile against the static Node.js. I'm not even sure whether this is possible without also bringing the compilation of Node.js itself into PK-CLI somehow (right now we don't compile Node.js; it's just assumed to exist in binary form when we build the binary).

I raised a question about this in yao-pkg/pkg#12.

Additional context

Tasks

  1. Investigate static compilation using pkg - compiled nodejs from source, along with bringing in static binaries
  2. Make sure the virtual filesystem that pkg needs to use still works
  3. Incorporate it into the nix process, the ci process, and distribution process
  4. Test that it works, and make sure we try to squeeze it down
  5. Test what "static" compilation means in the case of MacOS and Windows
@CMCDragonkai CMCDragonkai added the development Standard development label Dec 21, 2023

CMCDragonkai commented May 12, 2024

Based on some research in:

Node Executable

The node executable that we use from https://github.com/yao-pkg/pkg, such as https://github.com/yao-pkg/pkg-fetch/releases/download/v3.5/node-v20.11.1-linux-x64, is a slimmed-down node executable designed specifically for embedding by pkg.

It is, however, a fully featured node executable: if you launch it, it asks for a path to a file to execute. Unlike the regular node executable, it does not launch a REPL prompt on startup.

Unlike the node that comes out of nixpkgs, it has almost all of its dependencies statically compiled already. The only remaining dynamic dependencies are:

  1. Glibc dependencies
  2. C++ standard library dependencies, specifically 2 libraries: libstdc++.so.6 and libgcc_s.so.1

Things like zlib, openssl... etc are already statically linked into the binary.

If we want to do further static linking (for portability reasons, not necessarily performance), we could statically link the C++ standard library dependencies first; but for Glibc, we'd need to use musl or something else intended for static linking.

This will require bringing the compilation of the node binary into the fold of Matrix AI, and running our own CI to do it. We should aim to do this anyway for security and supply-chain reasons.

To make the downloaded node executable work on NixOS, you just need to set the interpreter path first (this solves the Glibc dependency), and then add an rpath so it can find the C++ standard library:

patchelf --set-interpreter $(cat $NIX_CC/nix-support/dynamic-linker) ./node-v20.11.1-linux-x64
patchelf --set-rpath ${pkgs.gcc.cc.lib}/lib ./node-v20.11.1-linux-x64

The role of pkg or similar executable bundling

Now executable bundling/archiving into a single file is not new. These are sometimes called "fat binaries" or "self-executing archives" or "application bundles" or "executable bundles".

Here I'm going to talk about Linux specifically.

These work by taking the interpreter executable (like the node above) and simply appending additional data at the end of the ELF.

According to the ELF standard, the last thing in the file might be a section header table, or it could be a section.

By serialising a directory of interpretable scripts onto the end of the file, this data becomes known as the embedded payload, resource section, or tail section. What pkg does is take the node binary, add an initial script for it to execute, and bind that payload to the end of the file.

In Deno, there is a special signature of 40 bytes containing the positions of an ESZIP archive (https://github.com/denoland/eszip). Deno knows how to execute such a thing, and that's basically what happens when you run the executable bundle.

Doing this with interpretable scripts can require a sort of self-extracting archive, where the scripts are first extracted onto disk, cached, and then loaded by the interpreter directly from the FS.

Alternatively, some sort of VFS is possible too. For example, https://github.com/vercel/pkg uses a VFS, where the FS operations are simulated instead of accessing the real fs, and the whole fs is loaded into the application's RAM.

So basically the application bundle will first have the regular interpreter executable (possibly customised a bit to execute automatically), followed by a serialised bundle of JS code.

Now, if you use a JS bundler like esbuild or webpack, you reduce the need for a VFS/self-extracting archive: all the JS is combined into a single file already, with file references resolved into an internal module loader that just jumps between different parts of the JS code. However, there are still cases where this is not possible, such as process.dlopen on native binaries. esbuild does not know how to "embed" such things, because it was designed for the web, and native binaries do not exist on the web.

As of now, these native binaries are left by themselves, and thus the application archiving like pkg or deno would still rely on self extracting archive or some sort of VFS to load such things.

This may also include other kinds of "data files" like markdown, JSON, or HTML. With custom esbuild loaders/plugins, I believe it's possible to embed these into the single JS file output too: this sort of data can always be put into a JS variable/constant. That might even appear to work for native binaries, but node's process.dlopen performs a real native dlopen(3) call, which cannot load code embedded in a JS variable. Therefore, even for native code, esbuild (or other JS bundlers) cannot solve this problem.
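As a hedged sketch of the asset-embedding part, using the esbuild JS API: text assets are inlined as JS constants, while native addons stay external (the entry point and paths here are hypothetical, not Polykey-CLI's actual build script):

```javascript
// build.js: illustrative esbuild configuration, not Polykey-CLI's real
// build script. Text assets become JS constants; .node files stay
// external because process.dlopen needs a real file on disk (or a VFS).
require('esbuild').build({
  entryPoints: ['src/polykey.ts'], // hypothetical entry point
  bundle: true,
  platform: 'node',
  outfile: 'dist/bundle.js',
  loader: {
    '.md': 'text',   // embeds markdown as a JS string
    '.html': 'text', // same for HTML
  },
  external: ['*.node'], // native addons cannot be inlined as JS
}).catch(() => process.exit(1));
```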

The only way to get EVERYTHING including native code into a single file without the reliance of a VFS or self extracting archive is to PRE-link the native code into the node binary prior to using esbuild.

That would mean using esbuild first to bundle all the JS and JS-accessible data like JSON and readmes through custom plugins, while leaving out the native code.

Then when using the application bundler like pkg or otherwise, you need to then have it bundle in the native code.

This is actually what we already do, but in the case of pkg, we rely on the VFS to achieve this.

It is possible to achieve this at an even earlier stage by statically linking those native objects into the node binary, before pkg is involved. Now this leads us to the next point.

Patchelf and why it doesn't work with the application bundles

We know that unpatched binaries don't work on NixOS because NixOS doesn't follow the FHS, and thus the loader/interpreter for ELF does not exist in /lib64/ld-linux-x86-64.so.2 for 64 bit binaries.

So patchelf is used to patch the interpreter path embedded in the ELF file, as well as the RPATH, which is a lookup path for dynamic libraries like the C++ standard library.
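To make "the interpreter path is embedded in the ELF file" concrete, here is a small sketch that builds a toy 64-bit ELF layout in memory and reads the PT_INTERP string back out, which is the field patchelf rewrites. The offsets follow the ELF64 header layout; the "file" is a fake, minimal construction for illustration only:

```javascript
// Locate the PT_INTERP program header in an ELF64 buffer and return
// the interpreter path string it points at (as readelf/patchelf do).
function readInterp(buf) {
  const phoff = Number(buf.readBigUInt64LE(0x20)); // e_phoff
  const phentsize = buf.readUInt16LE(0x36);        // e_phentsize
  const phnum = buf.readUInt16LE(0x38);            // e_phnum
  for (let i = 0; i < phnum; i++) {
    const ph = phoff + i * phentsize;
    if (buf.readUInt32LE(ph) === 3) { // p_type === PT_INTERP
      const off = Number(buf.readBigUInt64LE(ph + 0x08));  // p_offset
      const size = Number(buf.readBigUInt64LE(ph + 0x20)); // p_filesz
      return buf.subarray(off, off + size - 1).toString(); // drop NUL
    }
  }
  return null;
}

// Construct a minimal fake ELF: 64-byte ehdr, one 56-byte phdr, the path.
const interp = '/lib64/ld-linux-x86-64.so.2\0';
const buf = Buffer.alloc(64 + 56 + interp.length);
buf.write('\x7fELF', 0);
buf.writeBigUInt64LE(64n, 0x20);                        // e_phoff
buf.writeUInt16LE(56, 0x36);                            // e_phentsize
buf.writeUInt16LE(1, 0x38);                             // e_phnum
buf.writeUInt32LE(3, 64);                               // p_type = PT_INTERP
buf.writeBigUInt64LE(BigInt(64 + 56), 64 + 0x08);       // p_offset
buf.writeBigUInt64LE(BigInt(interp.length), 64 + 0x20); // p_filesz
buf.write(interp, 64 + 56, 'latin1');
console.log(readInterp(buf)); // prints "/lib64/ld-linux-x86-64.so.2"
```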

Patchelf actually adds additional data to the end of the ELF file, in a similar way to application bundling like pkg and deno.

This means the two things will conflict, because the resulting fat binary often relies on a special section at the end of the file to load the serialised JS code.

When you patchelf an application bundle, it adds additional data to the end of the fat binary, and executing it no longer works. In the case of deno, the special 40-byte section is still in the patched binary, but it's no longer at the end of the file. So if you re-append those 40 bytes, it works again. However, I don't think this is reliable, because patchelf may also patch rpaths and other things that could shift expected relative byte positions. It all depends on whether patchelf changes the internals of the original ELF file, rather than just appending data to the end.

So the only option to make this work is to patchelf the original embedded interpreter. In the case of pkg, this means patching https://github.com/yao-pkg/pkg-fetch/releases/download/v3.5/node-v20.11.1-linux-x64. Both the interpreter path and the rpath (pointing at the C++ standard library) need to be patched.

Then you could use pkg to make an application bundle out of the patched node interpreter and the serialised JS code. I believe @tegefaulkes has in fact tested this in the past and it worked. However, I have not tested it recently, and I hope the patched binary doesn't expect some special signature at the end of the file (since the end of the file is being extended to include the serialised JS code).

Proposed pipeline

To set up a proper pipeline, we should instead build our own node interpreter first. At this point we can elect to statically compile the C++ libraries to reduce reliance on OS-provided C++ libraries, and even use musl if we want to.

Optionally, when targeting NixOS, we can then patchelf it, specifically the interpreter (note that this is done on the build system). NixOS targets a specific loader when the binary is built on the target system; if you are building on a build system targeting a different host system, that may not actually work. I haven't tested this myself, and there are ways of targeting a different host system, so I'm not sure how an interpreter path fixed to /nix/store would work in that case. But generally speaking, distribution on nixpkgs is done against specific nixpkgs pins/channels.

At this point we can choose to statically link in certain native object code that would otherwise be found later by a dlopen call. As far as I can tell, https://pubs.opengroup.org/onlinepubs/009696799/functions/dlopen.html indicates it should be possible to put objects into the binary such that dlopen finds them without hitting the real filesystem.

Subsequently, the JS code and JS-accessible data is bundled together using esbuild and various plugins. The usage of process.dlopen might need to be customised to match the expected paths of the prelinked shared objects.

Finally, the application bundler puts the two together. On NixOS, since the interpreter was already pre-patched, joining it with the serialised JS wouldn't be a problem.

We should also bring the bundler into the fold in order to optimise away any usage of VFS or self-extracting archive code. Ideally nothing should touch the disk: everything stays in RAM, and minimal filepath resolution should be needed, given that all JS has been resolved into a single file and all shared objects have been put into the binary.

@CMCDragonkai
Member Author

@brynblack @tegefaulkes this would be the high level issue over #102.

@CMCDragonkai
Member Author

The serialised JS should be optimised as much as possible too, with compression and elimination of unnecessary artifacts (tree shaking with ES module support).

@CMCDragonkai
Member Author

There's actually one more thing that isn't addressed: https://github.com/MatrixAI/js-workers. The migration to ESM (MatrixAI/js-workers#12) has been stumped because the underlying library isn't suited for it. The way multi-processing and web workers work in node relies on launching a copy of the interpreter process to execute some code that exists on the filesystem.

As per MatrixAI/TypeScript-Demo-Lib#32, I think multiprocessing/multithreading needs to be re-framed entirely. If we are trying to enable the ability to share memory/libraries/objects between threads, and to do so without a filesystem, then one has to consider how it decomposes down to using pthreads at all.
