Bytecode to bir #1472

Merged
merged 47 commits into from
Jul 31, 2023

@Bike (Member) commented Jul 21, 2023

Defines a system for compiling bytecode directly into BIR (and thereafter into native code). Hooks it up to cl:compile, so that running cl:compile on a bytecoded function will compile it to native. Note that bytecoded functions are still considered compiled (i.e. are compiled-function-p). I think this behavior is kosher. There is no automatic compilation yet.

This is experimental. It is not ready for prime time and some of my design is pretty sloppy. But it's stable enough that I want to try using it as part of normal workflows so we can see what breaks.

@Bike Bike force-pushed the bytecode-to-bir branch 2 times, most recently from 4b4b90b to a02ece5 Compare July 27, 2023 13:27
@Bike (Member, Author) commented Jul 28, 2023

@drmeister has asked me not to merge this until we sort out what's going wrong with snapshots. They do not currently work on this branch, but they do not work on main either.

Bike added 27 commits July 31, 2023 07:52
rather than as calls that futz with MV. This can be used as a
correctness condition, and it makes compiling to IR easier.
The unbound objects still cause all kinds of bad problems like
segfaults that we absolutely do not want to be user-exposed.
This uniformly puts arguments on the VM stack before a call, VM or
not, and saves a copy for multi-argument-form mv calls (the copy
into the mv vector by pop-values).

But my actual impetus is this structure makes the compilation to
BIR easier, as all mv-calls are preceded by push/append-values,
like how BIR requires a value-collect for any mv call.

I had to increase the VM stack size to keep the build working,
though. Bit worrying. The failure was when compiling the package
definition for closer-mop, of all things.
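
The stack discipline described in this commit message can be pictured with a small toy model. This is a sketch only: the opcode names follow the message (push-values, append-values, mv call), but the `ToyVM` class, its fields, and the exact stack layout are assumptions made for illustration and are not Clasp's actual VM. It shows a two-form `multiple-value-call` where each form's values land on the stack and the call reads its arguments straight from there, with no intermediate copy back into the mv vector.

```python
# Toy model only. Opcode names mirror the commit message; the layout and
# semantics are simplified assumptions, not Clasp's real bytecode VM.

class ToyVM:
    def __init__(self):
        self.stack = []   # the VM stack
        self.values = []  # stand-in for the multiple-values (mv) vector

    def push_values(self):
        # Save the current multiple values on the stack, then their count.
        self.stack.extend(self.values)
        self.stack.append(len(self.values))

    def append_values(self):
        # Extend the group saved by push_values with the current values.
        count = self.stack.pop()
        self.stack.extend(self.values)
        self.stack.append(count + len(self.values))

    def mv_call(self):
        # Arguments are taken directly from the stack; the callee sits
        # just below the saved group.
        count = self.stack.pop()
        split = len(self.stack) - count
        args = self.stack[split:]
        del self.stack[split:]
        fn = self.stack.pop()
        self.values = [fn(*args)]


def run_example():
    # Models (multiple-value-call #'list (values 1 2) (values 3)).
    vm = ToyVM()
    vm.stack.append(lambda *args: list(args))  # the callee
    vm.values = [1, 2]   # first argument form
    vm.push_values()
    vm.values = [3]      # second argument form
    vm.append_values()
    vm.mv_call()
    return vm
```

In this model every mv call is preceded by a push/append sequence, which is the same shape the message says BIR wants from a value-collect.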
This localizes the LLVM stuff in an entry point that accepts BIR,
so if we produce BIR by some other means (e.g. from bytecode) we
can reuse said stuff.
Still needs work, but it's functional.
If we need to test bytecodeness, we can just look at the type of
the simple fun.
this makes setq a little less ridiculous to generate - we just use
a stack spot instead of a local variable. This will reduce the
number of local variables needed by some functions.

Also makes things a little easier on bytecode-to-bir. It was
screwing things up in that the temporary variable was conflated
with earlier bound variables, which caused problems when the
variable was closed over.
I will be adding more for the sake of further compilation, and
this will make that easier. It also makes it possible to assert
that the debug infos are in order (i.e. their START indices are
nondecreasing) which is very convenient for the compiler and
debugger.
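
To illustrate why ordered debug infos are convenient, here is a minimal sketch. The `DebugInfo` record, its `end` and `note` fields, and the half-open `[start, end)` range are assumptions for the example; only the nondecreasing START property comes from the message above. With that property, a binary search bounds which entries can possibly cover a given instruction index.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class DebugInfo:
    # Hypothetical record: covers instruction indices in [start, end).
    start: int
    end: int
    note: str


def assert_ordered(infos):
    # The correctness condition: START indices never decrease.
    starts = [info.start for info in infos]
    assert all(a <= b for a, b in zip(starts, starts[1:])), \
        "debug infos out of order"


def infos_covering(infos, ip):
    # Entries at or past `limit` start after ip, so they cannot cover it.
    starts = [info.start for info in infos]
    limit = bisect_right(starts, ip)
    return [info for info in infos[:limit] if ip < info.end]
```

A consumer such as a compiler pass or a debugger could call `assert_ordered` once per module and then rely on `infos_covering` for lookups.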
Doing optional variables etc is going to be annoying
The bytecode compiler will have to keep track of optimize
declarations for the btb compiler.
This cuts down on pointless annotations.
Bike added 20 commits July 31, 2023 07:52
By generation, and also in LTV FASLs
not gonna lie, this code is kind of ugly.
No longer necessary given the variable annotations and DUP.
Functions being relocated means that just generating the debug
infos all in the same module vector as you go can result in
non-monotonic positions.
This is a pretty messy solution. drop-mv never actually needs to
be executed, but it lets the compiler treat unconditional jumps as
not happening for the purpose of compiling unreachable code
correctly. FIX ME
without the notnilp it always returns true, which screws a few
things up. durr
The more important intent of this is to store information about
PHIs to simplify the btb compiler.
It does not seem to be used anywhere. It adds extra memory to the
threads which is then never used, since those threads never even
run Lisp code. Running the constructors can also be complicated
and is again unnecessary.
Having a large array inside the VirtualMachine (and thus the
ThreadLocalState) seems to crash the linker on Mac. It's also just
sort of wonky.

Probably ideally we'd have a more sophisticated setup here with
mmap and guard pages and growth and yada yada, but this works ok.
There is a definite suboptimality in allocating the entire max
stack size all at once, but it's not _too_ big so maybe it's ok.
It also means that the GC will walk the entire stack for pointers,
including the inactive part, but Boehm is probably not capable
of doing something more complex anyway. Or if it is, it's wizardry.
This was apparently used for parallel linking at some point but was
dummied out for mysterious reasons. Even without that, I suspect
that the thread pool wouldn't need to be global.
@Bike Bike merged commit 6a66ec9 into main Jul 31, 2023
8 checks passed
@Bike Bike deleted the bytecode-to-bir branch August 1, 2023 11:08