-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
proposal: design of a bytecode interpreter for Go #1
Comments
Thank you Seb, for starting this conversation again, especially in so structured a way. Re-reading the We certainly need a VM to target and, to be self-hosting on all Go platforms, that target VM needs a pure Go implementation. The simplest route has us inventing a Go-specific VM that exactly implements what is required for Go, and then writing two components a Go->VMbitcode compiler and a VMbitcode interpreter. Aside from the VMbitcode serialization/deserialization part, this is essentially what As we would control the format of our own VM, extending it in whatever ways were required to enable a REPL user interface would be relatively straightforward. But there are two strong attractions to using an existing VM:
The downside of using an existing VM is the sheer scale of the work involved:
Looking at some of the existing VMs we could choose:
But my personal choice for a VM for this project to target would be |
yes |
@elliott5 PTAL |
I've only just started my investigations @sbinet, but the part we are interested in |
@4ad: is this tentative proposal vaguely in line with what you had in mind wrt |
It's worth to note that both named types and interface types[0] are on the roadmap.
The specs for MVP (i.e. the format) should be done by the end of this year. Unfortunately other features such GC support will start only after the MVP so I'm wondering if we can workaround these limitations until native support is provided. |
I think it may be worth pro-actively addressing security concerns. Dynamic code generation at runtime always broadens the security attack surface, and for internet facing programs, this does need to be managed. We need a good story here. One thought would be to do something like the unsafe package does. To wit, unsafe can be prohibited in contexts where it needs to be. |
I don't think so, my use case and needs are very different from yours. I want to build an low-latency, real-time, low-throughput, non-parallel interpreter (not JIT) specifically tailored to Go and written in ANSI/POSIX C (perhaps with some assembly, but not Go), small enough that can run on today's (bigger) microcontrollers, and with special support for debugging Go, and for creating dynamic analysis tools. This interpreter will be able to function as a shared library (strange concept, coming from me), and will be able to run old code compiled by older versions of my compiler as well (binaries being forward-compatible is an explicit goal). The instruction set will be something that is very easy to compile Go to, and something that is also very easy to implement in a performant (and portable) manner without a JIT (threaded code is still under consideration, especially for microcontrollers). I will then port Go to this virtual machine, rewriting all the runtime, and not only writing a new compiler target, but most likely writing a new middle-end as well. I keep using the world portable here, that doesn't mean I necessarily want my code restricted to ANSI C. It's much easier to implement performant interpreters in assembly. I just mean that porting is literally easy. It's very easy to port a threaded code interpreter to a new architectures, even if it's written in assembly, because it is very straightforward and contains very few lines of code. The inner loop is literally just a few machine instructions. That being said, I still haven't excluded the option of a pure C interpreter. I have three main use cases for this. Even though I keep porting Go to new systems and architectures, the gc implementation still doesn't run on many systems, and it will never run on microcontrollers. As portable the gc implementation is (and as I retargeted it a few times, I know quite a bit about this; it's getting harder and harder to do a port, the gc implementation is becoming less portable), I need a Go implementation that can be ported in maximum one day of work, and preferably less than a couple of hours of work. I need a fully deterministic implementation of Go. This is one of the reasons I won't do a parallel implementation of Go (there are other reasons for that too). I need hard real-time execution behavior with real-time GC. Throughput is of no concern of mine, but I plan to use Go to fly planes and rockets. Perhaps surprisingly to some, it's easier to do real-time with an interpreter. Debugging Go code sucks. My VM will export debugging interfaces that can be targetted by gdb and other debuggers, plus it will implement a debugger itself, with full knowledge about Go internals. Very advanced (not really) features will easily be supported in an interpreter, like reliable watchpoints, always being able to find the origin of allocations, and using copy-on-write memory to find memory corruption bugs. Sometimes I need to do dynamic analysis over Go programs. This is easy to do in an interpreter, including things like race detection. The runtime needs to be completely rearchitected for what I want to do. The compiler is deeply entangled with the runtime. It's probably much easier to rewrite the middle-end. There's also a deeper research goal in all this endeavour. I want to study if it's possible to have a much smaller, and much portable Go implementation, that is still competitively performant, and has an edge over the gc implementation in some niche areas. So even if I can make the gc compiler generate code for my VM, or even if I am able to reuse part of the GC, I still want to write a new compiler and a new GC just to see if I can make a smaller, more maintainable system that is good enough. I will keep an eye on what's happening in your project, and I hope you'll keep an eye on mine. |
@4ad So similar to running Instruction List or Structured Text on a PLC? Would you do single thread per process and require pre-allocating slices and structures? Would you allow dynamic reloading of modules for online editing? @sbinet Implementation options aside, what feature sets do you want to target for an interpreter? I see five classes of use cases for an interpreter:
Each on of these could be executed in either:
In each environment the following could all be features of the interpreter:
|
@themihai thanks for the pointers. I had an old (go-1.5) CL for |
@glycerine good point about security. I am not very well versed in such an area, though. |
@4ad thanks for the input. sounds very ambitious :) (and way above my limited skill set) I will definitely keep an eye on your project (feel free to ping me when it's open sourced) -- if only to see how you did it and take inspirations on your |
@kardianos : coming from the scientific crowd, I am interested in a REPL in Go and the ability to extend/script a (Go) program via an interpreter API. for science work, trusting the environment isn't a showstopper but as @glycerine mentioned, we should have the hooks (?) to provide execution in an untrusted environment, preventing execution of code importing as for the features, I guess the last 2 are the ones I definitely want to provide. |
The generic instructions that our vm will need to work from are defined at https://github.com/golang/go/blob/master/src/cmd/compile/internal/ssa/gen/genericOps.go The simplifications that apply to all backend architectures are at https://github.com/golang/go/blob/master/src/cmd/compile/internal/ssa/gen/generic.rules Thus a generic Go VM design already exists, which is used internally by the core Go compiler to generate code for other architectures ... it is just that no-one has written an interpreter for it yet. |
@elliott5 : That is exactly what I was thinking(!) All the pieces appear to be there in the gc compiler, they just need to be refactored into a pipeline of incrementally updatable elements. The only trivial change would need to assign function definitions to actual variables, implicitly. This would prevent the compiler from inlining function calls and thus allow the functions to be updated after being defined. Also the type checker would need to be restartable in the sense that if a definition of a type changes, then it would need to recheck... a hopefully minimal set of dependents. Probably just need to keep a tree (graph) of type dependencies. It seems likely that guru or go/types already does this. |
Nice article on making the go1.7 SSA form visible at https://pauladamsmith.com/blog/2016/08/go-1.7-ssa.html Using the first test code from @sbinet
The last non-architecture-specific version of the generated SSA code (before "pass lower") is
|
Hi everyone, I'm excited to see this discussion taking place. In response to @sbinet's comment in #2
The development of llir/llvm is progressing steadily and should be ready for initial use by some applications within the months to come. It is currently used in a compiler project to translate a subset of C to LLVM IR. Within the near future, the intention is to use it within a decompilation pipeline translating LLVM IR to Go. The compiler project stress tests the write support for LLVM IR, while the decompiler stress tests the read support for LLVM IR. I'd be very interested in hearing if there are any specific requirements that the interpreter project may have, which are not yet covered by the aforementioned projects. So, brief status update from the llir/llvm project as of 2016-09-02:
Cheers /u |
I participated on the reddit thread about this but I thought having a more 'official' response could be useful -- I wanted to cast a vote for wasm. It seems like most UIs for Go are just webpages and being able to execute Go in a browser would very useful. |
Just to add to the discussion related to wasm. From https://github.com/WebAssembly/design/blob/master/TextFormat.md#official-text-format:
Thus it may be feasible to still target LLVM IR, which in turn may be translated to wasm. One benefit of this approach is that it lifts the requirement of a browser environment for execution, and gives support for natively running applications; with potential performance gains. |
Browser env isn't required to execute wasm: https://github.com/WebAssembly/design/blob/master/NonWeb.md |
What @lologarithm said. Lot of people don't know that, I believe even in the Go team. I think it would be extremely valuable to have the Go team/community's input while wasm is not set in stone. If we wait until the design is frozen, we might not be able to benefit from it. |
Two thoughts -
|
@glycerine wrt mono... how's the community behind it? |
@tiborvass you should probably raise this issue on either golang-nuts or on gophers.slack |
wrt (also: gopherJS will probably provide a |
GopherJS itself is unlikely to provide a wasm backend, there's almost nothing in common between the current approach and wasm. It'd be an entirely new compiler. The main thing I can see being shared are the lessons learned and the API/design of the (Source: see gopherjs/gopherjs#432 (comment).) |
thanks for the input. |
If you're talking about gopherjs/gopherjs#388, the goal of that work is to use To support compiling to wasm, you would need an actual compiler that does that. :) That's quite different. |
yes, I was talking about that. |
It first I thought llvm didn't have windows support, but then I checked and apparently llvm now supports building on windows as well as osx and linux. And with the llgo project, much work has already been done there. So I agree, llvm is probably a stronger choice than mono. A concrete step would be to implement the plugin interface that Ian defined, one that would let go programs load shared objects -- and then after loading, all share the same go runtime (scheduler and garbage collector). That's a pretty good sized chunk of work by itself, but would open up new vistas in terms of being able to run jit-ed and aot-compiled code on the same footing. I heard an unconfirmed rumor that someone on the Go team may be working on plugin loading at present. Has anyone heard more? |
@crawshaw is working on it: (most reasonably, only linux support for go-1.8) |
The golang-codereviews mailing list is a great source of rumors. The plugin work is a 20% time hobby for me. I'd like to do Darwin too, but there me groundwork needed there and I'm not sure I'll find the time. |
Status update from the llir/llvm project as of 2016-11-26. An extensive cleanup/rewrite of the But enough talk, lets look at an example (code on play.golang.org). package main
import (
"fmt"
"github.com/llir/llvm/ir"
"github.com/llir/llvm/ir/constant"
"github.com/llir/llvm/ir/types"
)
func main() {
// This example produces LLVM IR code equivalent to the following C code,
// which implements a pseudo-random number generator.
//
// int abs(int x);
//
// int seed = 0;
//
// // ref: https://en.wikipedia.org/wiki/Linear_congruential_generator
// // a = 0x15A4E35
// // c = 1
// int rand(void) {
// seed = seed*0x15A4E35 + 1;
// return abs(seed);
// }
// Create convenience types and constants.
i32 := types.I32
zero := constant.NewInt(0, i32)
a := constant.NewInt(0x15A4E35, i32) // multiplier of the PRNG.
c := constant.NewInt(1, i32) // increment of the PRNG.
// Create a new LLVM IR module.
m := ir.NewModule()
// Create an external function declaration and append it to the module.
//
// int abs(int x);
abs := m.NewFunction("abs", i32, ir.NewParam("x", i32))
// Create a global variable definition and append it to the module.
//
// int seed = 0;
seed := m.NewGlobalDef("seed", zero)
// Create a function definition and append it to the module.
//
// int rand(void) { ... }
rand := m.NewFunction("rand", i32)
// Create an unnamed entry basic block and append it to the `rand` function.
entry := rand.NewBlock("")
// Create instructions and append them to the entry basic block.
tmp1 := entry.NewLoad(seed)
tmp2 := entry.NewMul(tmp1, a)
tmp3 := entry.NewAdd(tmp2, c)
entry.NewStore(tmp3, seed)
tmp4 := entry.NewCall(abs, tmp3)
entry.NewRet(tmp4)
// NOTE: The sem checker is yet to be implemented.
//
// Perform static semantic analysis on the LLVM IR module.
//if err := sem.Check(m); err != nil {
// log.Fatal(err)
//}
// Print the LLVM IR assembly of the module.
fmt.Println(m)
// Output:
// @seed = global i32 0
// declare i32 @abs(i32 %x)
// define i32 @rand() {
// ; <label>:0
// %1 = load i32, i32* @seed
// %2 = mul i32 %1, 22695477
// %3 = add i32 %2, 1
// store i32 %3, i32* @seed
// %4 = call i32 @abs(i32 %3)
// ret i32 %4
// }
} The API of the I'd be very interested in any feedback on the API so we can make improvements before declaring a stable version. @sbinet Are there any specific requirements of the go-interpreter project which are not yet covered by @sangisos How do you feel about the new API? Any concerns, ideas or feedback in general? What may still be improved? What feels weird? Does it feel better than the last one? It would be great to move forward and stabilize the API of the Anyone up for the challenge? Cheers /u |
Let's do actual work on https://docs.google.com/document/d/1Hvxf6NMPaCUd-1iqm_968SuHN1Vf8dLZQyHjvPyVE0Q/edit
There are currently a few
Go
interpreters on the market:llgoi
, a JIT-enabled interpreter built on top of LLVM andllgo
ssainterp
, an interpreter built on top ofgolang.org/x/tools/go/ssa
sbinet/go-eval
, a package salvaged from pre-Go1
The high-level view of this proposal is to describe a naive interactive interpreter for Go, loosely based on the design and implementation of
Dis
[1], [2], the virtual machine of the Inferno OS.Knowing
Go
lineage, then, why not?Dis
is a register-based interpreter.It's not alone in the register-based space: one of the most known register-based VM is Lua. See: here and there for inspirations.
Alternatively, we could base the design on a stack-based VM such as CPython's.
See here and there for some inspirations.
Stack-based VMs are somewhat easier to implement but literature would seem to indicate register-based VMs are more performance friendly.
Working at the bytecode level, with program counters, instructions and frames, should allow us to reuse a fair amount of the
cmd/compile
infrastructure of the official Go compiler, and to also provide a non-interactive interpreter (i.e.: interpreting whole programs.)PS: design proposal is here: #2
PPS: additional reddit comments at https://www.reddit.com/r/golang/comments/4zu6bc/gointerpreter_design_proposal_of_a_vm_for_go
other VMs:
The text was updated successfully, but these errors were encountered: