Qi Compiler Sync Dec 23 2022

Taking Stock

Qi Compiler Sync Dec 23 2022

Adjacent meetings: Previous | Up | Next

Summary

We discussed the motivation for the project, where we are at now, and next steps. These notes will have the form of a Q&A to be most useful to readers.

Why write a compiler?

The original performance wasn't bad, but we hope that the new performance can be even better. In particular, we are hoping that for functional, immutable operations, that using Qi could be faster than using Racket.

How is the compiler implemented?

The language was initially fully specified using a single macro, flow, which contained all of the forms of the language together with their expansions to Racket.

Now, the flow macro is implemented on two levels: (1) macros that get rewritten to Qi, and (2) Qi core forms that get rewritten to Racket.

(1) contains many forms that were formerly in the original flow macro, but they now expand to Qi rather than Racket, and these will subsequently be expanded to Racket by the compiler, i.e. in stage (2). That is, many forms of the Qi language are now just macros, the same kind of macros that users could write, and there is a "small" core language that will be compiled.

Currently, the compiler essentially just contains the expansions to Racket that were in the original flow macro for all of the core forms (i.e. the ones that aren't macros). It also contains the implementation of the new as binding form.

See this discourse post for more details.

How does it perform?

The current stratified implementation performs worse than the original flow macro since we no longer use optimized Racket code for many of the individual forms (i.e. the macros) as they rewrite to core Qi forms now instead of Racket directly. The next step will be to add compiler optimizations to restore performance of individual forms as much as possible, while also adding non-local optimizations to improve performance in general.

What are some examples of optimizations?

Compiler Optimizations are some optimizations we've talked about before.

Why are we adding bindings to the language?

While bindings are unlikely to be a dominant pattern in Qi, they help in two ways:

Qi can already use bindings defined in Racket, so it seems to make sense to also allow Qi to define bindings itself.
Bindings can be thought of as a non-structural input, which allows them to simplify flows in cases where they would otherwise need complex yet incidental structure.

For the second, we can see this in the example of a "state monad." That is, suppose we have a computation that happens in many stages, like this one:

(~> (5) sqr add1) ; => 26

What if we wanted to get not only the output, but also the history of intermediate values on the way to the result? Without bindings, we could do it this way:

(~> (5)
    (-< list _)
    (== _ (~> sqr (-< list _)))
    (group 2 append _)
    (== _ (~> add1 (-< list _)))
    (group 2 append _))

; => '(5 25 26), 26

This works, but it doesn't preserve the form of the original computation, which, nevertheless, is still the essential computation we are performing here. With bindings, we can do it this way:

(~> (5)
    (-< sqr (~> list (as S)))
    (-< add1 (~>> list (append S) (as S)))
    (-< _ (~>> list (append S) (as S)))
    (-< _ (gen S)))

; => 26, '(5 25 26)

... which is much more clear, since the main computation can be seen in the column on the left, with the "monadic" layer in the right column.

The difference becomes even more stark with more state variables:

(~> (5)
    (-< _ list list)
    (== sqr _ _)
    (-< 1>
        (~> (-< (~> 1> list)
                2>)
            X
            append)
        (~> (-< (~> 1> list)
                3>)
            X
            append))
    (== add1 _ _)
    (-< 1>
        (~> (-< (~> 1> list)
                2>)
            X
            append)
        (~> (-< (~> 1> list)
                3>)
            X
            append)))

;=> 26, '(5 25 26), '(5 25 26)

whereas with bindings:

(~> (5)
    (-< _ (~> list (as S1)) (~> list (as S2)))
    (-< sqr (~>> list (append S1) (as S1)) (~>> list (append S2) (as S2)))
    (-< add1 (~>> list (append S1) (as S1)) (~>> list (append S2) (as S2)))
    (-< _ (gen S1) (gen S2)))

;=> 26, '(5 25 26), '(5 25 26)

... where, once again, the columns of orthogonal computations are clearly visible.

Of course, in practice we would simply write a macro to implement this state monad, and in that case either implementation, with or without bindings, would abstract the incidental complexity from the user so that the essential computation would be equally clear. The point is that bindings allow us to avoid such incidental structural complexity in general.

What is this new meta-DSL that Qi is Using?

It's syntax-spec, and it could be thought of as another step in the evolution of language-oriented programming, another power tool in the language-oriented ecosystem.

Early Lisp macros allowed us to write syntactic transformations to extend the language, but these weren't robust in the sense that their behavior was dependent on the programs in which they were used. Specifically, any bindings defined or used in the macro could interfere with those defined in the use-site code, changing the meaning of the macro depending on its use.

Macro hygiene allowed macro semantics to be clearly differentiated from use-site semantics, allowing macros to compose cleanly with other code. Another advance was the set-of-scopes model for bindings, which is a more robust and general way to implement scoping rules than the "variable renaming" that was used formerly.

Racket's syntax-parse library leverages these technologies to provide an industrial-strength language for writing macros with an expressive vocabulary of syntax classes and patterns.

Even with syntax-parse, though, if you are writing a language, there will be a lot of incidental complexity in your implementation addressing problems that will also need to be solved in the implementation of another language. That is, these problems are generic and not specific to your language, and additionally, they only arise at the level of abstraction of a language, not at the lower level of parsing specific syntax. One such problem is scoping rules for bindings.

For instance, in an expression like this one:

(~> ... 
    (bad-use my-var)
    ...
    (-< blah blah (as my-var))
    etc
    etc
    (use my-var))

... as part of our language design, we want the variable my-var defined in (as my-var) to be in scope when it is used in (use my-var) but not in (bad-use my-var). Implementing the as binding form and its scoping rules to match this design would take a lot of low level knowledge of macro hygiene and the set-of-scopes model in order to do it reliably and provide useful error messages.

On the other hand, syntax-spec recognizes this as one of many general problems that would need to be solved in languages, and so it provides us a way to specify these scoping rules declaratively, allowing language authors to define the language and its scoping rules at the level of abstraction we care about.

See Bindingspec Meta-DSL Intro for more (the language was formerly called bindingspec).

Next Steps

(These are all carried over from last time)

Add docs for syntax-spec
Review the bindings PR towards merging it into the compiler integration branch.
Review the initial design for bindings and see what other bindings cases we could support next.
Re-generate the benchmarks showing the performance difference of each Qi form with respect to the main branch, so that we can take stock and start restoring parity towards merging back to main.
Discuss how to, and then put syntax-spec on the package index

Attendees

Hans, Sid

Provide feedback

Saved searches

Use saved searches to filter your results more quickly