Reconsider AOT vs. compile-time generation of syntax trees #622

Closed
patrickt opened this issue Sep 8, 2020 · 1 comment
Labels
  • ast:codegen: Bugs in Template Haskell AST generation
  • bazel: Bazel-specific build concerns
  • build: Issues arising when building semantic
  • help wanted: Up for grabs
  • infrastructure: Items relating to packaging, project management, releases, etc.
  • language-support: Language support in general (e.g. new languages, etc.)

Comments

patrickt commented Sep 8, 2020

Our current formulation of syntax trees assumes that we’ll be able to read the contents of node-types.json files at compile time. This is only true for local development, and files pulled in via pinned Git dependencies. For all other cases, the official word is that this is not expected to work. This means that any future publishing to Hackage is off the table, though things work for local dev and our downstream dependent projects.

But even the situation as it stands is far from optimal. For example, though Bazel tends to provide better in-IDE tooling, it doesn’t know how to find node-types files in REPLs, and even during standard builds it can’t find them without preprocessor trickery.

I think it’s time to consider whether generation of this code ahead-of-time is worth exploring. Here are some upsides and downsides of AOT code generation.

Upsides

  • As mentioned above, the current compile-time approach basically only works on cabal, due to implementation details of the build/REPL process.
  • We already do AOT codegen for the Semantic_Proto serialization files. Note that that file, even though it comes out to roughly 8000 SLoC, is well-behaved with respect to compile time and IDE support, in contrast to our code that relies on complicated Template Haskell splices. Indeed, I suspect that the authors of proto-lens avoided TH generation for reasons much like ours: TH has difficulty finding .proto files, and it has to cope with massive protobuf definitions.
  • We also generate code for lingo-haskell.
  • As mentioned above, our build process can become substantially simpler, and our IDE tooling will work more reliably (because it won’t ever try to activate a TH splice).
  • We don’t update the grammars very often, so this shouldn’t introduce a tremendous amount of code churn.
  • Better caching (even with Bazel, which is much better at caching than cabal, we still encounter spurious rebuilds).
  • Better project ergonomics (since the codegen splices are defined in tree-sitter).

Downsides

  • More code to write.
  • Less elegant than a pure-TH solution.
  • It’s an extra step we have to be aware of during the update process.

Another approach we could take is to drop cabal support entirely, but that would also preclude any Hackage releases, would still need some love to get working in a REPL context, and would entail a degree of tedious downstream changes. We could also (shudder) download the grammar definitions in the TH splices themselves, but I hardly think that invoking network calls in TH is something we should encourage, though that’s the only way I can envision this possibly working with cabal.

patrickt added the ast:codegen, bazel, build, help wanted, infrastructure, and language-support labels Sep 8, 2020

patrickt commented Sep 9, 2020

Good news: with a little elbow grease, we can reuse @aymannadeem’s Template Haskell work here, since it’s possible to run the Q monad from IO. That means that codegen should be as simple as pretty-printing the result of running astDeclarationsForLanguage, with appropriate module headers, imports, and LANGUAGE pragmas. Exciting!
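
To make that concrete, here is a minimal sketch of the idea, not the actual generator. It assumes only the template-haskell package: runQ runs a Q action in IO, and pprint renders the resulting declarations as source text. The names exampleDecls and renderModule are hypothetical; exampleDecls stands in for astDeclarationsForLanguage, whose real signature and arguments are not shown in this thread. One caveat worth checking before relying on this: the IO instance behind runQ does not support operations such as reify, so this only works if the splice avoids them.

```haskell
import Language.Haskell.TH

-- Hypothetical stand-in for astDeclarationsForLanguage: a Q action that
-- yields the declarations we want written to disk. It builds a single
-- record type by hand so the sketch needs no language extensions.
exampleDecls :: Q [Dec]
exampleDecls = pure
  [ DataD [] (mkName "Identifier") [] Nothing
      [ RecC (mkName "Identifier")
          [ ( mkName "text"
            , Bang NoSourceUnpackedness NoSourceStrictness
            , ConT (mkName "String")
            )
          ]
      ]
      []
  ]

-- Hypothetical helper: run the Q action in IO and wrap the pretty-printed
-- declarations in LANGUAGE pragmas, a module header, and imports.
renderModule
  :: String      -- module name
  -> [String]    -- LANGUAGE pragmas
  -> [String]    -- import lines, without the leading "import"
  -> Q [Dec]     -- declarations to generate
  -> IO String
renderModule modName pragmas imports decls = do
  decs <- runQ decls
  pure . unlines $
       [ "{-# LANGUAGE " ++ p ++ " #-}" | p <- pragmas ]
    ++ [ "module " ++ modName ++ " where", "" ]
    ++ [ "import " ++ i | i <- imports ]
    ++ [ "", pprint decs ]

-- Example use from GHCi or a small generator executable:
--   renderModule "Language.Example.AST" ["DeriveGeneric"]
--     ["GHC.Generics (Generic)"] exampleDecls >>= putStr
```

The output of pprint is meant for humans rather than for round-tripping, so a real generator would likely need some cleanup of the printed names before the result compiles as-is; that part is left out here.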
