-
Notifications
You must be signed in to change notification settings - Fork 451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AOT codegen #624
AOT codegen #624
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need to regenerate all the language AST’s (and no doubt fix whatever bugs I encounter), but this is ready for a look.
@@ -69,14 +77,14 @@ annParameterName :: Name | |||
annParameterName = mkName "a" | |||
|
|||
-- Auto-generate Haskell datatypes for sums, products and leaf types | |||
syntaxDatatype :: Ptr TS.Language -> [(String, Named)] -> Datatype -> Q [Dec] | |||
syntaxDatatype language allSymbols datatype = skipDefined $ do | |||
syntaxDatatype :: (String -> Q (Maybe Name)) -> Ptr TS.Language -> [(String, Named)] -> Datatype -> Q [Dec] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We needed to add this extra parameter so we don’t try to call lookupTypeName
from IO
, which doesn’t have any access to type information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call
let traversalInstances = mappend <$> makeStandaloneDerivings (conT name) <*> makeTraversalInstances (conT name) | ||
glue a b c = a : b <> c | ||
name = mkName nameStr | ||
generatedDatatype cons = dataD (cxt []) name [plainTV annParameterName] Nothing cons [deriveStockClause, deriveAnyClassClause] | ||
deriveStockClause = derivClause (Just StockStrategy) [conT ''Generic, conT ''Generic1] | ||
deriveAnyClassClause = derivClause (Just AnyclassStrategy) [conT ''Traversable1 `appT` varT (mkName "someConstraint")] | ||
deriveAnyClassClause = derivClause (Just AnyclassStrategy) [ [t| (forall a. Traversable1 a) |] ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, this is a little unusual. The problem here was that deriving
statements associated with the Traversable1
class weren’t being pretty-printed correctly, to wit:
deriving anyclass Traversable1 someConstraint
Inserting manual parensT
calls into this invocation did not produce a change in behavior: I think there’s some logic in the TH pretty-printer that looks to see if a deriving
statement has one and only one child, and elides the parentheses if this is so. That works for zero-argument typeclasses, but not in this case.
An explicit forall
prevents the parentheses from being elided. However, weirdly enough, GHC 8.8 rejects this as invalid syntax, whereas 8.10 handles it without an issue. Personally, since @rewinfrey did the legwork to switch our downstream projects to 8.10, we should just cut 8.8 support, but I’m open to other ideas. (Previously, I had been fixing this via invoking sed
from generate-ast
).
(path, tf) <- openTempFile "/tmp" "generated.hs" | ||
T.hPutStrLn tf programText | ||
hClose tf | ||
callProcess "ormolu" ["--mode", "inplace", path] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could link in the ormolu
package, at the cost of a ton of dependencies. It seems to me to be easier just to shell out in this invocation. The most idiomatic way to do this would be to use streaming-process
and run ormolu such that it looks from stdin
, but who cares? This is fine.
semantic-ast/BUILD.bazel
Outdated
haskell_binary( | ||
name = "generate-ast", | ||
srcs = glob(["app/**/*.hs"]), | ||
compiler_flags = GHC_FLAGS + EXECUTABLE_FLAGS + ["-XStrictData"], | ||
deps = [ | ||
":semantic-ast", | ||
"//semantic-source", | ||
"//:base", | ||
"//:filepath", | ||
"//:process", | ||
"//:template-haskell", | ||
"//:text", | ||
"@stackage//:directory", | ||
"@stackage//:generic-lens", | ||
"@stackage//:lens", | ||
"@stackage//:tree-sitter", | ||
"@stackage//:neat-interpolation", | ||
"@stackage//:optparse-generic", | ||
] + all_ts_deps, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven’t added this to semantic-ast.cabal
, reasoning that we’re not going to run this anywhere but our local machines, that expressing it in the cabal format would be tedious with few upsides, and that I want people to start migrating to Bazel!
semantic-ast/BUILD.bazel
Outdated
all_ts_deps = ["@tree-sitter-{name}//:tree-sitter-{name}".format(name = name) for name in [ | ||
"go", | ||
"java", | ||
"json", | ||
"php", | ||
"python", | ||
"ql", | ||
"ruby", | ||
"rust", | ||
"tsx", | ||
"typescript", | ||
]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
List comprehensions in build files are the bee’s knees. Indeed, for
loops are prohibited at the top-level in Starlark files, so list comprehensions are our only way of iteration. Really, Starlark is kind of like a functional subset of Python.
adjust :: Dec -> Dec | ||
adjust = _InstanceD . typed . mapped %~ (values %~ truncate) . (functions %~ truncate) | ||
where | ||
-- Need to handle functions with no arguments, which are parsed as ValD entities, | ||
-- as well as those with arguments, which are FunD. | ||
values, functions :: Traversal' Dec Name | ||
values = _ValD . typed . _VarP | ||
functions = _FunD . typed | ||
|
||
truncate :: Name -> Name | ||
truncate = mkName . nameBase |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As always, there’s a dilemma between tedious/unreadable pattern matching and cryptic/quasi-readable lens operators. I went with the latter, since we already use lens
in the project, and it comes with lenses/prisms for the types we need. (TH doesn’t define record selectors for its product types, so you can’t even do record updates.)
import qualified AST.Parse | ||
import qualified AST.Token | ||
import qualified AST.Traversable1.Class | ||
import qualified AST.Unmarshal | ||
import qualified Data.Foldable | ||
import qualified Data.List as Data.OldList | ||
import qualified Data.Maybe as GHC.Maybe | ||
import qualified Data.Text.Internal | ||
import qualified Data.Traversable | ||
import qualified GHC.Base | ||
import qualified GHC.Generics | ||
import qualified GHC.Records | ||
import qualified GHC.Show | ||
import qualified Prelude as GHC.Classes | ||
import qualified TreeSitter.Node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that these imports' assigned names don’t always match up with their actual module names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work; thanks for all the self-review comments. Excited to finally use Bazel.
@@ -69,14 +77,14 @@ annParameterName :: Name | |||
annParameterName = mkName "a" | |||
|
|||
-- Auto-generate Haskell datatypes for sums, products and leaf types | |||
syntaxDatatype :: Ptr TS.Language -> [(String, Named)] -> Datatype -> Q [Dec] | |||
syntaxDatatype language allSymbols datatype = skipDefined $ do | |||
syntaxDatatype :: (String -> Q (Maybe Name)) -> Ptr TS.Language -> [(String, Named)] -> Datatype -> Q [Dec] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good call
The other one is out of date anyway.
As detailed in #622, compile-time codegen is not a viable path forward for Cabal or Bazel projects. However, we can take advantage of the fact that we can run Template Haskell splices in the
IO
monad, with some elbow grease, then pretty-print the resulting TH declarations, piping the result throughormolu
. We’ll run this executable every time we bump the grammars, which will lead to much better compile-time caching.To test:
As always, massive props to @aymannadeem for the TH codegen work: there’s absolutely no way this would have been feasible without being able to piggyback on that work.
Stuff left to do: