Add three new experimental engines, including SQLite #331
Conversation
+1 This seems much more maintainable than trying to use 3 different parsers with 3 different AST representations.
This is in a state where it's ready for a first set of eyes. The architecture of this change is more important than the specifics. The best place to start is the `Parser` interface:

```go
type Parser interface {
	Parse(io.Reader) ([]ast.Statement, error)
}
```

Again, all of this is subject to change, but I'm happy with the initial progress.
I'm happy enough with this progress that I'm going to merge this in. The experimental engines, prefixed with an underscore, are not enabled in releases. Once they get to a better state, we'll enable them for testing purposes.
I'm also going to rebase and split this change into concrete commits before merging.
Just read through the code. If I understand this correctly, you've added the start of a SQLite engine, containing a complete parser using ANTLR. For this you created an `ast` package that will serve as the intermediate representation that all language generators will consume. You've also added a MySQL engine backed by tidb's parser, and a PostgreSQL engine backed by the existing postgres parser. They also output the intermediate `ast`, so generators will only need to be written once for all engines, albeit with some switches for engine-specific features. At the moment, though, each engine only supports very basic create/delete table and select ops.
The next steps are probably to fully port the PostgreSQL implementation to this format, and extend the MySQL and the SQLite ones? This seems pretty great. I think it's a good evolution of sqlc given the addition of the new engines and will make things a lot easier for writing the generators.
I'm a bit concerned about the maintainability of the engines though. It seems like it'll be quite a lot of effort to adapt each parser's output to a single fully featured IR. For example, traversing tidb's AST looks like it works really differently from traversing the sqlite antlr AST. Is this really scalable, especially if the end goal is to support all features of all the databases?
I wonder if there's a potential alternative (potentially insane) of using a single grammar that's a superset of all languages. Maybe to start with, use antlr to generate a parser for sql-92, which has published BNFs. From there, trim out the parts that no supported engine supports, and extend to add features specific to postgres and mysql. Converting the antlr-parsed AST for any of the engines to the IR would use the same code, but with specific code paths disabled or enabled to match engine support. This is probably a larger project up-front, but could pay off in the future in not having to maintain a bunch of parser <=> IR adapters?
```go
// Provide a read-only view into the catalog
func Newo(c *catalog.Catalog) InformationSchema {
```
Newo?
Oops, as I said, that package is experimental and not used.
The three new experimental parsers turn engine-specific ASTs into a generic AST shared by the type-checking and code-gen phases.
The new experimental engines are disabled by default.
@kyleconroy any thoughts on my long comment above?
So far this has not been the case. I've re-implemented the catalog using the new strategy and it's worked well.
The issue with a single grammar is that you have to write a filter per engine that knows which features are allowed for each engine. Right now this is handled by the individual parsers. As I've done more work (#402, #401, #400, #399, #398, #397, #391), I'm confident in the current approach.
Great to hear! I'd like to port the Kotlin code over for basic MySQL support, as I have no use for it with PostgreSQL. Are the new parsers in a state for me to code against?
This will be a long-running pull request implementing SQLite support. SQLite support is being implemented using a different strategy than MySQL and PostgreSQL. Right now type checking, type inference, and code generation can't be shared among database engines.
For SQLite, the plan is to use the parser (generated by ANTLR) to output a generic SQL AST. Think of this as a form of compiler IR. The goal would then be to port both the MySQL and PostgreSQL engines to the same IR, and then do all (or most) of the work on that IR.
This is all a WIP and may fail horribly. I won't be changing any existing packages with this pull request.
#161