
Add three new experimental engines, including SQLite #331

Merged · 6 commits · Feb 24, 2020
Conversation

@kyleconroy (Collaborator) commented Feb 11, 2020

This will be a long-running pull request implementing SQLite support. SQLite support is being implemented using a different strategy than MySQL and PostgreSQL. Right now type checking, type inference, and code generation can't be shared among database engines.

For SQLite, the plan is to use the parser (generated by ANTLR) to output a generic SQL AST. Think of this as a form of compiler IR. The goal would then be to port both the MySQL and PostgreSQL engines to the same IR, and then do all (or most) of the work on that IR.

This is all a WIP and may fail horribly. I won't be changing any existing packages with this pull request.

#161

@mightyguava (Contributor)

The goal would then be to port both the MySQL and PostgreSQL engines to the same IR, and then do all (or most) of the work on that IR.

+1 This seems much more maintainable than trying to use 3 different parsers with 3 different AST representations.

@kyleconroy (Collaborator, Author)

This is in a state where it's ready for a first set of eyes. The architecture of this change is more important than the specifics.

The best place to start is internal/endtoend/parsers/parser_test.go. Each database engine has a new package that exposes a Parser struct implementing this interface:

type Parser interface {
    Parse(io.Reader) ([]ast.Statement, error)
}

An ast.Statement is the entry point into the new internal/sql/ast package. This ast package contains a generic set of SQL AST nodes. Each database parser is responsible for turning a query string into a slice of statements.

The new internal/sql/catalog package takes those statements and builds a catalog. It's very similar to the existing internal/catalog / internal/pg packages, but not specific to a single engine.

The internal/sql/info package is just a proof of concept; feel free to ignore it.

Again, all of this is subject to change, but I'm happy with the initial progress.

cc @mightyguava @cmoog

@kyleconroy (Collaborator, Author)

I'm happy enough with this progress that I'm going to merge this in. The experimental engines, prefixed with an underscore, are not enabled in releases. Once they get to a better state, we'll enable them for testing purposes.

@kyleconroy (Collaborator, Author)

I'm also going to rebase and split this change into concrete commits before merging.

@kyleconroy kyleconroy marked this pull request as ready for review February 21, 2020 22:45
@kyleconroy changed the title from "Add SQLite support" to "Add three new experimental engines, including SQLite" on Feb 21, 2020
@mightyguava (Contributor) left a comment

Just read through the code. If I understand this correctly, you've added the start of a SQLite engine, containing a complete parser using ANTLR. For this you created an ast package that will serve as the intermediate representation that all language generators will consume. You've also added a MySQL engine backed by TiDB's parser, and a PostgreSQL engine backed by the existing postgres parser. They also output the intermediate ast, so generators will only need to be written once for all engines, albeit with some switches for engine-specific features. At the moment, though, each engine only supports very basic create/delete table and select ops.

The next steps are probably to fully port the PostgreSQL implementation to this format, and then extend the MySQL and SQLite ones? This seems pretty great. I think it's a good evolution of sqlc given the addition of the new engines, and it will make things a lot easier when writing the generators.

I'm a bit concerned about the maintainability of the engines though. It seems like it'll be quite a lot of effort to adapt each parser's output to a single fully featured IR. For example, traversing tidb's AST looks like it works really differently from traversing the sqlite antlr AST. Is this really scalable, especially if the end goal is to support all features of all the databases?

I wonder if there's a potential alternative (potentially insane) of using a single grammar that's a superset of all languages. Maybe to start with, use ANTLR to generate a parser for SQL-92, which has published BNFs. From there, trim out the parts that no supported engine implements, and extend it to add features specific to postgres and mysql. Converting the ANTLR-parsed AST for any of the engines to the IR would use the same code, but with specific code paths disabled or enabled to match engine support. This is probably a larger project up-front, but could pay off in the future in not having to maintain a bunch of parser <=> IR adapters?

)

// Provide a read-only view into the catalog
func Newo(c *catalog.Catalog) InformationSchema {
Newo?

@kyleconroy (Collaborator, Author)

Oops, as I said, that package is experimental and not used.

@mightyguava (Contributor)

@kyleconroy any thoughts on my long comment above?

@kyleconroy (Collaborator, Author)

I'm a bit concerned about the maintainability of the engines though. It seems like it'll be quite a lot of effort to adapt each parser's output to a single fully featured IR.

So far this has not been the case. I've re-implemented the catalog using the new strategy and it's worked well.

I wonder if there's a potential alternative (potentially insane) of using a single grammar that's a superset of all languages.

The issue with a single grammar is that you'd have to write a filter per engine that knows which features each engine allows. Right now this is handled by the individual parsers. For example, I have a generic CreateTypeStmt node. Since SQLite does not support CREATE TYPE, I don't need to do anything: the SQLite parser will simply fail to parse CREATE TYPE statements.

As I've done more work (#402, #401, #400, #399, #398, #397, #391) I'm confident in the current approach.

@mightyguava (Contributor)

Great to hear! I’d like to port the Kotlin code over for basic MySQL support as I have no use for it with PostgreSQL. Are the new parsers in a state for me to code against?
