Support for Postgres enums #580

dstu · 2017-01-15T00:54:39Z

This PR provides the infer_enums! macro, which creates Rust enums for each enum discovered in a database, and extends infer_schema! so that you can name columns .

See #343 for discussion about how this has been developed up until now.

Items to be addressed before this is ready for merge:

Correctly implement all necessary traits for newly declared types.
Clean up changes to infer_schema! and related macros. (Argument-passing to indicate where to look up non-builtin types is rather unslightly.)
Add more Rust documentation describing how the new features work.
Fix migrations and tests so that tests pass on SQLite again. (SQLite does not support custom enum types the way Postgres does.)
Get things working within custom schemas.
Implement and test arrays of custom types.

…ostgres enums. No actual encoding/decoding is done yet, but codegen compiles and looks reasonably okay. Testing is lacking, so the code is far from guaranteed from being correct at this point.

Encoding/decoding of actual enum values is not yet supported. (impls are suggested but just call unimplemented!). Enum-valued table columns is also yet not supported.

New impls of HasSqlType cannot be defined for types inside the diesel crate, so we can't do something like: impl HasSqlType<NewType> for diesel::pg::Pg { ... } This commit adds the trait ProvidesSqlType, which can be implemented for new types, and a blanket impl so that any type with a suitable impl of ProvidesSqlType gets a corresponding impl of HasSqlType. So new types can be added with an impl declaration like: impl ProvidesSqlType<diesel::pg::Pg> for NewType { ... }

diesel_codegen/Cargo.toml: add dependency on syntex_syntax in an attempt at getting access to the "self" keyword for qualifying symbol lookups. May not be the right way to go. diesel_codegen/src/lib.rs: raise macro recursion limit to enable builds with inferred enum types. Pull in syntex_syntax. diesel_codegen/src/schema_inference.rs: propagate information about types being builtin or derived from the DB schema. Some fuzziness here about how exactly to do local name lookups (WRT "self" and not the crate root). diesel_codegen/src/type_inference.rs: move some utility functions to diesel_codegen_shared/src/schema_inference/pg.rs. Update enum code generation to try to implement with all necessary interfaces. diesel_codegen_shared/src/lib.rs: add conditional dependency on itertools. diesel_codegen_shared/src/schema_inference/data_structures.rs: track whether a column type is a builtin or custom. diesel_codegen_shared/src/schema_inference/mod.rs: conditional dependency on itertools, export utility functions. diesel_codegen_shared/src/schema_inference/pg.rs: move utility functions here. Add hacky heuristic for detecting if a type is built-in. diesel_codegen_shared/src/schema_inference/sqlite.rs: mark all SQLite types as builtin. diesel_tests/tests/associations.rs: add values for new enum type columns. diesel_tests/tests/boxed_queries.rs: add values for new enum type columns. diesel_tests/tests/errors.rs: add values for new enum type columns. diesel_tests/tests/expressions/mod.rs: add values for new enum type columns. diesel_tests/tests/filter.rs: add values for new enum type columns. diesel_tests/tests/filter_operators.rs: add values for new enum type columns. diesel_tests/tests/find.rs: check values of new enum type columns. diesel_tests/tests/insert.rs: insert values for new enum type columns. diesel_tests/tests/joins.rs: check values of new enum type columns. diesel_tests/tests/macros.rs: add values for new enum type columns. diesel_tests/tests/order.rs: add values for new enum type columns. diesel_tests/tests/schema.rs: add a call to infer_enums! so that we get our new enum types. Unfortunately, it's still necessary to add an additional "pub use self::blah" because I have not yet figured out how to import the new enum types from their captor module automagically. diesel_tests/tests/select.rs: select values for new enum type columns. diesel_tests/tests/update.rs: update values for new enum type columns. It might be a good idea to try to add a test updating an enum typed column. migrations/postgresql/20161224024421_add_enum_columns/{up,down}.sql: add a column for a new enum type. Still don't yet have things working in custom Postgres schemas yet.

This feels like a bit of a hack, since it side-steps the type resolution issue and continues to rely on a hard-coded heuristic for figuring out which types are declared where, but it seems reasonable. Users of the library can declare or "pub use" anything they need in the namespace they're working with. This leaves types in schemas out in the cold, unless we're willing to assume some sort of convention of parallel structure between Rust namespaces in the extra types module and DB schemas. diesel_codegen/src/lib.rs: bring up to date with changes from master. diesel_codegen/src/schema_inference.rs: Thread extra types module through. diesel_codegen/src/type_inference.rs: bring up to date with changes from master. diesel_codegen_shared/src/schema_inference/mod.rs: Thread extra types module through. diesel_codegen_shared/src/schema_inference/pg.rs: Thread extra types module through and resolve non-builtin types in that namespace. diesel_tests/tests/lib.rs: new module for automagic enum decls. diesel_tests/tests/schema.rs: pass new macro parameter for the extra types module

…tom_types_in_database_columns

killercup · 2017-01-15T13:32:45Z

Nice! I saw your branch a while ago, and am glad to see this PR. This is quite a lot of code! I'll try to read through it this evening, or tomorrow afternoon. Feel free to bug me if I forget about it though :)

Some record of the original variant name is necessary so that SQL generation uses the right literals.

killercup

Whew, this is a lot of new code. That's amazing work!

I read through it, but probably not as in-depth as I should have (esp. the type_inference module). You seem to be working on adding even more things anyway :)

Before this reaches the magic 1000 lines mark: Do you think it'd make sense to land this in several batches? E.g., adding support for custom schemas and enums in PG array can probably be added in future PRs.

killercup · 2017-01-17T08:17:51Z

diesel_codegen/src/type_inference.rs

+
+fn infer_enums_for_schema_name(database_url: &str, schema_name: Option<&str>,
+                               type_list: Option<&str>, camel_case_types: bool,
+                               camel_case_variants: bool) -> quote::Tokens {


I assume this is rustfmt? Can you try to match the fn declaration style used in e.g. column_def_tokens?

Formatting is mostly a combination of what rust-mode does in Emacs and my attempts at respecting cosmetic issues like line length as per a hazy recollection of the recommendations aturon compiled. If you prefer, I'll run rustfmt over things before declaring this ready for a merge.

killercup · 2017-01-17T08:17:56Z

diesel_codegen/src/schema_inference.rs

+                    infer_table_from_schema!(#database_url, #table_name);
+                }
+                pub use self::#mod_ident::*;
+            },


Can you refactor this to let macro_call = if let Some(type_module) = extra_types_module { quote! { infer_table_from_schema!(extra_types_module=#ype_module, #database_url, #table_name); } } else { quote! { infer_table_from_schema!(#database_url, #table_name); } } so the mod and pub use doesn't need to be repeated?

killercup · 2017-01-17T08:17:58Z

diesel_codegen/src/schema_inference.rs

@@ -1,4 +1,5 @@
 use syn;
+// use syntex_syntax::symbol::keywords;


This is the only place where you use syntex right? Please please please remove this dependency if it's not going to be used, it takes ages to compile. (And if we need something that syn doesn't have in this regard, I'd gladly make a PR to syn!)

Yes, I was trying to use syntex to work around some issues but don't need it right now. I'll kill the dependency if it's not needed (which it probably won't be) before any merge.

killercup · 2017-01-17T09:06:59Z

diesel_codegen_shared/src/schema_inference/pg.rs

@@ -74,19 +88,51 @@ mod information_schema {
    }
 }

-pub fn determine_column_type(attr: &ColumnInformation) -> Result<ColumnType, Box<Error>> {
+pub fn camel_cased(snake_case: &str) -> String {


Yes! Finally! PascalCase! Errrr, I mean, "upper camel case". (On a more serious note, I think this fn should be called snake_to_type_case.)

Please note that all other case conversions we do are trivial conversions by choice, to not introduce surprisingly complex behavior.

The snake case version we get here are valid Postgres identifiers, right? What properties do they have in comparison to Rusts type identifiers?

OK on changing the method name.

Right now, this PR converts a Postgres type name or variant foo_bar into FooBar on the Rust side. There are bool arguments to the macro expansion methods to control whether this happens, but they aren't currently used because there is no macro-level interface for toggling them.

I think it makes sense to map to Rust-style naming, but I am willing to be convinced otherwise. If any name manging could happen, clearly documenting default behavior and providing an option to turn it off are also necessary.

killercup · 2017-01-17T09:09:55Z

diesel_codegen_shared/src/schema_inference/pg.rs

+    #[test]
+    fn camel_cased_i18n() {
+        assert_eq!("Außerdem", camel_cased("außerdem"));
+        assert_eq!("AuSSerdem", camel_cased("au_ßerdem"));


N̵͡i̛c͘e ̀e͜d̨͘͢g͜͡e͠͞ ̴̛c̶̨͞a͢͜s͞ȩ̵s.͟͠

Can enum variant names contain umlauts, though?

In Rust, it's an open issue.

In Postgres, it appears perfectly valid to have both types and variants contain umlauts. I'll have to dig deeper into the documentation to see if there are any surprises with regards to other unusual characters.

It seems like eventual convergence (arbitrary whitespace-free Unicode allowed) is an eventual goal, but it would probably be better to check explicitly that we're generating a legal type name and die with a reasonable explanation rather than letting rustc barf.

killercup · 2017-01-17T09:26:08Z

diesel/src/macros/macros_from_codegen.rs

+/// schema name.
+///
+/// This macro can only be used in combination with the `diesel_codegen` or
+/// `diesel_codegen_syntex` crates. It will not work on its own.


Just a heads up: diesel_codegen_syntex will be removed in 0.10

killercup · 2017-01-17T09:26:33Z

diesel_codegen/src/lib.rs

@@ -1,4 +1,5 @@
 #![deny(warnings)]
+#![recursion_limit = "512"]


What is this needed for? (It's fine, just maybe add a comment what requires this, so we can remove it when it's no longer needed.)

The quote! macro makes use of (potentially deep) recursion, and my large invocations of it are running into the default limit. This number should be tuned to fit whatever this PR's eventual needs are; it may be that we wind up needing something smaller at first.

killercup · 2017-01-17T09:26:57Z

diesel/src/types/mod.rs

@@ -249,10 +249,20 @@ pub type VarChar = Text;
 #[cfg(feature = "postgres")]
 pub use pg::types::sql_types::*;

+pub trait ProvidesSqlTypeFor<DB> where DB: TypeMetadata {


I would love to see a doc comment here telling me what this type is all about. I assume it abstracts over structs and enums providing SQL types? Maybe also mention HasSqlType is implemented for everything that implements this trait.

killercup · 2017-01-17T09:27:51Z

diesel_codegen/Cargo.toml

 quote = "0.3.10"
 diesel = { version = "0.9.0", default-features = false }
 diesel_codegen_shared = { version = "0.9.0", default-features = false }

+[dependencies.syn]
+version = "0.10.3"
+features = ["full"]


Can you use the same inline table syntax as above?

syn = { version = "0.10.3", features = ["full"] }

killercup · 2017-01-17T09:28:20Z

diesel_codegen/Cargo.toml

@@ -10,11 +10,15 @@ repository = "https://github.com/diesel-rs/diesel/tree/master/diesel_codegen"
 keywords = ["orm", "database", "postgres", "sql", "codegen"]

 [dependencies]
-syn = "0.10.3"
+syntex_syntax = "0.52.0"


I hope we can not introduce this dependency. Syntex takes ages to compile.

Agreed. I don't think this will be needed.

dstu · 2017-01-18T03:47:24Z

This can be broken up fairly easily; it is quite feasible to implement support simply for generating enum types to match what's in a database.

I'll make changes to the PR that address the comments so far when I have more time this week.

dstu · 2017-01-18T04:16:51Z

diesel/src/macros/macros_from_codegen.rs

@@ -12,6 +12,22 @@
 /// This macro can only be used in combination with the `diesel_codegen` or
 /// `diesel_codegen_syntex` crates. It will not work on its own.
 macro_rules! infer_schema {
+    (extra_types_module = $extra_types_module: expr, $database_url: expr) => {


I really don't like the solution of the user manually specifying extra_types_module.

From a usability standpoint, it would be nicer to be able to infer_everything! and just get tables, types, and (eventually) views from the database.

From a design standpoint, this is just kicking the problem down the field a short distance. As-is, all peers to SQL types live in diesel::types, and library internals freely assume this. Looking up types that are not on a whitelist on a single user-specified alternate path works for now, but a few problems are readily evident:

All DB-derived types are assumed to live in one namespace. In Postgres, you have the potential for multiple namespaces (which naturally arise from each distinct database connection string, or even from schemas nested inside of a database). This could lead to name clashes if, for example, identically named types are declared in different databases or schemas. (It seems like a good solution is to recommend strongly that different invocations of infer_schema! use different extra_types_module settings, but I'd prefer a cleaner design where remembering to do that sort of thing isn't necessary.)

The binding between a DB type and a Rust type is implicit (based on type name alone). This loose coupling could lead to bugs whenever there are name clashes.

It is quite conceivable that a user will want to use types that come from disparate Rust namespaces, at which point they will have to do a lot of tedious and potentially confusing re-exporting with pub use (so that everything is in one namespace), or we will have to extend this mechanism to support having multiple extra type paths.

This could be solved by creating an intermediate representation of types in a database schema and mapping DB types to Rust types through that. This would require threading some sort of global table state through most of the code generation process, so the effects of adopting this design would be felt throughout a lot of Diesel. (Perhaps simply having a global shared structure would be adequate, since it will only be used at compilation time.)

Having a central, explicit set of data structures that describe types, tables, views, procedures, etc., could be quite useful. I suspect that some sort of compile-time metadata model would be very helpful for validating what is read from the database ("table has a primary key structure we don't support"), generating good error messages ("typename foobaz is not a valid Rust identifier"), debugging/testing (write tests against the intermediate representation code), streamlining code generation (all necessary state is in one obvious place), and error reporting (have richer local state from which to produce meaningful error messages).

This approach could go very wrong by being too top-heavy and prematurely putting straightjacketing Diesel. The project has gotten really far without building a top-down intermediate model of database components, and maybe it's best to keep it that way for now. It would be unpleasant to have to refactor a big metadata layer every time you want to support a new DB feature.

If it seems a good idea, I suggest working towards merging just the enum declaration part of this PR and opening a separate issue to discuss whether a metadata model is necessary or a good idea. That discussion may yield solutions to the problems I've described here which can be used to add DB type support to infer_schema! without having to commit to a great deal more overhead.

gsollazzo · 2017-02-16T22:30:07Z

Hi!
When can we expect to see this PR merged in master?
Thanks

dstu · 2017-02-17T05:32:01Z

I apologize for not having been on top of this. I may be able to put some more time into it this weekend. I still need to rebase on top of the latest Diesel release and restrict this PR's scope to consist only of generating types for Postgres enums.

dstu · 2017-03-13T00:49:25Z

After looking at how far diesel has progressed since I wrote this PR, I have concluded that it must be rewritten. The basic approach (query Postgres catalog tables to get enum definitions) is still perfectly valid, but diesel's switch to using SQL schema standard means that the way this is done has to change. I am closing this PR and will move discussion back to #343.

dstu added 8 commits January 14, 2017 04:00

Hammered together boilerplate for generating simple Rust enums from P…

463a1f5

…ostgres enums. No actual encoding/decoding is done yet, but codegen compiles and looks reasonably okay. Testing is lacking, so the code is far from guaranteed from being correct at this point.

Rust types from Postgres enum types working and tested. No schemas yet.

fbf3519

Encoding/decoding of actual enum values is not yet supported. (impls are suggested but just call unimplemented!). Enum-valued table columns is also yet not supported.

Outstanding changes to look up derived types in a special namespace.

5da7130

Missing file that provides namespace for derived types.

f12b230

Merge branch 'master' of https://github.com/diesel-rs/diesel into cus…

6fd40d0

…tom_types_in_database_columns

dstu changed the title ~~Create enums~~ Support for Postgres enums Jan 15, 2017

This was referenced Jan 15, 2017

Support enumerated types #343

Closed

Implementing diesel::Expression, etc., for custom types outside diesel? #562

Closed

Add method to generated enums that preserves original variant names.

e441832

Some record of the original variant name is necessary so that SQL generation uses the right literals.

killercup reviewed Jan 17, 2017

View reviewed changes

dstu commented Jan 18, 2017

View reviewed changes

dstu closed this Mar 13, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support for Postgres enums #580

Support for Postgres enums #580

dstu commented Jan 15, 2017 •

edited

Loading

killercup commented Jan 15, 2017

killercup left a comment

killercup Jan 17, 2017

dstu Jan 18, 2017

killercup Jan 17, 2017

killercup Jan 17, 2017

dstu Jan 18, 2017 •

edited

Loading

killercup Jan 17, 2017

dstu Jan 18, 2017

killercup Jan 17, 2017

dstu Jan 18, 2017

killercup Jan 17, 2017

killercup Jan 17, 2017

dstu Jan 18, 2017

killercup Jan 17, 2017

killercup Jan 17, 2017

killercup Jan 17, 2017

dstu Jan 18, 2017

dstu commented Jan 18, 2017

dstu Jan 18, 2017

gsollazzo commented Feb 16, 2017

dstu commented Feb 17, 2017

dstu commented Mar 13, 2017

		@@ -1,4 +1,5 @@
		use syn;
		// use syntex_syntax::symbol::keywords;

		@@ -1,4 +1,5 @@
		#![deny(warnings)]
		#![recursion_limit = "512"]

Support for Postgres enums #580

Support for Postgres enums #580

Conversation

dstu commented Jan 15, 2017 • edited Loading

killercup commented Jan 15, 2017

killercup left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dstu Jan 18, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dstu commented Jan 18, 2017

Choose a reason for hiding this comment

gsollazzo commented Feb 16, 2017

dstu commented Feb 17, 2017

dstu commented Mar 13, 2017

dstu commented Jan 15, 2017 •

edited

Loading

dstu Jan 18, 2017 •

edited

Loading