-
-
Notifications
You must be signed in to change notification settings - Fork 254
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support CHECK constraints for autogenerate #508
Comments
Michael Bayer (@zzzeek) wrote: note this is a documented limitation: |
Changes by Michael Bayer (@zzzeek):
|
Changes by Michael Bayer (@zzzeek):
|
Changes by Michael Bayer (@zzzeek):
|
Changes by Michael Bayer (@zzzeek):
|
Alex Rothberg wrote: I started taking a stab at this. Does it looks like I am on the right track? Is there some missing feature in sqla, etc that will lead to a dead end here?
|
Michael Bayer (@zzzeek) wrote: that's a fragment of a beginning, sure, it would need to be in the context of the source code in compare.py to really be reviewable. With this feature, the much bigger job is writing out the tests, which is because the constraint autogen use case has an enormous number of edge cases that are also showstoppers if they are not working correctly, because within autogen, a failed comparison means false positives. A lot of applications use Alembic's comparison logic to generate schema diffs which need to come up clean if nothing has changed, so any area where a backend either doesn't provide CHECK constraint reflection, or the content of the constraint doesn't exactly match what is in the Python model, show-stopping bug. Additionally, SQLAlchemy has datatypes Enum and Boolean which automatically generate CHECK constraints and I'm not sure of how we will will handle accommodating for this, detecting which CHECK constraints need to be explicit and which do not. When the compare indexes and uniques feature was introduced, I spent literally years fixing bugs of this nature, which is why you will see a very large number of tests in https://github.com/zzzeek/alembic/blob/master/tests/test_autogen_indexes.py. I just did a grep, found 31 distinct issues involving unique and index autogenerate in changelog.rst, see https://gist.github.com/zzzeek/530634270e5631b973fa1d70885aba5a. So for this feature, writing a new test suite test_autogen_check.py is a big part of getting this to work. If the "CHECK" constraint autogen feature is added as an optional feature that's enabled in the environment, that would make it easier to release without as much risk of regressions. |
Alex Rothberg wrote: Fair. In the meantime I might start working on this as a plugin with the plan to get it mature enough that it can be submitted for addition to alembic proper. I was wondering if there is a work around to:
and what is the best way to get the "real" name for the check constraint? (ie |
Michael Bayer (@zzzeek) wrote: The rendering of a CHECK constraint object is separate from the detection which is the hard part, you can submit a rendering PR right now. additionally, the renderers.dispatch_for should be able to replace the previous entry, if it doesn't do that right now I can take a PR for that too. |
Michael Bayer (@zzzeek) wrote: sure we can reverse that |
There would be one way to detect the changed check constraint, that would be quite easy to implement and would work in the most environments: |
I think you have a good idea there, however that's the beginning of a django-like system that is going to serialize the entire model that's being built so that it doesnt even have to use database reflection for autogenerate. I'd rather someone implement the real thing rather than a tiny piece of it for some specific use case. |
I totally see that point. I looked a bit further into this, for postgres it should be easy to implement in the 'alembic' way: select pgc.conname as constraint_name,
ccu.table_schema as table_schema,
ccu.table_name,
ccu.column_name,
pg_get_constraintdef(pgc.oid)
from pg_constraint pgc
join pg_namespace nsp on nsp.oid = pgc.connamespace
join pg_class cls on pgc.conrelid = cls.oid
left join information_schema.constraint_column_usage ccu
on pgc.conname = ccu.constraint_name
and nsp.nspname = ccu.constraint_schema
where pgc.contype ='c'
order by pgc.conname; gives the check constraint definitions associated with tables and columns. I don't know about the internals of alembic. Could you tell me where to dig into it, if I wanted to implement it (for postgres)? |
And for the comparison there is also an easy approach if you use an external library (pglast): from pglast import parse_sql, _remove_stmt_len_and_location
statement_1 = "check article_number is null or company_id is not null" # what I wrote in SQLAlchemy
statement_2 = "CHECK (((article_number IS NULL) OR (company_id IS NOT NULL)))" # what I got from the query above
ast1 = parse_sql("SELECT " + statement_1[6:])
ast2 = parse_sql("SELECT " + statement_2[6:])
_remove_stmt_len_and_location(ast1)
_remove_stmt_len_and_location(ast2)
assert ast1 == ast2 |
sqlalchemy does reflect the constraint definition already, it's just that it doesn't necessarily match letter for letter, so if the pglast thing is useful that may be an option. seems like it is only for postgresql ? |
Yes, unfortunately it only targets postgres.
|
oh @lelit wrote this. friend of the show. We can try sqlparse here as well, and pglast optional plugin? |
I looked into sqlparse for some minutes and it does not build a proper AST, so I'm sceptical that it would work. @lelit started pglast because of the problems with sqlparse as far as I can see from the README, so I think a postgres dedicated solution (probably as a plugin) that works for other dialects in most cases (as long as the queries are easily accessible) would be the way I would personally take. |
well if it works it's a start and maybe @lelit wants to expand the scope of it :) |
I just quickly read the above, but sure, I'm willing to expand it's scope, if I can... obviously once I have a better idea of what you mean with that 😄 |
Hi @lelit, what I have seen so far from pglast looks great, thanks for that! @zzzeek is talking about extending the scope from postgres to other dialects. I guess that's quite a big task, but it would solve the check constraint issue completely if it works. Do you see any problems ahead (for postgres only)? The task is to compare SQL statements to see if they are identical. The few examples I tried worked flawlessly. |
I spent considerable time trying to use sqlparse for my need, but although it does quite a good job considering it handles a wide spectrum of SQL syntaxes, I didn't find it "precise" enough, and as said by Adrian the internal model its "parser" builds is not malleable enough.
No, that's definitely not doable with (current) pglast: one would need to implement the equivalent of libpgquery for other dialects, and once there, implement a new set of "printers" (in pglast parlance, a function that takes an AST and serialize it into a corresponding SQL statement). While I can easily do the latter, I'm afraid that "wrapping" other dialects parser into something usable is beyond my current cycles availability, sorry.
No, I would say that's totally doable: pglast already supports a lot of PG syntax, and expanding it to cover what is currently missing is quite easy. |
alembic_utils handles the "comparing sql stored in strings" problem by rendering the entity in a dummy schema, and comparing the reflected definition from the dummy schema to the reflected definition in the target schema. It avoids having to parse every dialect of sql and would work on any database that supports constraint inspection. It is a kinda slow for larger projects but I'm not aware of a use case where autogenerating migrations is done in a tight loop :) |
Ahh, that's smart 👌🏼 |
that is an interesting idea. |
we sort of do a bit of that for datatypes now, rather than comparing each element of a reflected datatype with what the model has we compare the rendered strings. but that doesn't include adding an extra round trip to the DB. the "dummy schema" part makes me a little nervous. if you are running on your company's giant DB2 staging system it's likely not that simple to create a new schema on the fly and then drop it. |
I'm not familiar with DB2, but if the dummy schema is the only part that makes you squeamish, a rolled back transaction would work too. The example below tests if the local definition of a function matches the database's definition. ###
# SETUP
###
def get_sql_function_def(sess, schema, name):
"""Look up the SQL definition of a function from a database"""
return sess.execute("""
select
pg_get_functiondef(p.oid)
from
pg_proc p
join pg_namespace n
on p.pronamespace = n.oid
where
n.nspname = 'public'
and p.proname = 'to_upper'
limit 1
""").fetchone()[0]
# Create an "existing" function
sess.execute("""
create or replace function to_upper(text)
returns text
as
$$ select upper($1); $$
language sql;
""")
sess.commit()
###
# CHECK IF UPDATE IS NECESSARY
###
sess.begin(subtransactions=True)
try:
# Definition of the entity in the database
existing_definition = get_sql_function_def(sess, 'public', 'to_upper')
# Render render the local definition and execute in a transaction
# Note that the function's body was changed
sess.execute("""
create or replace function to_upper(text)
returns text
as
$$ select upper($1) || 'abc' ; $$
language sql;
""")
# Get the new definition
local_definition = get_sql_function_def(sess, 'public', 'to_upper')
# Check if the existing definition and local definition match
requires_update: bool = existing_definition != local_definition
finally:
sess.rollback()
print(requires_update) # True It does require that the target database supports a |
only if there's transactional DDL available which is also not a given. Oracle for example does not support it. |
wrt to your comment
Did you mean that a company wouldn't be happy that it was happening, the end user likely wouldn't have the permissions required, or that there's something specific to DB2 that makes short lived schemas particularly dangerous? |
I mean you wouldnt have privileges to CREATE SCHEMA. that's a less-often granted privilege. |
Any news? That would be an awesome feature. |
alembic is reliant upon external contributors to keep features like these going. we don't have many resources to work on alembic outside of regressions. |
the thing with "dummy schema"? no, i think that's good for alembic-utils but over here that would dramatically change the workflow we put onto users and I'm not in a position to support that. also users already complain about autogenerate needing a database at all (#792) the long term "migrations work great" idea is to emulate more of what django does with the kind of thing sketched out in #117 which does not need any database to do autogenerate, it compares the model against itself, essentially. |
Any news? That would be an awesome feature 💪 |
From my perspective it is better to detect added/removed check constraints than detect nothing. I tried to do it on my own and would like to know how to improve this PR so it can be merged. There no tests yet as I've just started working on it and want to clarify if the overall direction is acceptable. https://github.com/Tishka17/alembic/pull/1/files Currently I see, that there is no |
Hi, you should probably do a pull request to this repository. Just had a quick glance there, and I don't think you can assume that check constraint have a name, especially the ones that are defined in the metadata |
I see 3 cases for the names:
In cases 1,2 constraint has the name in migrations CreateTable, but in the last one database generates the name on its own. If we are checking only names, but not SQL expression, we cannot work in case 3. Should we log a warning and skip all the comparison? |
in case 2 if I'm not mistaken the name of the constraint object in the metadata remains Other than that, I think those are the cases to handle. For case 3 there is also the option that also in db the constraint has no name (IIRC sqlite can have unnamed constraints). |
Is anyone working on it? It will be a nice feature. |
Why not implement at least a half-solution: if it's a new table ( |
Migrated issue, originally created by Alex Rothberg
If I start with the following and then uncomment the CheckConstraint, alembic does not handle the migration adding it or even print a warning. If I just have alembic add the entire class the migration to add the class DOES have the CheckConstraint.
The text was updated successfully, but these errors were encountered: