support CHECK constraints for autogenerate #508

sqlalchemy-bot · 2018-09-25T18:46:17Z

Migrated issue, originally created by Alex Rothberg

If I start with the following and then uncomment the CheckConstraint, alembic does not handle the migration adding it or even print a warning. If I just have alembic add the entire class the migration to add the class DOES have the CheckConstraint.

class TitleDepartmentYear(db.Model):
    id = db.Column(UUID, default=uuid.uuid4, primary_key=True)
    name = db.Column(db.String(), nullable=False)

    __table_args__ = (
#        db.CheckConstraint(
#           sqa.func.char_length(name) > 0,
#          name="title_department_year_name_min_length",
#      ),
    )

The text was updated successfully, but these errors were encountered:

sqlalchemy-bot · 2018-09-25T20:04:37Z

Michael Bayer (@zzzeek) wrote:

note this is a documented limitation:

http://alembic.zzzcomputing.com/en/latest/autogenerate.html#what-does-autogenerate-detect-and-what-does-it-not-detect

sqlalchemy-bot · 2018-09-25T20:04:37Z

Changes by Michael Bayer (@zzzeek):

added labels: low priority

sqlalchemy-bot · 2018-09-25T20:04:37Z

Changes by Michael Bayer (@zzzeek):

removed labels: bug
added labels: feature

sqlalchemy-bot · 2018-09-25T20:04:37Z

Changes by Michael Bayer (@zzzeek):

added labels: autogenerate - detection

sqlalchemy-bot · 2018-09-25T20:04:37Z

Changes by Michael Bayer (@zzzeek):

changed title from "CheckConstraint not detected" to "support CHECK constraints for autogenerate"

sqlalchemy-bot · 2018-10-10T05:49:10Z

Alex Rothberg wrote:

I started taking a stab at this. Does it looks like I am on the right track? Is there some missing feature in sqla, etc that will lead to a dead end here?

@comparators.dispatch_for("table")
def compare_table_level(autogen_context, modify_ops,
    schemaname, tablename, conn_table, metadata_table):

    metadata_check_constraints = set(
        ck for ck in metadata_table.constraints
        if isinstance(ck, sa_schema.CheckConstraint)
    )
    conn_check_constraints = set(
        ck for ck in conn_table.constraints
        if isinstance(ck, sa_schema.CheckConstraint)
    )

    for metadata_check_constraint in metadata_check_constraints:
        # barfs on something about quoting:
        metadata_check_constraint_name = sqla_compat._get_index_final_name(autogen_context.dialect, metadata_check_constraint)
        # i'm lazy; clearly this is not efficient:
        if metadata_check_constraint_name not in (conn_check_constraint.name for conn_check_constraint in conn_check_constraints):
            modify_ops.ops.append(ops.CreateCheckConstraintOp.from_constraint(metadata_check_constraint))

# needed to do this since I am writing this as a plugin and there is already a method registered:
del renderers._registry[(ops.CreateCheckConstraintOp, 'default')]

@renderers.dispatch_for(ops.CreateCheckConstraintOp)
def _add_check_constraint(autogen_context, constraint):
    # not quite right as I need to sub in query params:
    args = ['"%s"' % constraint.constraint_name, '"%s"' % constraint.table_name, '"%s"' % constraint.condition]
    return "%(prefix)screate_check_constraint(%(args)s)" % {
        'prefix': _alembic_autogenerate_prefix(autogen_context),
        'args': ", ".join(args)
    }

sqlalchemy-bot · 2018-10-10T15:28:04Z

Michael Bayer (@zzzeek) wrote:

that's a fragment of a beginning, sure, it would need to be in the context of the source code in compare.py to really be reviewable.

With this feature, the much bigger job is writing out the tests, which is because the constraint autogen use case has an enormous number of edge cases that are also showstoppers if they are not working correctly, because within autogen, a failed comparison means false positives. A lot of applications use Alembic's comparison logic to generate schema diffs which need to come up clean if nothing has changed, so any area where a backend either doesn't provide CHECK constraint reflection, or the content of the constraint doesn't exactly match what is in the Python model, show-stopping bug. Additionally, SQLAlchemy has datatypes Enum and Boolean which automatically generate CHECK constraints and I'm not sure of how we will will handle accommodating for this, detecting which CHECK constraints need to be explicit and which do not.

When the compare indexes and uniques feature was introduced, I spent literally years fixing bugs of this nature, which is why you will see a very large number of tests in https://github.com/zzzeek/alembic/blob/master/tests/test_autogen_indexes.py. I just did a grep, found 31 distinct issues involving unique and index autogenerate in changelog.rst, see https://gist.github.com/zzzeek/530634270e5631b973fa1d70885aba5a.

So for this feature, writing a new test suite test_autogen_check.py is a big part of getting this to work. If the "CHECK" constraint autogen feature is added as an optional feature that's enabled in the environment, that would make it easier to release without as much risk of regressions.

sqlalchemy-bot · 2018-10-10T16:16:47Z

Alex Rothberg wrote:

Fair. In the meantime I might start working on this as a plugin with the plan to get it mature enough that it can be submitted for addition to alembic proper.

I was wondering if there is a work around to:

del renderers._registry[(ops.CreateCheckConstraintOp, 'default')]

and what is the best way to get the "real" name for the check constraint? (ie sqla_compat._get_index_final_name doesn't seem to work for check constraints).

sqlalchemy-bot · 2018-10-10T16:29:01Z

Michael Bayer (@zzzeek) wrote:

The rendering of a CHECK constraint object is separate from the detection which is the hard part, you can submit a rendering PR right now. additionally, the renderers.dispatch_for should be able to replace the previous entry, if it doesn't do that right now I can take a PR for that too.

sqlalchemy-bot · 2018-10-10T16:30:53Z

Alex Rothberg wrote:

https://bitbucket.org/zzzeek/alembic/src/37ef35fbe673d34b5f8d1a0697cf01d10b0abe05/alembic/util/langhelpers.py#lines-285

sqlalchemy-bot · 2018-10-10T16:49:21Z

Michael Bayer (@zzzeek) wrote:

sure we can reverse that

adrianschneider94 · 2020-05-14T18:48:27Z

There would be one way to detect the changed check constraint, that would be quite easy to implement and would work in the most environments:
As far as I can see, the problem is, that for example in postgres 'consrc' is not available anymore. But instead, you could add a further alembic table or extend schema_version to hold the source of the check constraint. Then you can just compare the source in the database to the one of the SQLAlchemy metadata and update (drop and recreate) it when it changes. I'm currently doing that in a project.
However, it would require the developers/admins not to mess with that table, so I think this should be a optional behaviour.

zzzeek · 2020-05-14T19:35:17Z

I think you have a good idea there, however that's the beginning of a django-like system that is going to serialize the entire model that's being built so that it doesnt even have to use database reflection for autogenerate. I'd rather someone implement the real thing rather than a tiny piece of it for some specific use case.

adrianschneider94 · 2020-05-14T19:52:00Z

I totally see that point.

I looked a bit further into this, for postgres it should be easy to implement in the 'alembic' way:

select pgc.conname as constraint_name,
       ccu.table_schema as table_schema,
       ccu.table_name,
       ccu.column_name,
       pg_get_constraintdef(pgc.oid)
from pg_constraint pgc
join pg_namespace nsp on nsp.oid = pgc.connamespace
join pg_class  cls on pgc.conrelid = cls.oid
left join information_schema.constraint_column_usage ccu
          on pgc.conname = ccu.constraint_name
          and nsp.nspname = ccu.constraint_schema
where pgc.contype ='c'
order by pgc.conname;

gives the check constraint definitions associated with tables and columns.

I don't know about the internals of alembic. Could you tell me where to dig into it, if I wanted to implement it (for postgres)?

adrianschneider94 · 2020-05-14T20:19:02Z

And for the comparison there is also an easy approach if you use an external library (pglast):

from pglast import parse_sql, _remove_stmt_len_and_location

statement_1 = "check article_number is null or company_id is not null" # what I wrote in SQLAlchemy
statement_2 = "CHECK (((article_number IS NULL) OR (company_id IS NOT NULL)))" # what I got from the query above

ast1 = parse_sql("SELECT " + statement_1[6:])
ast2 = parse_sql("SELECT " + statement_2[6:])

_remove_stmt_len_and_location(ast1)
_remove_stmt_len_and_location(ast2)

assert ast1 == ast2

zzzeek · 2020-05-14T20:36:24Z

sqlalchemy does reflect the constraint definition already, it's just that it doesn't necessarily match letter for letter, so if the pglast thing is useful that may be an option. seems like it is only for postgresql ?

adrianschneider94 · 2020-05-14T20:40:04Z

Yes, unfortunately it only targets postgres.
But I guess that:

one could find AST parsers for the other dialects, at least in C
pglast is compatible to most statements that would be used in a check constraint. And otherwise it would raise an exception and let the user implement it manually

zzzeek · 2020-05-14T20:48:52Z

oh @lelit wrote this. friend of the show. We can try sqlparse here as well, and pglast optional plugin?

adrianschneider94 · 2020-05-14T21:04:48Z

I looked into sqlparse for some minutes and it does not build a proper AST, so I'm sceptical that it would work. @lelit started pglast because of the problems with sqlparse as far as I can see from the README, so I think a postgres dedicated solution (probably as a plugin) that works for other dialects in most cases (as long as the queries are easily accessible) would be the way I would personally take.

zzzeek · 2020-05-14T21:22:17Z

well if it works it's a start and maybe @lelit wants to expand the scope of it :)

lelit · 2020-05-14T21:29:07Z

I just quickly read the above, but sure, I'm willing to expand it's scope, if I can... obviously once I have a better idea of what you mean with that 😄
Feel free to ask/explain/suggest anything.

adrianschneider94 · 2020-05-14T21:37:05Z

Hi @lelit,

what I have seen so far from pglast looks great, thanks for that!

@zzzeek is talking about extending the scope from postgres to other dialects. I guess that's quite a big task, but it would solve the check constraint issue completely if it works.

Do you see any problems ahead (for postgres only)? The task is to compare SQL statements to see if they are identical. The few examples I tried worked flawlessly.

lelit · 2020-05-14T21:47:05Z

I spent considerable time trying to use sqlparse for my need, but although it does quite a good job considering it handles a wide spectrum of SQL syntaxes, I didn't find it "precise" enough, and as said by Adrian the internal model its "parser" builds is not malleable enough.
OTOH, yes, pglast is very PG specific: the underlying libpgquery is basically the very same parser used by PG itself, so it has access to everything comes out from that.

@zzzeek is talking about extending the scope from postgres to other dialects. I guess that's quite a big task, but it would solve the check constraint issue completely if it works.

No, that's definitely not doable with (current) pglast: one would need to implement the equivalent of libpgquery for other dialects, and once there, implement a new set of "printers" (in pglast parlance, a function that takes an AST and serialize it into a corresponding SQL statement). While I can easily do the latter, I'm afraid that "wrapping" other dialects parser into something usable is beyond my current cycles availability, sorry.

Do you see any problems ahead (for postgres only)? The task is to compare SQL statements to see if they are identical. The few examples I tried worked flawlessly.

No, I would say that's totally doable: pglast already supports a lot of PG syntax, and expanding it to cover what is currently missing is quite easy.

olirice · 2020-05-27T17:32:33Z

alembic_utils handles the "comparing sql stored in strings" problem by rendering the entity in a dummy schema, and comparing the reflected definition from the dummy schema to the reflected definition in the target schema.

It avoids having to parse every dialect of sql and would work on any database that supports constraint inspection. It is a kinda slow for larger projects but I'm not aware of a use case where autogenerating migrations is done in a tight loop :)

adrianschneider94 · 2020-05-27T17:36:15Z

Ahh, that's smart 👌🏼

zzzeek · 2020-05-28T13:12:41Z

that is an interesting idea.

zzzeek · 2020-05-28T13:14:33Z

we sort of do a bit of that for datatypes now, rather than comparing each element of a reflected datatype with what the model has we compare the rendered strings. but that doesn't include adding an extra round trip to the DB. the "dummy schema" part makes me a little nervous. if you are running on your company's giant DB2 staging system it's likely not that simple to create a new schema on the fly and then drop it.

olirice · 2020-05-28T18:27:23Z

I'm not familiar with DB2, but if the dummy schema is the only part that makes you squeamish, a rolled back transaction would work too.

The example below tests if the local definition of a function matches the database's definition.

###
# SETUP
###

def get_sql_function_def(sess, schema, name):
    """Look up the SQL definition of a function from a database"""

    return sess.execute("""
        select
            pg_get_functiondef(p.oid)
        from
            pg_proc p
            join pg_namespace n
                on p.pronamespace = n.oid
        where
            n.nspname = 'public'
            and p.proname = 'to_upper'
        limit 1    
    """).fetchone()[0]

# Create an "existing" function
sess.execute("""
    create or replace function to_upper(text)
    returns text
    as
        $$ select upper($1); $$
    language sql;
""")
sess.commit()


###
# CHECK IF UPDATE IS NECESSARY
###

sess.begin(subtransactions=True)

try:
    # Definition of the entity in the database 
    existing_definition = get_sql_function_def(sess, 'public', 'to_upper') 

    # Render render the local definition and execute in a transaction
    # Note that the function's body was changed
    sess.execute("""
        create or replace function to_upper(text)
        returns text
        as
            $$ select upper($1) || 'abc' ; $$
        language sql;
    """)

    # Get the new definition
    local_definition = get_sql_function_def(sess, 'public', 'to_upper') 

    # Check if the existing definition and local definition match
    requires_update: bool = existing_definition != local_definition

finally:
    sess.rollback()

print(requires_update) # True

It does require that the target database supports a read your own writes transaction isolation level

zzzeek · 2020-05-28T22:33:45Z

I'm not familiar with DB2, but if the dummy schema is the only part that makes you squeamish, a rolled back transaction would work too.

only if there's transactional DDL available which is also not a given. Oracle for example does not support it.

olirice · 2020-05-28T22:49:40Z

wrt to your comment

if you are running on your company's giant DB2 staging system it's likely not that simple to create a new schema on the fly and then drop it

Did you mean that a company wouldn't be happy that it was happening, the end user likely wouldn't have the permissions required, or that there's something specific to DB2 that makes short lived schemas particularly dangerous?

zzzeek · 2020-05-28T23:02:22Z

I mean you wouldnt have privileges to CREATE SCHEMA. that's a less-often granted privilege.

GuillaumeDesforges · 2021-04-22T14:25:50Z

Any news? That would be an awesome feature.

zzzeek · 2021-04-22T14:53:17Z

alembic is reliant upon external contributors to keep features like these going. we don't have many resources to work on alembic outside of regressions.

GuillaumeDesforges · 2021-04-22T15:30:02Z

That's understandable ofc

About @olirice suggestion above, I understand that it is an implementation which makes the feature require a certain set of privileges.

Still, would alembic accept a PR implemented like that?

zzzeek · 2021-04-22T15:41:04Z

the thing with "dummy schema"? no, i think that's good for alembic-utils but over here that would dramatically change the workflow we put onto users and I'm not in a position to support that. also users already complain about autogenerate needing a database at all (#792)

the long term "migrations work great" idea is to emulate more of what django does with the kind of thing sketched out in #117 which does not need any database to do autogenerate, it compares the model against itself, essentially.

ManuelBuri · 2022-11-06T09:48:42Z

Any news? That would be an awesome feature 💪

Tishka17 · 2023-11-08T16:06:27Z

From my perspective it is better to detect added/removed check constraints than detect nothing.

I tried to do it on my own and would like to know how to improve this PR so it can be merged. There no tests yet as I've just started working on it and want to clarify if the overall direction is acceptable.

https://github.com/Tishka17/alembic/pull/1/files

Currently I see, that there is no deferrable/initially support in CreateCheckConstraintOp and Inspector.get_check_constraints. Probably they should be added so we can check them too.

CaselIT · 2023-11-08T19:01:36Z

Hi, you should probably do a pull request to this repository.

Just had a quick glance there, and I don't think you can assume that check constraint have a name, especially the ones that are defined in the metadata

Tishka17 · 2023-11-14T09:34:40Z

I see 3 cases for the names:

Name is explicitly set
Name is generated according to the naming convention
Neither convention nor name are set.

In cases 1,2 constraint has the name in migrations CreateTable, but in the last one database generates the name on its own.

If we are checking only names, but not SQL expression, we cannot work in case 3. Should we log a warning and skip all the comparison?

CaselIT · 2023-11-14T12:46:15Z

in case 2 if I'm not mistaken the name of the constraint object in the metadata remains None, so it's not obvious that name would be used by a constraint also in this case.

Other than that, I think those are the cases to handle. For case 3 there is also the option that also in db the constraint has no name (IIRC sqlite can have unnamed constraints).

m-s-abeer · 2024-01-24T22:35:27Z

Is anyone working on it? It will be a nice feature.
I was trying to add a check for string matching from a list. Alembic couldn't detect anything regarding this.

valq7711 · 2024-10-12T20:12:24Z

Why not implement at least a half-solution: if it's a new table (op.create_table) - add sa.CheckConstraint along with others.
It doesn't even need to deal with diff-stuff at constraints level.
Something is better than nothing!

sqlalchemy-bot added feature autogenerate - detection low priority labels Nov 27, 2018

redshiftzero mentioned this issue Feb 7, 2019

Add alembic tests freedomofpress/securedrop-client#241

Merged

adrianschneider94 mentioned this issue May 27, 2020

Hooks before/after function creation olirice/alembic_utils#3

Closed

CaselIT added the PRs (with tests!) welcome label Oct 12, 2024

support CHECK constraints for autogenerate #508

support CHECK constraints for autogenerate #508

Comments

sqlalchemy-bot commented Sep 25, 2018

sqlalchemy-bot commented Sep 25, 2018

sqlalchemy-bot commented Sep 25, 2018

sqlalchemy-bot commented Sep 25, 2018

sqlalchemy-bot commented Sep 25, 2018

sqlalchemy-bot commented Sep 25, 2018

sqlalchemy-bot commented Oct 10, 2018

sqlalchemy-bot commented Oct 10, 2018

sqlalchemy-bot commented Oct 10, 2018

sqlalchemy-bot commented Oct 10, 2018

sqlalchemy-bot commented Oct 10, 2018

sqlalchemy-bot commented Oct 10, 2018

adrianschneider94 commented May 14, 2020

zzzeek commented May 14, 2020

adrianschneider94 commented May 14, 2020

adrianschneider94 commented May 14, 2020

zzzeek commented May 14, 2020

adrianschneider94 commented May 14, 2020

zzzeek commented May 14, 2020

adrianschneider94 commented May 14, 2020

zzzeek commented May 14, 2020

lelit commented May 14, 2020

adrianschneider94 commented May 14, 2020

lelit commented May 14, 2020

olirice commented May 27, 2020 • edited Loading

adrianschneider94 commented May 27, 2020

zzzeek commented May 28, 2020

zzzeek commented May 28, 2020

olirice commented May 28, 2020

zzzeek commented May 28, 2020

olirice commented May 28, 2020 • edited Loading

zzzeek commented May 28, 2020

GuillaumeDesforges commented Apr 22, 2021

zzzeek commented Apr 22, 2021

GuillaumeDesforges commented Apr 22, 2021

zzzeek commented Apr 22, 2021

ManuelBuri commented Nov 6, 2022

Tishka17 commented Nov 8, 2023

CaselIT commented Nov 8, 2023

Tishka17 commented Nov 14, 2023

CaselIT commented Nov 14, 2023

m-s-abeer commented Jan 24, 2024 • edited Loading

valq7711 commented Oct 12, 2024

olirice commented May 27, 2020 •

edited

Loading

olirice commented May 28, 2020 •

edited

Loading

m-s-abeer commented Jan 24, 2024 •

edited

Loading