Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alembic generates unneeded migrations for expression-based indexes #1321

Closed
vasa-chi opened this issue Oct 12, 2023 · 11 comments
Closed

Alembic generates unneeded migrations for expression-based indexes #1321

vasa-chi opened this issue Oct 12, 2023 · 11 comments

Comments

@vasa-chi
Copy link

Describe the bug
I have a table that has an index like this:

Index(
    "event_requisition_action_idx2", arguments["requisition_id"].as_integer()
),

In database it is represented as follows:

"event_requisition_create_idx2" btree (((response ->> 'id'::text)::integer)) 

After migrating to SQLAlchemy 2.0 Alembic started to generate migrations that drop the index, and create another, that is the same, but uses another syntax:

    op.drop_index('event_requisition_create_idx2', table_name='event')
    op.create_index('event_requisition_create_idx2', 'event', [sa.text("CAST(response ->> 'id' AS INTEGER)")], unique=False)

After upgrade, the index is the same as before:

"event_requisition_create_idx2" btree (((response ->> 'id'::text)::integer)) 

Expected behavior
No migrations are created for the index.

To Reproduce
Please try to provide a Minimal, Complete, and Verifiable example, with the migration script and/or the SQLAlchemy tables or models involved.
See also Reporting Bugs on the website.

from sqlalchemy import Column, Index
from sqlalchemy.orm import declarative_base
from sqlalchemy.dialects.postgresql JSONB

Base = declarative_base()

class Event(Base):
    __tablename__ = 'event'

    id = Column(Integer, primary_key=True)
    arguments = Column(JSONB)

    __table__args__ = (
        Index("event_arguments_idx", arguments["numeric"].as_integer()),
    )

Versions.

  • OS: macOS 14.0
  • Python: 3.7.13
  • Alembic: 1.12.0
  • SQLAlchemy: 2.0.21
  • Database: PostgreSQL 15.3 on aarch64-apple-darwin21.6.0, compiled by Apple clang version 14.0.0 (clang-1400.0.29.102), 64-bit
  • DBAPI: psycopg2==2.9.9

Additional context

Currently, I solved this by using a text() notation for the index expression:

        Index(
            "event_requisition_action_idx2",
            # arguments["requisition_id"].as_integer(),
            text("((arguments ->> 'requisition_id'::text)::integer)"),
        ),

This doesn't cause any new migrations.

Have a nice day!

@vasa-chi vasa-chi added the requires triage New issue that requires categorization label Oct 12, 2023
@zzzeek
Copy link
Member

zzzeek commented Oct 12, 2023

These issues might be impossible to fix without dropping the support for expression-based indexes, since we can never match 100% of PG expressions to what a particular expression sends. your text() work around is the correct approach.

@CaselIT can you suggest other ways to handle this? It seems like we are never going to be able to match these index expressions in all cases.

@zzzeek zzzeek added bug Something isn't working autogenerate - detection postgresql and removed requires triage New issue that requires categorization labels Oct 12, 2023
@CaselIT
Copy link
Member

CaselIT commented Oct 12, 2023

I would need to look into what pg returns here.
considering also #1322 we may want to ignore indexes that have json stuff in them by detecting -> or ->> and bailing on them

I'll try taking a look once I have a solution for the pc :/

@JankesJanco
Copy link

I ran into a similar issue with this index definition (I think it is not an expression-based index):

Index("ix__meme_asset__latest_sale_date__nulls_first", latest_sale_date.asc().nullsfirst())

The index was already created in the database, however after migrating to sqlalchemy2, alembic started to generate drop and create commands for this index. The generated commands are below. Note the difference in the sa.text() - in upgrade() there is "latest_sale_date ASC NULLS FIRST" and in downgrade() there is just "latest_sale_date NULLS FIRST" without "ASC"

def upgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    op.drop_index('ix__meme_asset__latest_sale_date__nulls_first', table_name='meme_asset')
    op.create_index('ix__meme_asset__latest_sale_date__nulls_first', 'meme_asset', [sa.text('latest_sale_date ASC NULLS FIRST')], unique=False)
    # ### end Alembic commands ###


def downgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    op.drop_index('ix__meme_asset__latest_sale_date__nulls_first', table_name='meme_asset')
    op.create_index('ix__meme_asset__latest_sale_date__nulls_first', 'meme_asset', [sa.text('latest_sale_date NULLS FIRST')], unique=False)
    # ### end Alembic commands ###

When I change the index definition in a way that the index is still the same (see below), the drop and create index commands are not being generated anymore (the difference from the first definition is that here we are not using asc() which is default)

Index("ix__meme_asset__latest_sale_date__nulls_first", latest_sale_date.nullsfirst())

@CaselIT
Copy link
Member

CaselIT commented Oct 13, 2023

thanks for reporting. I guess ASC can be removed from the comparison, but in general similar things are likely to happen.

@zzzeek do you think it warrants an option in the config to list a collection of index names for which to ignore the expression comparison? This feature has been out for quite a bit and we haven't had much complains on it, so I don't disabling it entirely would be a good option for the majority of users

@zzzeek
Copy link
Member

zzzeek commented Oct 13, 2023

we arleady have a feature to ignore a specific list of index names, the include_object hook.

@CaselIT
Copy link
Member

CaselIT commented Oct 13, 2023

yes, but that will ignore the index completely, what I was thinking about was just a way of telling alembic "ignore expression difference in this index, but otherwise write migration if it gets removed or the columns change" type of thing

@zzzeek
Copy link
Member

zzzeek commented Oct 13, 2023

seems very arbitrary and specific. what if they have a server default that's not comparing correctly, showing false positives?

@zzzeek
Copy link
Member

zzzeek commented Oct 13, 2023

also, if I go through the effort to name a specific index to ignore, I would think I'd just have that index ignored and that would be that, I wouldn't want to get into "only compare the columns" - the use case here is that of locating elements that have changed in the codebase as a convenience feature.

@zzzeek
Copy link
Member

zzzeek commented Oct 13, 2023

unfortunately what we likely need here is a full fledged tokenizer for PG style constraints, that tokenizes out casts, parenthesis, the whole thing, and compares intent

@CaselIT
Copy link
Member

CaselIT commented Oct 13, 2023

let's try for now to solve this by improving the case of json indexes, ignoring asc and improving logging.

if we keep getting other complains then we can figure out the next steps

@sqla-tester
Copy link
Collaborator

Federico Caselli has proposed a fix for this issue in the main branch:

More PostgreSQL expression index compare fixes https://gerrit.sqlalchemy.org/c/sqlalchemy/alembic/+/4912

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants