-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
psql/migrations: when no-transaction is set, split multi-statement sql into individual statements and execute them separately #3574
base: main
Are you sure you want to change the base?
Conversation
I don't really see the value in carrying all this extra complexity for what is a pretty niche use-case. As far as I can tell, Diesel's migrations don't work this way, nor does Flyway--the two migrators I'm familiar with--and both allow you to configure migrations to run outside of a transaction. |
Flyway actually does split migrations into individual statements, and even goes a step further and automatically detects if certain statements need to be run outside a transaction. Another example is goose (Go, 7k stars) that also splits the sql into statements and even allows providing additional annotations per statement. On the ruby/rails side, ActiveRecord supports setting Most SQL tools (DBeaver, psql, etc.) will also split multi-statment sqls and run them sequentially. I do agree about the added complexity, but I'm not sure if I would necessarily agree that it is a niche case when dealing with larger projects that use postgres. |
I dunno, I think it's a huge footgun because you can easily be left stuck with a partially applied migration. It won't even catch syntax errors until it goes to execute that statement. And if there's any bugs in the parser it could lead to some really confusing errors. I think keeping our migrations simple is an advantage. As a compromise, I could see adding a simple comment syntax like: -- no-transaction
ALTER TABLE foo ADD COLUMN bar_id uuid;
-- Execute the following chunk in its own `Query` message:
-- statement-break
CREATE INDEX CONCURRENTLY ON foo(bar_id, foo_id);
-- statement-break
ALTER TABLE foo ADD FOREIGN KEY bar_id REFERENCES bar(bar_id);
Then instead of having to pseudo-parse SQL, we just check if a line begins with a certain string, similar to how we detect It would also give you direct control over the splitting, and group things together which can execute in an autocommit block for a bit more safety. |
Re: footguns Re: parser Re: comment annotation idea Please let me know if you think any of the two approaches would be acceptable to you and if yes, which one you'd prefer and I'll update the PR! |
This is going to be a controversial one.
Recently, support for adding
-- no-transaction
at the beginning of migration files was introduced, which will execute the migration without creating a transaction. This is useful for certain commands, such asCREATE INDEX CONCURRENTLY
, which are not allowed to run inside transactions.However, postgres will implicitly create a transaction block when a query contains multiple statements. This means that the
-- no-transaction
flag has inconsistent behavior depending on whether the migration contains just one statement or multiple ones, and that commands that are not allowed to run inside transactions will fail if part of a larger migration.I propose to split the sql into individual statements when
no_tx
is set and to execute them separately, resulting in consistent behavior in line with how many sql tools handle this case. This was implemented using a custompest
grammar, which is very low overhead and performant (but does add a dependency).One could argue that the current behavior is good (albeit lacking a clear error message) and that users should be encouraged to create small migrations with only one statement at a time when opting out of the safety of transactions. I would counter that a user that opts out of transactions usually has a reason for that and is aware that extra care has to be taken to ensure that migration is sound and tested, that an error during a
no_tx
migration will always require manual cleanup anyway, and that there are production cases where forcing the user to split up a logical migration into potentially many individual files can add complexity and make it harder to manage and audit (we recently had such case).