Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not modify column type while modifying fk constraint #644

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

milmazz
Copy link

@milmazz milmazz commented Oct 17, 2024

The following migration syntax:

alter table(:my_table) do
  modify :parent_id, references(:my_table, type: :binary_id, on_delete: :nilify_all, validate: false),
    from: {:binary_id, null: true}
end

Produces a SQL equivalent like this:

ALTER TABLE "my_table"
  ALTER COLUMN "parent_id" TYPE uuid,
  ADD CONSTRAINT "my_table_parent_id_fkey" FOREIGN KEY ("parent_id") REFERENCES "my_table"("id") ON DELETE SET NULL NOT VALID

The problem with the first directive, ALTER COLUMN, is that even if the field type is the same, at least in PostgreSQL, is creating bunch of AccessExclusiveLock (the most restrictive one) on each of the indexes associated to that field, and the table itself, when that's not needed at all, at least in this case.

For example:

test_dev=# BEGIN;
  LOCK TABLE "schema_migrations" IN SHARE UPDATE EXCLUSIVE MODE;
  ALTER TABLE "my_table" ALTER COLUMN "parent_id" TYPE uuid, ADD CONSTRAINT "my_table_parent_id_fkey" FOREIGN KEY ("parent_id") REFERENCES "my_table"("id") ON DELETE SET NULL NOT VALID;
  INSERT INTO "schema_migrations" ("version","inserted_at") VALUES ('20210718210952',NOW());
  SELECT locktype, relation::regclass, mode, transactionid AS tid, virtualtransaction AS vtid, pid, granted FROM pg_locks;
COMMIT;
BEGIN
LOCK TABLE
ALTER TABLE
INSERT 0 1
   locktype    |                     relation                        |           mode           |   tid   |  vtid   |  pid  | granted
---------------+-----------------------------------------------------+--------------------------+---------+---------+-------+---------
 relation      | schema_migrations                                   | RowExclusiveLock         |         | 4/14045 | 16813 | t
 virtualxid    |                                                     | ExclusiveLock            |         | 4/14045 | 16813 | t
 relation      | my_table_parent_id_ppppppp_id_index                 | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | pg_locks                                            | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 134438                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 134438                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | my_table_parent_id_rrrrr_index                      | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | my_table                                            | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | my_table                                            | ShareLock                |         | 4/14045 | 16813 | t
 relation      | my_table                                            | ShareRowExclusiveLock    |         | 4/14045 | 16813 | t
 relation      | my_table                                            | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | my_table_project_id_xxxx_index                      | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | 135159                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 135159                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | 132875                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 132875                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | 132507                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | 132506                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 132506                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | schema_migrations                                   | ShareUpdateExclusiveLock |         | 4/14045 | 16813 | t
 relation      | 132504                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 132504                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | 135184                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 135184                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | 132505                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 132505                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | my_table_parent_id_ttttt_index                      | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 transactionid |                                                     | ExclusiveLock            | 2175345 | 4/14045 | 16813 | t
 relation      | my_table_parent_id_aaaaaaaaaaa_id_index             | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | 132503                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 132503                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | my_table_parent_id_gggg_index                       | AccessExclusiveLock      |         | 4/14045 | 16813 | t
(35 rows)

COMMIT

So, the proposal is to avoid setting the type when you pass references to modify, if there is any type incompatibility, PostgreSQL, will let you know about it right away:

ERROR:  foreign key constraint "my_table_parent_id_fkey" cannot be implemented
DETAIL:  Key columns "parent_id" and "id" are of incompatible types: character varying and uuid.

With this proposal, the resulting SQL syntax should be something like:

ALTER TABLE "my_table"
  ADD CONSTRAINT "my_table_parent_id_fkey" FOREIGN KEY ("parent_id") REFERENCES "my_table"("id") ON DELETE SET NULL NOT VALID

That will produce more manageable lock types:

   locktype    |     relation      |           mode           |   tid   |  vtid   |  pid  | granted
---------------+-------------------+--------------------------+---------+---------+-------+---------
 relation      | pg_locks          | AccessShareLock          |         | 4/14540 | 16849 | t
 relation      | schema_migrations | RowExclusiveLock         |         | 4/14540 | 16849 | t
 virtualxid    |                   | ExclusiveLock            |         | 4/14540 | 16849 | t
 transactionid |                   | ExclusiveLock            | 2175967 | 4/14540 | 16849 | t
 relation      | schema_migrations | ShareUpdateExclusiveLock |         | 4/14540 | 16849 | t
 relation      | my_table          | AccessShareLock          |         | 4/14540 | 16849 | t
 relation      | my_table          | ShareRowExclusiveLock    |         | 4/14540 | 16849 | t
(7 rows)

And with that, you can avoid potential downtime in production :bowtie:

The following syntax:

```elixir
alter table(:my_table) do
  modify :parent_id, references(:my_table, type: :binary_id, on_delete: :nilify_all, validate: false),
    from: {:binary_id, null: true}
end
```

Produces a SQL equivalent like this:

```sql
ALTER TABLE "my_table"
  ALTER COLUMN "parent_id" TYPE uuid,
  ADD CONSTRAINT "my_table_parent_id_fkey" FOREIGN KEY ("parent_id") REFERENCES "my_table"("id") ON DELETE SET NULL NOT VALID
```

The problem with the first directive, `ALTER COLUMN`, is that even the
type is the same, at least in PostgreSQL, is creating bunch of
`AccessExclusiveLock` (the most restrictive one) on each of the indexes
associated to that field, and the table itself, when that's not needed at all.

For example:

```console
test_dev=# BEGIN;
  LOCK TABLE "schema_migrations" IN SHARE UPDATE EXCLUSIVE MODE;
  ALTER TABLE "my_table" ALTER COLUMN "parent_id" TYPE uuid, ADD CONSTRAINT "my_table_parent_id_fkey" FOREIGN KEY ("parent_id") REFERENCES "my_table"("id") ON DELETE SET NULL NOT VALID;
  INSERT INTO "schema_migrations" ("version","inserted_at") VALUES ('20210718210952',NOW());
  SELECT locktype, relation::regclass, mode, transactionid AS tid, virtualtransaction AS vtid, pid, granted FROM pg_locks;
COMMIT;
BEGIN
LOCK TABLE
ALTER TABLE
INSERT 0 1
   locktype    |                     relation                        |           mode           |   tid   |  vtid   |  pid  | granted
---------------+-----------------------------------------------------+--------------------------+---------+---------+-------+---------
 relation      | schema_migrations                                   | RowExclusiveLock         |         | 4/14045 | 16813 | t
 virtualxid    |                                                     | ExclusiveLock            |         | 4/14045 | 16813 | t
 relation      | my_table_parent_id_ppppppp_id_index                 | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | pg_locks                                            | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 134438                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 134438                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | my_table_parent_id_rrrrr_index                      | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | my_table                                            | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | my_table                                            | ShareLock                |         | 4/14045 | 16813 | t
 relation      | my_table                                            | ShareRowExclusiveLock    |         | 4/14045 | 16813 | t
 relation      | my_table                                            | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | my_table_project_id_xxxx_index                      | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | 135159                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 135159                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | 132875                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 132875                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | 132507                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | 132506                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 132506                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | schema_migrations                                   | ShareUpdateExclusiveLock |         | 4/14045 | 16813 | t
 relation      | 132504                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 132504                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | 135184                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 135184                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | 132505                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 132505                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | my_table_parent_id_ttttt_index                      | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 transactionid |                                                     | ExclusiveLock            | 2175345 | 4/14045 | 16813 | t
 relation      | my_table_parent_id_aaaaaaaaaaa_id_index             | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | 132503                                              | AccessShareLock          |         | 4/14045 | 16813 | t
 relation      | 132503                                              | AccessExclusiveLock      |         | 4/14045 | 16813 | t
 relation      | my_table_parent_id_gggg_index                       | AccessExclusiveLock      |         | 4/14045 | 16813 | t
(35 rows)

COMMIT
```

So, the proposal is to avoid setting the type when you pass a
`references` to `modify`, if there is any type incompatibility,
PostgreSQL, will let you know about it right away:

```
ERROR:  foreign key constraint "my_table_parent_id_fkey" cannot be implemented
DETAIL:  Key columns "parent_id" and "id" are of incompatible types: character varying and uuid.
```

Instead, the resulting SQL syntax should be something like:

```sql
ALTER TABLE "my_table"
  ADD CONSTRAINT "my_table_parent_id_fkey" FOREIGN KEY ("parent_id") REFERENCES "my_table"("id") ON DELETE SET NULL NOT VALID
```
That will produce more manageable lock types:

```console
   locktype    |     relation      |           mode           |   tid   |  vtid   |  pid  | granted
---------------+-------------------+--------------------------+---------+---------+-------+---------
 relation      | pg_locks          | AccessShareLock          |         | 4/14540 | 16849 | t
 relation      | schema_migrations | RowExclusiveLock         |         | 4/14540 | 16849 | t
 virtualxid    |                   | ExclusiveLock            |         | 4/14540 | 16849 | t
 transactionid |                   | ExclusiveLock            | 2175967 | 4/14540 | 16849 | t
 relation      | schema_migrations | ShareUpdateExclusiveLock |         | 4/14540 | 16849 | t
 relation      | my_table          | AccessShareLock          |         | 4/14540 | 16849 | t
 relation      | my_table          | ShareRowExclusiveLock    |         | 4/14540 | 16849 | t
(7 rows)
```
Comment on lines +1579 to +1581
"ALTER COLUMN ",
quote_name(name),
" TYPE ",
column_type,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar behavior is observed when doing something like:

 modify :account_id, :binary_id, null: false, from: {:binary_id, null: true}

The SQL syntax produced is this one:

ALTER TABLE "my_table"
  ALTER COLUMN "account_id" TYPE uuid,
  ALTER COLUMN "account_id" SET NOT NULL;

And that's producing way more AccessExclusiveLock than the alternative: ALTER TABLE "my_table" ALTER COLUMN "account_id" SET NOT NULL;

@milmazz milmazz force-pushed the bugfix/migration/do-not-modify-field-type branch from eeed434 to edf1414 Compare October 17, 2024 23:28
@milmazz milmazz force-pushed the bugfix/migration/do-not-modify-field-type branch from edf1414 to 475cf8c Compare October 17, 2024 23:32
@josevalim
Copy link
Member

Hi @milmazz! 👋

So, the proposal is to avoid setting the type when you pass references to modify, if there is any type incompatibility, PostgreSQL, will let you know about it right away:

But if I want to modify the type, what do we do? Maybe we should rather allow the type for references to be skipped and, in such cases, we don't modify it?

@halfdan
Copy link

halfdan commented Oct 18, 2024

alter table(:my_table) do
  modify :parent_id, references(:my_table, type: :binary_id, on_delete: :nilify_all, validate: false),
    from: {:binary_id, null: true}
end

Wouldn't it be sufficient to skip the alter column if the type in the from and references match up? (here :binary_id)

@greg-rychlewski
Copy link
Member

One thing I noticed when looking into this is that our description for modify is Modifies the type of a column when altering a table. in the docs. And the advice is to use execute when not wanting to modify the type.

We have convenience functions for check constraints so maybe it would make sense to have one for foreign key constraints.

@novaugust
Copy link
Contributor

novaugust commented Oct 18, 2024

We have convenience functions for check constraints so maybe it would make sense to have one for foreign key constraints.

It was surprising to me that create constraint(...) can't be used to create foreign key constraints, so +1 to this.

Wouldn't it be sufficient to skip the alter column if the type in the from and references match up? (here :binary_id)

Seems smart and good :) though likely only helps folks using change and not up,down 🤔 I think having create constraint do fkeys would be more intuitive for folks only looking to do that, and it'd be usable in both up/down. docs for modify/2,3 could then point to constraint for only creating a constraint sans column modification


tldr: milton's solution here is an improvement, but i'd also advocate for updating the dsl to create foreign key constraints similarly to check constraints

@greg-rychlewski
Copy link
Member

greg-rychlewski commented Oct 18, 2024

The only thing about the from/to matching is that you technically can allow a reverse migration that is different than what is currently there.

I don't know how encouraged or common it is but for example you might have a column that is integer type and you want to change it to a bigint that references the id of another table. But you want to rollback to bigint without the reference if it turns out the reference causes an issue.

This would apply to any migration that combines an upcast with a reference. We could force the user to split the migration into 2 but I am not sure if that is the right thing to do.

@milmazz
Copy link
Author

milmazz commented Oct 18, 2024

Hi @milmazz! 👋

Long time no see, I hope you're doing well.

But if I want to modify the type, what do we do? Maybe we should rather allow the type for references to be skipped and, in such cases, we don't modify it?

Yeah, I thought about it, and after fc2e173 we should support both modes. I'm not 💯 convince about the syntax and tricks I had to do, but I can revisit this in another round.

So, for example:

create table(:alter_fk_users)

create table(:alter_fk_posts) do
  add :alter_fk_user_id, :id
end

alter table(:alter_fk_posts) do
  modify :alter_fk_user_id, references(:alter_fk_users, on_delete: :nilify_all), from: {:id, null: true}
end

This will add the foreign key constraint but it will avoid setting the ALTER COLUMN TYPE because they remain the same.

For this case:

create table(:alter_fk_users)
     
create table(:alter_fk_comments) do
  add :alter_fk_user_id, references(:alter_fk_users)
end

alter table(:alter_fk_comments) do
  modify :alter_fk_user_id, references(:alter_fk_users, on_delete: :delete_all), from: references(:alter_fk_users, on_delete: :nothing)
end

It will recreate the previous foreign key constraint, but it will avoid setting the column type because they're effectively the same.

@milmazz
Copy link
Author

milmazz commented Oct 18, 2024

One thing I noticed when looking into this is that our description for modify is Modifies the type of a column when altering a table. in the docs. And the advice is to use execute when not wanting to modify the type.

I think the problem occurs because developers overlook this detail, that the modify/3 is always modifying the type even when that's not needed at all in some situations.

Even popular guides about "good practices" suggest to use modify/3 to set a field as null: false

For example, from the Migration Recipes you find the following section Setting NOT NULL on an existing column

# **Postgres 12+ only**
def change do
  execute "ALTER TABLE products VALIDATE CONSTRAINT active_not_null", ""

  alter table("products") do
    modify :active, :boolean, null: false
  end

  drop constraint("products", :active_not_null)
end

Similar advice is found in the excellent_migrations library, while my intention is not to blame anyone, I just want to point out that other developers might be overlooking the fact that modify/3 is introducing a column modification that will cause unexpected table locks.

@greg-rychlewski
Copy link
Member

greg-rychlewski commented Oct 18, 2024

My intention is not to downplay the issue or anything like that. I think this is a good callout. I am mostly unsure about trying to inject too much "smartness" that goes against the stated use. I think we would have to give people a way to be explicit that they don't want the type changed.

I have given an example here of how matching the from/to types might be problematic: #644 (comment)

@novaugust
Copy link
Contributor

I think we would have to give people a way to be explicit that they don't want the type changed.

As in some sort of new option?

modify :column, references(...), modify_type: false
# or
modify :column, references(...), constraint_only: true

@greg-rychlewski
Copy link
Member

Something in the vein of the first one. Though I didn't give the API much thought yet sorry, so don't want to say "go do it now!". But it is probably better to make it specific to type rather than constraint. In case there are other non-constraint options now or in the future that might also have this issue.

@josevalim
Copy link
Member

I like the idea of not changing the type if :from is given and they match. You get two features: reversible and type-skipping.

@greg-rychlewski
Copy link
Member

+1 from me as well then. I was mainly concerned how explicit the docs were in mentioning this exact issue with the locks and suggesting to use another function. But if we are not causing problems for existing users then I'm on board.

@milmazz
Copy link
Author

milmazz commented Oct 21, 2024

I like the idea of not changing the type if :from is given and they match. You get two features: reversible and type-skipping.

@josevalim Exactly! We're keeping backward compatibility with the previous implementation, meaning that we still offer reversible migrations, and additionally we're avoiding changing the type, when we determine is not needed based on the information we take from the from option.

I updated more unit tests to cover the new branches and also fixed a bug when setting a default value in the new branch. It should be ready for review. /cc @greg-rychlewski

I also updated the docs for modify/3, adding an admonition block to draw the developer attention about undesirable locks and how they can work around those specifying the from option, at least for the PostgreSQL adapter.

Screenshot 2024-10-21 at 4 18 10 PM

> adapter, if you provide the option `:from`, and the column type matches,
> we will skip updating it.
>
> Examples
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
> Examples
> Examples:

@greg-rychlewski
Copy link
Member

@milmazz Sorry it seems like the MySQL integration tests failed. It may be you need to isolate this feature to new tests and tag them so they are only run by the Postgres adapter.

Also I was wondering if you think it might be worth adding some unit tests to the Postgres adapter to ensure the query text is what you are expecting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants