Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SchemaTool doesn't correctly recognise the current DB state #5661

Closed
janklan opened this issue Sep 8, 2022 · 9 comments
Closed

SchemaTool doesn't correctly recognise the current DB state #5661

janklan opened this issue Sep 8, 2022 · 9 comments

Comments

@janklan
Copy link

janklan commented Sep 8, 2022

Bug Report

Q A
Version 3.4.4

Summary

Schema tool doesn't correctly detect the current state of the database on PostgreSQLPlatform. I can't speak for other platforms as I have not tested them. This creates issues in components that depend on it, resulting in the SchemaTool proposing changes to the database that should not occur.

Current behaviour

When using columnDefinition, the SchemaTool makes up changes that do not occur when updating the schema. I know the docs say "SchemaTool will not detect changes on the column correctly anymore if you use 'columnDefinition'.", but there is a difference between not detecting changes correctly and making up changes that are not there.

Considering the Entity below, when running orm:schema-tool:update, the first run wants to execute the following expected and well-working queries:

CREATE SEQUENCE fulltext_id_seq INCREMENT BY 1 MINVALUE 1 START 1;
CREATE TABLE fulltext (id INT NOT NULL, column1 TEXT DEFAULT NULL, column2 TEXT DEFAULT NULL, fulltext tsvector GENERATED ALWAYS AS (to_tsvector('english', coalesce(column1, '') || ' ' || coalesce(column2, ''))) STORED, PRIMARY KEY(id));

When running the update command again, the SQL is just wrong:

ALTER TABLE fulltext ALTER fulltext DROP DEFAULT;

There is already a bug report #4111 for the DROP DEFAULT itself, including a proposal that the default value checking is incorrect. I agree with that, but instead of removing the condition, I just applied parenthesis to what I think is the correct statement: from $defaultClause = $column->getDefault() === null to ($defaultClause = $column->getDefault()) === null

After doing so, the SchemaTool no longer tries to drop the default column, however, it still wants to run ALTER TABLE fulltext ALTER fulltext;, so clearly there are some false-positive changes there.

The Column Diff here shows:

^ Doctrine\DBAL\Schema\ColumnDiff^ {#8448
  +oldColumnName: "fulltext"
  +column: Doctrine\DBAL\Schema\Column^ {#4322
    #_name: "fulltext"
    #_namespace: null
    #_quoted: false
    #_type: Doctrine\DBAL\Types\TextType^ {#272}
    #_length: null
    #_precision: 10
    #_scale: 0
    #_unsigned: false
    #_fixed: false
    #_notnull: false
    #_default: null
    #_autoincrement: false
    #_platformOptions: array:1 [
      "version" => false
    ]
    #_columnDefinition: "tsvector GENERATED ALWAYS AS (to_tsvector('english', coalesce(column1, '') || ' ' || coalesce(column1, ''))) STORED"
    #_comment: null
    #_customSchemaOptions: []
  }
  +changedProperties: array:1 [
    0 => "default"
  ]
  +fromColumn: Doctrine\DBAL\Schema\Column^ {#5302
    #_name: "fulltext"
    #_namespace: null
    #_quoted: false
    #_type: Doctrine\DBAL\Types\TextType^ {#272}
    #_length: null
    #_precision: 10
    #_scale: 0
    #_unsigned: false
    #_fixed: false
    #_notnull: false
    #_default: "english"
    #_autoincrement: false
    #_platformOptions: []
    #_columnDefinition: null
    #_comment: null
    #_customSchemaOptions: []
  }
}

So far I tracked it down to the schema comparator being fed incorrect information about the current state of the column - notice how the default value is somehow set to "english", while the column definition is empty.

Dumping the two arrays here https://github.com/doctrine/dbal/blob/3.4.x/src/Schema/Comparator.php#L522 show the below:

^ array:12 [
  "name" => "fulltext"
  "type" => Doctrine\DBAL\Types\TextType^ {#272}
  **"default" => "english"**
  "notnull" => false
  "length" => null
  "precision" => 10
  "scale" => 0
  "fixed" => false
  "unsigned" => false
  "autoincrement" => false
  **"columnDefinition" => null**
  "comment" => null
]
^ array:13 [
  "name" => "fulltext"
  "type" => Doctrine\DBAL\Types\TextType^ {#272}
  **"default" => null**
  "notnull" => false
  "length" => null
  "precision" => 10
  "scale" => 0
  "fixed" => false
  "unsigned" => false
  "autoincrement" => false
  **"columnDefinition" => "tsvector GENERATED ALWAYS AS (to_tsvector('english', coalesce(column1, '') || ' ' || coalesce(column1, ''))) STORED"**
  "comment" => null
  "version" => false
]

I ran out of time for now, I wish I tracked it deeper. I'm hoping someone more familiar with Doctrine's internal can spot quicker where this is going and what needs to be done?

How to reproduce

  1. Add the below entity in your (Symfony) codebase. Outside Symfony, change the command to orm:schema-tool:update. The point is to get the SchemaTool to try to detect changes.
  2. Run bin/console do:sch:up --force to create the table and sequence
  3. Run bin/console do:sch:up --dump-sql to see the broken update queries.
<?php

namespace App\Entity;

use Doctrine\ORM\Mapping as ORM;

#[ORM\Entity]
class Fulltext
{
    #[ORM\Id]
    #[ORM\GeneratedValue]
    #[ORM\Column(type: 'integer')]
    private $id;
    
    #[ORM\Column(type: 'text', nullable: true)]
    private $column1;

    #[ORM\Column(type: 'text', nullable: true)]
    private $column2;

    #[ORM\Column(
        type: 'text',
        nullable: true,
        insertable: false,
        updatable: false,
        columnDefinition: "tsvector GENERATED ALWAYS AS (to_tsvector('english', coalesce(column1, '') || ' ' || coalesce(column1, ''))) STORED",
        generated: 'ALWAYS',
    )]
    private $fulltext;
}

Expected behaviour

The schema should only change when there are changes in place, and the changes should not be incorrect. Sorry for being the Cpt Obvious

@morozov
Copy link
Member

morozov commented Sep 16, 2022

I know the docs say "SchemaTool will not detect changes on the column correctly anymore if you use 'columnDefinition'.", but there is a difference between not detecting changes correctly and making up changes that are not there.

Making the changes that are not there is likely caused by the changes not being detected correctly, which is by design and documented. Do you believe it's unrelated? Can you think of a way to address this issue?

@janklan
Copy link
Author

janklan commented Sep 16, 2022

Making the changes that are not there is likely caused by the changes not being detected correctly

You have a point there; although the docs make it sound that the change detection may fail, not is guaranteed to fail somehow.

What to do about it? If the change detection is broken when there is a custom column definition, (1) don't even attempt to detect changes, (2) always consider the current columnDefinition state authoritative, and (3) optionally where the DB engine allows it, place the columnDefinition in the column's comment at the time the table is being altered, and compare the comment value with current columnDefinition on the next run.

Even without (3) in place, the worst-case scenario is there will be a useless ALTER TABLE statement that won't change anything. That's easy to either ignore or remove from the generated migration.

I'm curious what's the reason for allowing the change detection to run if it's known to be broken?

@morozov
Copy link
Member

morozov commented Sep 17, 2022

I'm curious what's the reason for allowing the change detection to run if it's known to be broken?

No idea. If you believe it's important to get it fixed or improved and you know what to do, please file a pull request and we could discuss the needed changes more substantively.

@janklan
Copy link
Author

janklan commented Sep 17, 2022

I'll wait for a reply of someone who has an idea. Don't get me wrong, I think if a seemingly-broken feature made it to the mainline branch, it probably has some purpose there. I may be missing it. Proposing changes when oblivious is not the way I want to help.

Thanks for your responses, I appreciate it.

@morozov
Copy link
Member

morozov commented Sep 17, 2022

Don't get me wrong, I think if a seemingly-broken feature made it to the mainline branch, it probably has some purpose there.

In my understanding, this feature was introduced to mitigate the inability of the DBAL to declare an arbitrary column on any of the supported platforms (which is an inherent property of any higher-level abstraction compared to the lower-level abstraction that it's built on).

This feature is broken by design. because it breaks that barrier of its own abstraction and gives the end-user direct access to the underlying API.

@janklan
Copy link
Author

janklan commented Sep 18, 2022

This feature is broken by design. because it breaks that barrier of its own abstraction and gives the end-user direct access to the underlying API.

I don't think it's a bad thing to give power users the power to do things that have the potential to break something. I think the design fault here is that DBAL tries to still "help" the user, even though it should go almost completely out of the way, especially if the column is marked non-insertable, non-updateable, and always auto-generated.

My specific use case here is to use fulltext search on various psql tables. I don't want to offload the search functionality to another database. For this I'm trying to utilise an auto-generated column as per the columnDefinition above. Each entity shall have a property called fulltext, but content will be auto-generated from different columns for each entity.

If you say the columnDefinition approach is broken by design, let's ignore the pressing question of "why is it there", and let me ask another one: what is the proper DBAL way to achieve the above?

Adding a custom type? That was my initial approach, but then I found the columnDefinition and got all excited, because it sounds like it's meant to do exactly I want to do: highly customised one-off property. Doctrine isn't supposed to do anything about it other than be aware of it, so I can use that property in my DQL queries.

@morozov
Copy link
Member

morozov commented Sep 18, 2022

Comparison of columns with custom definitions is one issue. The other one is their introspection (see #3281, #4470, #5306).

what is the proper DBAL way to achieve the above?

I don't know. It's better to ask this on Slack where more people could see it.

@morozov
Copy link
Member

morozov commented Sep 22, 2022

I'm going to close this issue as the observed behavior is by design.

Any improvements in the documentation or schema introspection logic related to this are welcome but this is not something that the team plans to work on in foreseeable future.

@morozov morozov closed this as not planned Won't fix, can't repro, duplicate, stale Sep 22, 2022
@github-actions
Copy link

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 23, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants