-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make the comparator API platform aware #4659
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch, @ausi. API wise it looks good but logic-wise we need to work through some details. Please see the comments inline.
The test expects the comparator to identify a change because the current comparator isn't platform-aware. The new one is, and it understands that the "bar" option doesn't apply to any platform.
Right. But the test needs to be reworked to test a real platform-specific option with a specific platform that supports it. |
Done in 5dbb715.
I think all comments should be addressed now. Should I proceed with the deprecation layer to make the changed comparator API backwards compatible? |
Done in d2a53b5. |
src/Schema/Comparator.php
Outdated
@@ -199,8 +212,18 @@ public function diffSequence(Sequence $sequence1, Sequence $sequence2) | |||
* | |||
* @throws SchemaException | |||
*/ | |||
public function diffTable(Table $fromTable, Table $toTable) | |||
public function diffTable(Table $fromTable, Table $toTable, ?AbstractPlatform $platform = null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe we could use this as an opportunity to introduce better method names. diffTable()
could be renamed to compareTables()
. This way, we won't have to introduce an optional parameter but could introduce the whole old method. The same applies to the rest of the methods except for compareSchemas()
which already exists.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yesterday I openned #4707 to introduce a more consistent Comparator API.
But as @ausi commented, it will make merging this PR harder, mainly because of the optional $platform
parameter on all public methods.
After sleeping on it, I think injecting the platform as a constructor parameter when creating a Comparator object would help with avoiding conflcts and would make sense design wise: it seems comparing schemas depends on the platform, given all the linked pull requests.
Actually, it is precisely for enabling this type of API evolution that #4707 deprecates static calls of compareSchemas()
, as stated in #4704 (comment).
What do you think ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But as @ausi commented, it will make merging this PR harder, mainly because of the optional
$platform
parameter on all public methods.
Let's not worry about that. I can help resolve all the conflicts once they occur.
I think injecting the platform as a constructor parameter when creating a Comparator object would help with avoiding conflcts and would make sense design wise: it seems comparing schemas depends on the platform, given all the linked pull requests.
Whether it's injected in the constructor or passed as an argument, it's the same design-wise: the comparator does depend on the platform. The question is, whether or not a given comparator instance needs be usable with different platforms, and the answer is "probably not". I think we can move the platform dependency a constructor argument as originally proposed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The compareColumns()
method now returns bool
in this pull request. This would conflict with #4707 (but only if there would actually be a version release between the two pull requests).
@ausi, I linked a few pull requests that will be superseded by this one once it's done. I believe there are at least as many open issues (to be identified), so it appears to be a quite important improvement. No pressure, just FYI 😅 |
@morozov I think I addressed all comments now. This should be ready for another review 🎉 |
Some tests seem to have problems with the AFAIK this is due to charset and collation being reported even if they use the default. For example, offline schema:
online schema:
I’m not sure how to best fix this with the concept of doctrine, as the One possibility would be that the schema manager stores the information about which platformOptions don’t differ in the |
Why did this issue surface only now? As far as I see from debugging the test, when the unexpected diff is produced, the columns are still compared on a per-property basis. This is how it already works.
I believe it's a bit risky. Omitted values on the defining side mean that the defaults on the database should kick in. If an online column has different properties, the comparison should result in a diff but it won't which is a bug. What we could do instead is explicitly add the defaults to the DBAL version of the column before comparison. They are currently implemented in: dbal/src/Platforms/MySQLPlatform.php Lines 480 to 485 in a602f2b
We'll also need to double-check how it will work when the defaults are omitted (see #4644). |
Currently doctrine only compares platform options that exist in both dbal/src/Schema/Comparator.php Lines 512 to 521 in 2549642
But this is wrong IMO as it would not detect changes from a different charset back to the default for example.
No, I meant explicitly storing information about which platform options are set to the default of the table. This way the comparator can compare the platform options and if a defining side option is missing it can check if the database side option is default or not.
These are the table options, not the column options AFAIK. The charset/collation of the columns are set to the |
I agree. It would be nice to reproduce this issue separately by writing a test for it.
Where can we store this information? Keep in mind that the DBAL should be able to introspect and compare the objects created by other clients.
Yes, but eventually there will be the same problem: if we compare the DDL of the table generated from its definition in the DBAL and the DDL generated from the introspected data, there may be a mismatch due to some implied defaults. At least when the defaults are removed from the DBAL in #4644.
Yes, I think in order to compare the columns properly, we need to introspect the deployed table first. I.e.:
|
I tried out the concept now as you can see in the last two commits. It works pretty well but there is one major issue: the default for the
Done in 4c061bb
I stored it in the I think it could make sense to let the abstraction gain knowledge about charset and collation to handle these cases better, instead of just putting them in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it could make sense to let the abstraction gain knowledge about charset and collation to handle these cases better, instead of just putting them in the
platformOptions
array.
I don't think we're there yet. I see quite some bits in the PR that I don't understand and I don't understand the overall design either. For instance, the property-based comparison in the presence of a specific platform goes again the entire idea of comparing SQL to SQL.
I don't think any platform-specific column parameters should be promoted to the DBAL: handling charsets is done differently in different platforms, and there's no way it could be properly described in an abstract API. It should be the opposite: we expose the platform to the comparator, so if any platform-specific comparison needs to be done, it should delegate it to the platform.
Please let me know if you need help rebasing the PR. It might help if you squash all the commits. This way, the least possible number of merge conflicts will have to be resolved.
@@ -2254,6 +2266,13 @@ public function getColumnDeclarationSQL($name, array $column) | |||
return $name . ' ' . $declaration; | |||
} | |||
|
|||
public function getColumnSQL(string $name, Column $column): string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the purpose perspective, how is it different from the existing getColumnDeclarationSQL()
? I'd rather add the support for Column $column
to the existing method.
'WHERE COLLATION_NAME = TABLES.TABLE_COLLATION ' . | ||
') FROM information_schema.TABLES ' . | ||
'WHERE TABLE_SCHEMA = ' . $database . ' AND TABLE_NAME = ' . $table . | ||
'), @@character_set_database) AS DefaultCharacterSet, ' . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I read it right, it means that if the charset is not specified on the column, the query will return the default database charset. Does the database work this way?
I.e. if I create a column and don't specify the charset, will it be introspected as NULL? And furthermore, if I change the default charset, will it affect all columns with the default charset? That would be weird. I'd expected it to be copied to the column metadata during creation or maybe modification.
@@ -44,6 +44,9 @@ class Column extends AbstractAsset | |||
/** @var mixed[] */ | |||
protected $_platformOptions = []; | |||
|
|||
/** @var mixed[] */ | |||
protected $_platformDefaults = []; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't look like it belongs here. A column is a value object representing the specification defined by the user or introspected from the database. I'd say it's the responsibility of the platform to represent its defaults if needed for comparison.
* of the column being declared as array indexes. | ||
* See getColumnDeclarationSQL() for more information. | ||
*/ | ||
protected function _getColumnArrayFromColumn(Column $column): array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's better to declare it private. Also, please do not use the leading underscore for new methods.
$changedProperties = []; | ||
|
||
foreach (array_keys(array_merge($properties1, $properties2)) as $key) { | ||
if (($properties1[$key] ?? null) === ($properties2[$key] ?? null)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we still need to do a property-based comparison while the platform is available? Note that the strict comparison alone will trigger false-positive diffs.
@@ -22,7 +22,7 @@ protected function setUp(): void | |||
parent::setUp(); | |||
|
|||
$this->schemaManager = $this->connection->getSchemaManager(); | |||
$this->comparator = new Comparator(); | |||
$this->comparator = new Comparator($this->schemaManager->getDatabasePlatform()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not modify existing tests. All existing tests should pass without changes.
* | ||
* @return string[] | ||
*/ | ||
public function diffColumn(Column $column1, Column $column2) | ||
public function diffColumn(Column $column1, Column $column2): array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Enforcing the return type on a method is a breaking change, please revert.
* | ||
* @return Column[] | ||
*/ | ||
private function stripDefaultValuesFromColumns(Column $column1, Column $column2): array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stripping properties based on their value may be unnecessarily complex and error-prone. I'd rather use the opposite approach and augment the definition provided by the application with the database defaults.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
augment the definition provided by the application with the database defaults
At which point would you do that? The column doesn’t store a reference to the table nor to the database connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. Regardless of whether we strip the defaults or augment the definition, we'll need the DB connection to understand what the omitted options default to. But at the same time, we want to be able to compare pure definitions, w/o any runtime component.
Even if we keep the approach of stripping the defaults, false positives will be possible. E.g. the offline column has its charset explicitly set to the default value, but the online column has it stripped from the definition. Comparing the two will yield a diff, however, it shouldn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But at the same time, we want to be able to compare pure definitions, w/o any runtime component.
It seems to be impossible by definition: whether or not the two definitions are different depends on the runtime. Instead of making the comparator dependent on the platform, it may need to depend on the schema manager which can use both the platform and the connection to normalize the offline column definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if we keep the approach of stripping the defaults, false positives will be possible. E.g. the offline column has its charset explicitly set to the default value, but the online column has it stripped from the definition. Comparing the two will yield a diff, however, it shouldn't.
That case is already handled by my current implementation. If one or both columns have platform defaults, these defaults are stripped from both columns (online and offline). This way all comparisons should be possible (offline to offline, online to offline and online to online). If two online columns are compared and they have different platform defaults my current implementation does not strip the options.
FWIW, I'm close to finishing #4746 which takes the approach from here and reimplements this w/o introducing the undesired API changes. Just need to implement a comparator for SQL Server and decide on the way to rework unit tests which don't have access to a database connection (needed to determine the default database collation for SQL Server). |
Closing in favor of #4746. |
This pull request adds platform awareness to the comparator API as discussed in #4551 (comment)
I think it works pretty well as shown by the tests. Some tests that previously used code like this:
Could be changed to the following code because the comparator now correctly detects these cases would end up in an empty ALTER TABLE:
One failing test remains:
ComparatorTest::testDiffColumnPlatformOptions
as I was not able to understand what exactly this test expects to happen. With the new comparator design it should be OK to report all platform option changes that have an effect in the resulting SQL code I think.For backwards compatibility I would make the
$platform
argument defaulting tonull
execute the old code fordiffColumn
in this case and triggering a deprecation notice. This should make the upgrade path for users of this library pretty straight forward.