Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid an inconsistency in topological sort result order #11086

Merged
merged 1 commit into from
Dec 21, 2023

Conversation

mpdude
Copy link
Contributor

@mpdude mpdude commented Nov 24, 2023

This PR changes a detail in the commit order computation for depended-upon entities.

We have a parent-child relationship between two entity classes. The association is parent one-to-many children, with the child entities containing the (owning side) back-reference.

Cascade-persist is not used, so all entities have to be passed to EntityManager::persist().

Before v2.16.0, two child entities C1 and C2 will be inserted in the same order in which they are passed to persist(), and that is regardless of whether the parent entity was passed to persist() before or after the child entities.

As of v2.16.0, passing the parent entity to persist() after the child entities will lead to an insert order that is reversed compared to the order of persist() calls.

This PR makes the order consistent in both cases, as it was before v2.16.0.

Cause

When the parent is passed to persist() after the children, commit order computation has to re-arrange the entities. The parent must be inserted first since it is referred to by the children.

The implementation of the topological sort from #10547 processed entities in reverse persist() order and unshifted finished nodes to an array to obtain the final result. That leads to dependencies (parent → before c1, parent → before c2) showing up in the result in the reverse order of which they were added.

This PR changes the topological sort to produce a result in the opposite order ("all edges pointing left"), which helps to avoid the duplicate array order reversal.

Discussion

  • This PR does not change semantics of the persist() so that entities would (under all ciscumstances) be inserted in the order of persist() calls.
  • It fixes an unnecessary inconsistency between versions before 2.16.0 and after. In particular, it may be surprising that the insert order for the child entities depends on whether another referred-to entity (the parent) was added before or after them.
  • Both orders (c1 before or after c2) are technically and logically correct with regard to the agreement that commit() is free to arrange entities in a way that allows for efficient insertion into the database.
  • With regard to the GH11058Test test case added, note that there is nothing that guarantees any particular order of the child entities within the parent's Collection. Also, the order of child entities in the collection is not relevant for the commit order computation, since the collection is not the owning side and no cascade-persist is used.
  • It might be that, if cascade-persist was used, the child entities will effectively be passed to persist() in the order in which they are found in the collection. So, they would be inserted in collection order unless they'd explicitly be passed to persist() before. Bottom line: Do not rely on collection ordering unless it's explicitly defined, e. g. with an explicit rank field.

Fixes #11058.

This PR changes a detail in the commit order computation for depended-upon entities.

We have a parent-child relationship between two entity classes. The association is parent one-to-many children, with the child entities containing the (owning side) back-reference.

Cascade-persist is not used, so all entities have to be passed to `EntityManager::persist()`.

Before v2.16.0, two child entities C1 and C2 will be inserted in the same order in which they are passed to `persist()`, and that is regardless of whether the parent entity was passed to `persist()` before or after the child entities.

As of v2.16.0, passing the parent entity to `persist()` _after_ the child entities will lead to an insert order that is _reversed_ compared to the order of `persist()` calls.

This PR makes the order consistent in both cases, as it was before v2.16.0.

 #### Cause

When the parent is passed to `persist()` after the children, commit order computation has to re-arrange the entities. The parent must be inserted first since it is referred to by the children.

The implementation of the topological sort from doctrine#10547 processed entities in reverse `persist()` order and unshifted finished nodes to an array to obtain the final result. That leads to dependencies (parent → before c1, parent → before c2) showing up in the result in the reverse order of which they were added.

This PR changes the topological sort to produce a result in the opposite order ("all edges pointing left"), which helps to avoid the duplicate array order reversal.

 #### Discussion

* This PR _does not_ change semantics of the `persist()` so that entities would (under all ciscumstances) be inserted in the order of `persist()` calls.
* It fixes an unnecessary inconsistency between versions before 2.16.0 and after. In particular, it may be surprising that the insert order for the child entities depends on whether another referred-to entity (the parent) was added before or after them.
* _Both_ orders (c1 before or after c2) are technically and logically correct with regard to the agreement that `commit()` is free to arrange entities in a way that allows for efficient insertion into the database.

Fixes doctrine#11058.
@mpdude mpdude marked this pull request as ready for review December 21, 2023 15:26
@mpdude mpdude changed the title Improve topological sort result order Avoid an inconsistency in topological sort result order Dec 21, 2023
@derrabus derrabus requested a review from greg0ire December 21, 2023 16:38
@greg0ire greg0ire merged commit c288647 into doctrine:2.17.x Dec 21, 2023
58 checks passed
@greg0ire greg0ire added this to the 2.17.3 milestone Dec 21, 2023
@greg0ire
Copy link
Member

Thanks @mpdude !

@mpdude mpdude deleted the 11058-revisited branch December 21, 2023 22:36
@netbull
Copy link

netbull commented Dec 28, 2023

@mpdude Thank you for the PR!

Sorry for the late reply. Anyway. When I look testNodesReturnedInDepthFirstOrderWithEdgesInDifferentOrderThanNodes and testNodesReturnedInDepthFirstOrderWithDependingNodeLastAndEdgeOrderInversed it looks really strange, but probably I'm missing something because GH11058Test actually works just fine.

Again, thank you for the effort and have a great holidays!

@d-ph
Copy link
Contributor

d-ph commented Mar 1, 2024

@mpdude

Kind sir, just wanted to say that you are a legend.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants