Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Split TreeWalker and SqlWalker #9551

Merged
merged 1 commit into from
Jul 25, 2022
Merged

Conversation

derrabus
Copy link
Member

@derrabus derrabus commented Feb 28, 2022

When working on native types for the TreeWalker interface and its implementations I ran into some trouble. The main problem was that the interface declares most of its methods with a string return type while some implementations change the return type to void. This of course is not allowed and PHPStan and Psalm already complain a lot about that.

Analyzing the implementations and their usages, I came to the conclusion that we basically have two types of implementations:

  • tree walkers that traverse the AST and modify it
  • output walkers that emit SQL code while traversing the AST

Tree walkers are never called directly. Instead, they are loaded into a TreeWakerChain that iterates over them. Those walkers are optional; it might very well be that the chain is empty when parsing a query. On the other hand, we'll always have an output walker in that process. It has to be either a subclass of SqlWalker or SqlWalker itself which is the default.

You can verify my assumption by removing the implements part from SqlWalker: Doing so won't break a single test.

class SqlWalker implements TreeWalker

The TreeWalker interface is really quite large. This is why implementations (except for SqlWalker and TreeWalkerChain) usually extend TreeWalkerAdapter which provides empty implementations (with void return type!) for all methods. The custom tree walker would just override the method it really wants to implement. The interesting part about that is that only walkSelectStatement(), walkUpdateStatement() and walkDeleteStatement() are ever called publicly. Overriding any other method would be futile. Those methods are called on the output walker however.

This is why I'd like to propose the following changes:

  • SqlWalker does not extend TreeWalker anymore.
  • TreeWalker, TreeWalkerAdapter and TreeWalkerChain are reduced to walkSelectStatement(), walkUpdateStatement() and walkDeleteStatement() and receive a consistent void return type.

@derrabus derrabus force-pushed the types/tree-walker branch 6 times, most recently from 0d3ec8d to c0f7929 Compare March 6, 2022 22:58
@morozov
Copy link
Member

morozov commented Mar 8, 2022

The main problem was that the interface declares most of its methods with a string return type while some implementations change the return type to void. This of course is not allowed and PHPStan and Psalm already complain a lot about that.

Without looking too much into the implementation, instead of having multiple interfaces, would it be possible to have one parameterized? See https://phpstan.org/r/7ec5df39-c547-4116-abd8-f706ae0dd548 for example.

@derrabus
Copy link
Member Author

derrabus commented Mar 8, 2022

Without looking too much into the implementation, instead of having multiple interfaces, would it be possible to have one parameterized?

That's possible and that way we could have SqlWalker implement TreeWalker still, but it would also mean that we would never be able to add a native return type to the interface. I doubt that this is the right solution. Tree walkers are never used in places where we use output walkers and vice versa. They don't need to share a common abstraction.

@derrabus derrabus force-pushed the types/tree-walker branch from c0f7929 to f9659bb Compare March 20, 2022 12:08
@derrabus derrabus force-pushed the types/tree-walker branch 2 times, most recently from cf4fb41 to 1d97ce2 Compare March 30, 2022 22:22
@derrabus derrabus force-pushed the types/tree-walker branch from 1d97ce2 to fd5ee18 Compare April 9, 2022 22:21
@derrabus derrabus added this to the 3.0.0 milestone Apr 10, 2022
@derrabus derrabus force-pushed the types/tree-walker branch 2 times, most recently from 548a069 to b6d25c1 Compare April 22, 2022 18:15
@derrabus derrabus force-pushed the types/tree-walker branch 2 times, most recently from 0169258 to 55c08f5 Compare May 2, 2022 08:54
@derrabus derrabus force-pushed the types/tree-walker branch from 55c08f5 to 9e9dcd3 Compare May 9, 2022 08:43
@derrabus
Copy link
Member Author

derrabus commented May 9, 2022

@greg0ire @beberlei This PR has not really gotten much attention, but I'd like to work on this for 2.13 because it removes a major point of confusion for both static analysis tools. Feedback would be very much appreciated.

@greg0ire
Copy link
Member

greg0ire commented May 9, 2022

I think your proposal goes in the right direction! It's not like we have a big bag of walkers at some point and iterate on them, they seem to be well separated from the start.

I used this as an occasion to try out the MermaidJS integration (disregard the SomeOtherInterface, I could not find a way to make the lone SqlWalker appear otherwise):

Before

classDiagram
TreeWalker <|-- TreeWalkerAdapter
TreeWalker <|-- TreeWalkerChain
TreeWalkerChain *-- TreeWalker
TreeWalker <|-- SqlWalker
TreeWalker : string walkSelectStatement()
TreeWalker : string walkUpdateStatement()
TreeWalker : string walkDeleteStatement()
TreeWalker : mixed manyMoreMethods()
Loading

After

classDiagram
TreeWalker <|-- TreeWalkerAdapter
TreeWalker <|-- TreeWalkerChain
TreeWalkerChain *-- TreeWalker
TreeWalker : void walkSelectStatement()
TreeWalker : void walkUpdateStatement()
TreeWalker : void walkDeleteStatement()
SomeOtherInterface <|-- SqlWalker
Loading

Regarding the purpose of manyMoreMethods(), they do seem useless in TreeWalkerChain and TreeWalkerAdapter, but I will let @beberlei confirm that.

@derrabus derrabus force-pushed the types/tree-walker branch 2 times, most recently from 4d738f3 to 9cfea0f Compare May 17, 2022 10:48
@derrabus derrabus force-pushed the types/tree-walker branch 3 times, most recently from 9834963 to 56e90aa Compare May 25, 2022 06:20
@derrabus
Copy link
Member Author

Rebased. Now that the deprecation is in place, I think I can backport some more changes to 2.13. Leaving this PR as draft for now.

@derrabus derrabus force-pushed the types/tree-walker branch from 56e90aa to fbff600 Compare June 1, 2022 13:10
@derrabus derrabus force-pushed the types/tree-walker branch 3 times, most recently from d579c88 to 8f65a69 Compare June 10, 2022 08:03
public function walkConditionalExpression($condExpr)
{
public function walkConditionalExpression(
AST\ConditionalExpression|AST\ConditionalPrimary|AST\ConditionalTerm|AST\ConditionalFactor $condExpr
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not really happy with those unions here. We should try to get rid of them in a follow-up.

@derrabus derrabus force-pushed the types/tree-walker branch from 93df374 to 6e75d51 Compare July 25, 2022 19:28
@derrabus derrabus marked this pull request as ready for review July 25, 2022 19:29
@derrabus derrabus requested a review from greg0ire July 25, 2022 19:29
@derrabus derrabus merged commit a3a8caa into doctrine:3.0.x Jul 25, 2022
@derrabus derrabus deleted the types/tree-walker branch July 25, 2022 20:27
@greg0ire greg0ire mentioned this pull request Jul 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants