Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DirectoriesSourceLocator is too ineffective #612

Closed
ondrejmirtes opened this issue May 28, 2020 · 6 comments
Closed

DirectoriesSourceLocator is too ineffective #612

ondrejmirtes opened this issue May 28, 2020 · 6 comments

Comments

@ondrejmirtes
Copy link
Contributor

What it currently does from my testing:

When asked about an identifier, it parses all sources it can access until the identifier is found, then returns the result.

When asked about another identifier, it parses all sources again it can access... Rinse and repeat.

Ideally, it should first index all identifiers it knows about to be more effective. It's still too resource-intensive to parse all source files, more lazy evaluation would be more welcome.

What I do in my forked implementation that works quite well (https://github.com/phpstan/phpstan-src/blob/master/src/Reflection/BetterReflection/SourceLocator/OptimizedDirectorySourceLocator.php) is:

  • Index all files using regexes - inspired by Composer, done in this commit (phpstan/phpstan-src@27a0282#diff-9cfe7af0b51a160bd066b5d3c868041e).
  • The AST PHP-Parser is used only on files with identifiers the user asks about.
  • Unknown identifiers are responded quickly without parsing all the sources.
  • The nodes are cached in private properties.
  • Since functions are detected using function in the regex, the cache is also polluted with class methods which I filter later.

My implementation intentionally doesn't care about constants, it would complicate it. And it's only about a single directory, not directories.

@Ocramius
Copy link
Member

Probably best to deprecate the DirectoriesSourceLocator, or at least use it only for classmap scenarios, since all it does is an aggressive sequential scan.

This is more of a documentation issue: we have better source locators using different lookups strategies, which are a better fit for most codebases.

@Ocramius
Copy link
Member

Note: pushing memoization into existing locators is a no-go: it is not their responsibility, and it will lead to a crazy amount of complexity and state-related bugs.

Similar to class/function/constant reflectors, a uniform cache/registry design is needed.

@ondrejmirtes
Copy link
Contributor Author

PHPStan's use-case is that DirectorySourceLocator is used for analysed paths. It's possible there's no other way to analyse them, there might not be any other way to discover the analysed symbols. This is how the whole chain of source locators is constructed in PHPStan: https://github.com/phpstan/phpstan-src/blob/master/src/Reflection/BetterReflection/BetterReflectionSourceLocatorFactory.php#L101-L147

@Ocramius
Copy link
Member

I'm OK with that, I'm saying that using alternative implementations using different strategies (new code) is preferable to adapting the pre-existing one 👍

Effectively, the pattern is always the same: if you have "seen" something, even if not interesting, you want to keep it memoized for later. We need a decent design for that, which works at the core of the library 👍

@Ocramius
Copy link
Member

if you have "seen" something, even if not interesting, you want to keep it memoized for later.

IMO related to #615

I find that the current approach of "build an AST Locator from a composer.json" works quite well, so I'd defer memoization to post-5.x release, since #772 made it much simpler to implement memoization (also for internals)

@Ocramius
Copy link
Member

Closing here: IMO, the DirectoriesSourceLocator covers its niche quite alright, and the composer-based source locator is better for composer-friendly projects.

@Ocramius Ocramius self-assigned this Oct 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants