Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use generators #43

Merged
merged 1 commit into from
Apr 18, 2020
Merged

Use generators #43

merged 1 commit into from
Apr 18, 2020

Conversation

greg0ire
Copy link
Member

My goal with the second commit was to use generators for getFilteredTokens, and then I realized it would not be possible: there are lookaheads up to 250 tokens ahead to decide whether to stay inline or to add linebreaks. It made me move a lot of logic to the value objects though, and get rid of the $indexedTokens ugly array.

@greg0ire greg0ire force-pushed the use-generators branch 2 times, most recently from bda68e6 to 63dbfd4 Compare April 12, 2020 16:19
@greg0ire
Copy link
Member Author

Actually I'm going to remove the second commit (for now): I'm getting really weird behavior, and I think it might have to do with the caching that takes place in the Tokenizer class.

@greg0ire greg0ire force-pushed the use-generators branch 2 times, most recently from 906ae71 to 4030a11 Compare April 12, 2020 18:53
@greg0ire
Copy link
Member Author

The cache is removed with #44 , so I'm pushing the commit that introduces awareness of previous and next tokens again

@greg0ire greg0ire force-pushed the use-generators branch 3 times, most recently from 10508df to c2262a6 Compare April 12, 2020 19:00
src/Token.php Outdated
{
$this->type = $type;
$this->value = $value;
if ($previous !== null) {
$previous->next = $this;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that a Token is really valid only if there is no token after it, or if you just finished building that token. Combined with generators, this produces bugs, and that's why I couldn't use one for getFilteredTokens… this might be a bad idea. Right now the program works because calls are made carefully, but this is flawed I'm afraid. I think I am going to have to rethink all of this.

Copy link
Member Author

@greg0ire greg0ire Apr 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I fixed it now, with exceptions that prevent rewiring or accessing next - related features until they are available.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keeping as a draft until #45 is merged

@greg0ire greg0ire marked this pull request as draft April 12, 2020 19:22
@greg0ire greg0ire force-pushed the use-generators branch 2 times, most recently from c639284 to 05128b3 Compare April 14, 2020 16:17
@greg0ire greg0ire marked this pull request as ready for review April 14, 2020 16:17
Copy link
Member

@goetas goetas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code wise the pull request looks good.

However getFilteredTokens makes in vain the use of generators on the Tokenizer as all the tokens end up in an array.

Another note is how complex is the token object. It has the knowledge of neighbors an more. I understand that the library needs that knowledge, but maybe this is the wrong solution for the problem.

In Masterminds/html5-php, another library I maintain, the Tokenizer https://github.com/Masterminds/html5-php/blob/master/src/HTML5/Parser/Tokenizer.php does not give you a way to get all the tokens, but it offers some methods to get the current, look around and move the cursor. With that method, all the tokenization work is done there.
The consumer class has no knowledge on how the tokens are stored internally.

I think that is a much cleaner approach. (of course it requires a bit more Refactoring on this lib to implement something similar)

@greg0ire
Copy link
Member Author

I spent a crazy amount of time last week-end on this, and it was quite satisfying and fun, but I have the same feeling as you do: the usage of generators is in vain, it brings some clarity and satisfaction in the Tokenizer, but at the price of some danger, that I tried to mitigate with the exceptions in the latest version. I think you are right, and I'm going to have a look at that other library. I'm going to drop the 2 first commits of this PR (I think we can all agree the last one is ok).

@greg0ire
Copy link
Member Author

Not sure why Psalm fails, I think there is a bug with the stub it has for array_keys (I observed it yesterday while contributing to Psalm).

goetas
goetas previously approved these changes Apr 18, 2020
Copy link
Member

@goetas goetas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm ok with the changes, I leave it up to you on how to fix psalm

@greg0ire greg0ire merged commit cb57b3f into doctrine:master Apr 18, 2020
@greg0ire greg0ire deleted the use-generators branch April 18, 2020 06:30
greg0ire added a commit to greg0ire/dbal that referenced this pull request Apr 18, 2020
I just got bit by a bug on master at
doctrine/sql-formatter#43 (comment),
and there are good chances we have the bug on the Dbal too. Anyway, it
only makes sense to use the same version as in the vendors.
greg0ire added a commit to greg0ire/dbal that referenced this pull request Apr 18, 2020
I just got bitten by a bug on master at
doctrine/sql-formatter#43 (comment),
and there are good chances we have the bug on the Dbal too. Anyway, it
only makes sense to use the same version as in the vendors.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants