Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added check to ignore semicolon in parser #913

Merged
merged 4 commits into from
Nov 8, 2020

Conversation

AngelOnFira
Copy link
Contributor

This Pull Request closes #892.

It changes the following:

  • Empty statements are now parsed without error
  • Semicolons are no longer skipped at the end of the parse loop

In the original issue, it was mentioned that there might be a problem with empty StatementLists as well. I couldn't find a case where it didn't work as is, but I still brought the semicolon skip up a little higher.

I'm not sure if any specific tests should be written for this, but if so I can add them.

This PR improves conformance by 0.07% 😄

@codecov
Copy link

codecov bot commented Oct 24, 2020

Codecov Report

Merging #913 into master will increase coverage by 0.03%.
The diff coverage is 76.44%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #913      +/-   ##
==========================================
+ Coverage   59.18%   59.21%   +0.03%     
==========================================
  Files         166      166              
  Lines       10515    10554      +39     
==========================================
+ Hits         6223     6250      +27     
- Misses       4292     4304      +12     
Impacted Files Coverage Δ
boa/examples/classes.rs 0.00% <0.00%> (ø)
boa/src/builtins/boolean/mod.rs 30.00% <0.00%> (ø)
boa/src/builtins/map/ordered_map.rs 63.63% <ø> (ø)
boa/src/class.rs 0.00% <0.00%> (ø)
.../src/environment/declarative_environment_record.rs 38.66% <ø> (ø)
boa/src/environment/function_environment_record.rs 35.10% <ø> (ø)
boa/src/environment/global_environment_record.rs 27.95% <ø> (ø)
boa/src/environment/object_environment_record.rs 23.07% <ø> (ø)
boa/src/exec/mod.rs 57.14% <ø> (ø)
boa/src/object/mod.rs 46.45% <ø> (ø)
... and 85 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4eb2ed4...5d0ab83. Read the comment docs.

@AngelOnFira AngelOnFira force-pushed the implement-empty-statements branch from 1e67661 to 6f8663b Compare October 24, 2020 16:08
@RageKnify RageKnify requested review from Lan2u and Razican October 25, 2020 23:17
Copy link

@Lan2u Lan2u left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks really good! Just a couple of minor things which might be nice to include :)

Some tests showing the empty statement parsing etc. would also be nice - if the code is going to change later I'd argue they are even more valuable

boa/src/syntax/parser/statement/mod.rs Outdated Show resolved Hide resolved
@croraf
Copy link
Contributor

croraf commented Oct 26, 2020

I suggested on Discord to split this into two PRs. This one with a parse method refactor. And the new one with the EmptyStatement correct parsing solution.

For the second one the idea is not to skip the semicolon but to parse it into EmptyStatement node, as does esprima.org parser:
image

So I would move the following https://github.com/boa-dev/boa/pull/913/files#diff-42cd36e93bb156b2ab7a6277865a1b01d012dbc15af014ef25492f43b1343227R281-R284 down to the Statement.parse (and replace it with EmptyStatement creation)

@Razican
Copy link
Member

Razican commented Oct 26, 2020

I will review this as soon as I have time, hopefully today.

I suggested on Discord to split this into two PRs. This one with a parse method refactor. And the new one with the EmptyStatement correct parsing solution.

For the second one the idea is not to skip the semicolon but to parse it into EmptyStatement node, as does esprima.org parser

This makes sense from a Parser-only perspective, but the issue with this is that adds extra allocations in the StatementList, by adding some EmptyStatement nodes that will need to be dynamically called run() during the execution, which will be just a no-op, but will increase cache misses and memory usage, note that all node variants have the same size in memory.

The current version just ignores EmptyStatements, and will not add them to an StatementList, which is far more efficient. And having an EmptyStatement in the StatementList doesn't really provide any useful information. The only use case I can think of would be if we are trying to re-generate the original code, but we are already ignoring comments, for example, so this is currently not possible, and I think it should be the application who extracts the original code again if needed.

In my opinion we shouldn't add an EmptyStatement parsing and node, for performance reasons.

@croraf
Copy link
Contributor

croraf commented Oct 26, 2020

I will review this as soon as I have time, hopefully today.

I suggested on Discord to split this into two PRs. This one with a parse method refactor. And the new one with the EmptyStatement correct parsing solution.
For the second one the idea is not to skip the semicolon but to parse it into EmptyStatement node, as does esprima.org parser

This makes sense from a Parser-only perspective, but the issue with this is that adds extra allocations in the StatementList, by adding some EmptyStatement nodes that will need to be dynamically called run() during the execution, which will be just a no-op, but will increase cache misses and memory usage, note that all node variants have the same size in memory.

The current version just ignores EmptyStatements, and will not add them to an StatementList, which is far more efficient. And having an EmptyStatement in the StatementList doesn't really provide any useful information. The only use case I can think of would be if we are trying to re-generate the original code, but we are already ignoring comments, for example, so this is currently not possible, and I think it should be the application who extracts the original code again if needed.

In my opinion we shouldn't add an EmptyStatement parsing and node, for performance reasons.

We will then not conform to the parsing and evaluation described in the ES, which possibly introduces errors and makes it harder to follow.

Copy link
Member

@Razican Razican left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this new implementation, should solve the issue we are having with empty statements :) Just check my comments to see if we can improve it further.

@@ -139,6 +139,9 @@ where
}
}

/// The possible TokenKind which indicate the end of a case statement.
const SCRIPT_BREAK_TOKENS: [TokenKind; 1] = [TokenKind::Punctuator(Punctuator::CloseBlock)];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this defined somewhere? I believe a script will never end with a }, right? I think this list should be empty, and in that case, we should add the exception to the parser so that it doesn't fail with an AbruptEnd error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I just added it cause I thought I needed to. I'm going to reassess each one and see what it's proper closing should be.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think this should be resolved. I'm not sure I understand what an exception should be, I ran the tests and I'm getting the same percentage. Can you give me an example of where this might occur?

boa/src/syntax/parser/statement/mod.rs Outdated Show resolved Hide resolved
boa/src/syntax/parser/statement/mod.rs Outdated Show resolved Hide resolved
@Razican
Copy link
Member

Razican commented Oct 28, 2020

After some discussion on discord, it seems that EmptyStatements are indeed important, and they will be implemented fully in the future, as @croraf mentions. For now, I would merge this as soon as the review comments are solved.

@croraf
Copy link
Contributor

croraf commented Oct 28, 2020

Can we merge just the refactor part (which is the majority of this PR), without the semicolon part? (after resolving the comments ofc)

@AngelOnFira
Copy link
Contributor Author

Sorry @croraf I'll be working through the comments today, just got hit with a ton of work this week 😅

@croraf
Copy link
Contributor

croraf commented Oct 28, 2020

@AngelOnFira No hurry, we were just discussing.

@AngelOnFira AngelOnFira force-pushed the implement-empty-statements branch from baae916 to d59e1bc Compare November 4, 2020 15:47
@AngelOnFira AngelOnFira force-pushed the implement-empty-statements branch from d59e1bc to 544e194 Compare November 4, 2020 15:49
@AngelOnFira
Copy link
Contributor Author

Ok @croraf I've reverted the semicolon changes for this PR. @Razican I believe I've addressed all of your comments, but a few need a bit more discussion to confirm that they are done. Please let me know if there is anything else to take care of! 😄

@croraf
Copy link
Contributor

croraf commented Nov 4, 2020

Cool. After this bigger addressing you can make a new PR for the semicolon changes if you wish, it should be very easy to solve that.

@AngelOnFira
Copy link
Contributor Author

Oops still have to make the tests :P

@AngelOnFira
Copy link
Contributor Author

Actually I think the tests don't apply here since this is just a refactor.

@AngelOnFira
Copy link
Contributor Author

@Razican @Lan2u Just want to check in. I think this PR is ready to go, but please ping me if anything else needs to be done 😄

Copy link
Member

@Razican Razican left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks!

Test262 conformance changes:

Test result master count PR count difference
Total 78,415 78,415 0
Passed 18,945 18,945 0
Ignored 15,547 15,547 0
Failed 43,923 43,923 0
Panics 1,127 1,127 0
Conformance 24.16 24.16 0.00%

@Razican Razican merged commit 0a0c230 into boa-dev:master Nov 8, 2020
@croraf
Copy link
Contributor

croraf commented Nov 8, 2020

@AngelOnFira Gj. Are you willing to make another PR with the semicolon issue fix?

@AngelOnFira
Copy link
Contributor Author

@croraf ya, I'll try to do that now that this is merged.

@AngelOnFira AngelOnFira deleted the implement-empty-statements branch November 8, 2020 21:07
@RageKnify RageKnify added this to the v0.11.0 milestone Jan 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Empty statements not fully implemented
5 participants