Added check to ignore semicolon in parser #913

AngelOnFira · 2020-10-24T15:36:30Z

This Pull Request closes #892.

It changes the following:

Empty statements are now parsed without error
Semicolons are no longer skipped at the end of the parse loop

In the original issue, it was mentioned that there might be a problem with empty StatementLists as well. I couldn't find a case where it didn't work as is, but I still brought the semicolon skip up a little higher.

I'm not sure if any specific tests should be written for this, but if so I can add them.

This PR improves conformance by 0.07% 😄

codecov · 2020-10-24T15:45:17Z

Codecov Report

Merging #913 into master will increase coverage by 0.03%.
The diff coverage is 76.44%.

@@            Coverage Diff             @@
##           master     #913      +/-   ##
==========================================
+ Coverage   59.18%   59.21%   +0.03%     
==========================================
  Files         166      166              
  Lines       10515    10554      +39     
==========================================
+ Hits         6223     6250      +27     
- Misses       4292     4304      +12

Impacted Files	Coverage Δ
boa/examples/classes.rs	`0.00% <0.00%> (ø)`
boa/src/builtins/boolean/mod.rs	`30.00% <0.00%> (ø)`
boa/src/builtins/map/ordered_map.rs	`63.63% <ø> (ø)`
boa/src/class.rs	`0.00% <0.00%> (ø)`
.../src/environment/declarative_environment_record.rs	`38.66% <ø> (ø)`
boa/src/environment/function_environment_record.rs	`35.10% <ø> (ø)`
boa/src/environment/global_environment_record.rs	`27.95% <ø> (ø)`
boa/src/environment/object_environment_record.rs	`23.07% <ø> (ø)`
boa/src/exec/mod.rs	`57.14% <ø> (ø)`
boa/src/object/mod.rs	`46.45% <ø> (ø)`
... and 85 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4eb2ed4...5d0ab83. Read the comment docs.

Lan2u

Looks really good! Just a couple of minor things which might be nice to include :)

Some tests showing the empty statement parsing etc. would also be nice - if the code is going to change later I'd argue they are even more valuable

boa/src/syntax/parser/statement/mod.rs

croraf · 2020-10-26T10:23:39Z

I suggested on Discord to split this into two PRs. This one with a parse method refactor. And the new one with the EmptyStatement correct parsing solution.

For the second one the idea is not to skip the semicolon but to parse it into EmptyStatement node, as does esprima.org parser:

So I would move the following https://github.com/boa-dev/boa/pull/913/files#diff-42cd36e93bb156b2ab7a6277865a1b01d012dbc15af014ef25492f43b1343227R281-R284 down to the Statement.parse (and replace it with EmptyStatement creation)

Razican · 2020-10-26T10:35:27Z

I will review this as soon as I have time, hopefully today.

I suggested on Discord to split this into two PRs. This one with a parse method refactor. And the new one with the EmptyStatement correct parsing solution.

For the second one the idea is not to skip the semicolon but to parse it into EmptyStatement node, as does esprima.org parser

This makes sense from a Parser-only perspective, but the issue with this is that adds extra allocations in the StatementList, by adding some EmptyStatement nodes that will need to be dynamically called run() during the execution, which will be just a no-op, but will increase cache misses and memory usage, note that all node variants have the same size in memory.

The current version just ignores EmptyStatements, and will not add them to an StatementList, which is far more efficient. And having an EmptyStatement in the StatementList doesn't really provide any useful information. The only use case I can think of would be if we are trying to re-generate the original code, but we are already ignoring comments, for example, so this is currently not possible, and I think it should be the application who extracts the original code again if needed.

In my opinion we shouldn't add an EmptyStatement parsing and node, for performance reasons.

croraf · 2020-10-26T10:49:34Z

I will review this as soon as I have time, hopefully today.

I suggested on Discord to split this into two PRs. This one with a parse method refactor. And the new one with the EmptyStatement correct parsing solution.
For the second one the idea is not to skip the semicolon but to parse it into EmptyStatement node, as does esprima.org parser

This makes sense from a Parser-only perspective, but the issue with this is that adds extra allocations in the StatementList, by adding some EmptyStatement nodes that will need to be dynamically called run() during the execution, which will be just a no-op, but will increase cache misses and memory usage, note that all node variants have the same size in memory.

The current version just ignores EmptyStatements, and will not add them to an StatementList, which is far more efficient. And having an EmptyStatement in the StatementList doesn't really provide any useful information. The only use case I can think of would be if we are trying to re-generate the original code, but we are already ignoring comments, for example, so this is currently not possible, and I think it should be the application who extracts the original code again if needed.

In my opinion we shouldn't add an EmptyStatement parsing and node, for performance reasons.

We will then not conform to the parsing and evaluation described in the ES, which possibly introduces errors and makes it harder to follow.

Razican

I like this new implementation, should solve the issue we are having with empty statements :) Just check my comments to see if we can improve it further.

Razican · 2020-10-26T14:48:52Z

boa/src/syntax/parser/mod.rs

@@ -139,6 +139,9 @@ where
    }
 }

+/// The possible TokenKind which indicate the end of a case statement.
+const SCRIPT_BREAK_TOKENS: [TokenKind; 1] = [TokenKind::Punctuator(Punctuator::CloseBlock)];


Is this defined somewhere? I believe a script will never end with a }, right? I think this list should be empty, and in that case, we should add the exception to the parser so that it doesn't fail with an AbruptEnd error.

No, I just added it cause I thought I needed to. I'm going to reassess each one and see what it's proper closing should be.

Ok, I think this should be resolved. I'm not sure I understand what an exception should be, I ran the tests and I'm getting the same percentage. Can you give me an example of where this might occur?

boa/src/syntax/parser/statement/mod.rs

Razican · 2020-10-28T13:31:46Z

After some discussion on discord, it seems that EmptyStatements are indeed important, and they will be implemented fully in the future, as @croraf mentions. For now, I would merge this as soon as the review comments are solved.

croraf · 2020-10-28T13:41:01Z

Can we merge just the refactor part (which is the majority of this PR), without the semicolon part? (after resolving the comments ofc)

AngelOnFira · 2020-10-28T17:03:00Z

Sorry @croraf I'll be working through the comments today, just got hit with a ton of work this week 😅

croraf · 2020-10-28T17:05:48Z

@AngelOnFira No hurry, we were just discussing.

…script end token

AngelOnFira · 2020-11-04T15:53:50Z

Ok @croraf I've reverted the semicolon changes for this PR. @Razican I believe I've addressed all of your comments, but a few need a bit more discussion to confirm that they are done. Please let me know if there is anything else to take care of! 😄

croraf · 2020-11-04T15:56:56Z

Cool. After this bigger addressing you can make a new PR for the semicolon changes if you wish, it should be very easy to solve that.

AngelOnFira · 2020-11-04T16:11:22Z

Oops still have to make the tests :P

AngelOnFira · 2020-11-04T17:21:13Z

Actually I think the tests don't apply here since this is just a refactor.

AngelOnFira · 2020-11-08T14:45:28Z

@Razican @Lan2u Just want to check in. I think this PR is ready to go, but please ping me if anything else needs to be done 😄

Razican

Looks good, thanks!

Test262 conformance changes:

Test result	master count	PR count	difference
Total	78,415	78,415	0
Passed	18,945	18,945	0
Ignored	15,547	15,547	0
Failed	43,923	43,923	0
Panics	1,127	1,127	0
Conformance	24.16	24.16	0.00%

croraf · 2020-11-08T18:11:33Z

@AngelOnFira Gj. Are you willing to make another PR with the semicolon issue fix?

AngelOnFira · 2020-11-08T21:07:27Z

@croraf ya, I'll try to do that now that this is merged.

Added check to ignore semicolon in parser

6f8663b

AngelOnFira force-pushed the implement-empty-statements branch from 1e67661 to 6f8663b Compare October 24, 2020 16:08

Combined parse and parse_general functions for StatementList

7f81172

RageKnify requested review from Lan2u and Razican October 25, 2020 23:17

Lan2u reviewed Oct 26, 2020

View reviewed changes

boa/src/syntax/parser/statement/mod.rs Outdated Show resolved Hide resolved

Razican reviewed Oct 26, 2020

View reviewed changes

AngelOnFira force-pushed the implement-empty-statements branch from baae916 to d59e1bc Compare November 4, 2020 15:47

Change to static lifetime, remove break when closing braces, removed …

544e194

…script end token

AngelOnFira force-pushed the implement-empty-statements branch from d59e1bc to 544e194 Compare November 4, 2020 15:49

Revert semicolon cursor skip

5d0ab83

Lan2u approved these changes Nov 8, 2020

View reviewed changes

Razican approved these changes Nov 8, 2020

View reviewed changes

Razican merged commit 0a0c230 into boa-dev:master Nov 8, 2020

AngelOnFira deleted the implement-empty-statements branch November 8, 2020 21:07

RageKnify added this to the v0.11.0 milestone Jan 10, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added check to ignore semicolon in parser #913

Added check to ignore semicolon in parser #913

AngelOnFira commented Oct 24, 2020

codecov bot commented Oct 24, 2020 •

edited

Loading

Lan2u left a comment •

edited

Loading

croraf commented Oct 26, 2020 •

edited

Loading

Razican commented Oct 26, 2020

croraf commented Oct 26, 2020

Razican left a comment •

edited

Loading

Razican Oct 26, 2020

AngelOnFira Nov 4, 2020

AngelOnFira Nov 4, 2020

Razican commented Oct 28, 2020

croraf commented Oct 28, 2020 •

edited

Loading

AngelOnFira commented Oct 28, 2020

croraf commented Oct 28, 2020

AngelOnFira commented Nov 4, 2020

croraf commented Nov 4, 2020

AngelOnFira commented Nov 4, 2020

AngelOnFira commented Nov 4, 2020

AngelOnFira commented Nov 8, 2020

Razican left a comment

croraf commented Nov 8, 2020

AngelOnFira commented Nov 8, 2020

Added check to ignore semicolon in parser #913

Added check to ignore semicolon in parser #913

Conversation

AngelOnFira commented Oct 24, 2020

codecov bot commented Oct 24, 2020 • edited Loading

Codecov Report

Lan2u left a comment • edited Loading

Choose a reason for hiding this comment

croraf commented Oct 26, 2020 • edited Loading

Razican commented Oct 26, 2020

croraf commented Oct 26, 2020

Razican left a comment • edited Loading

Choose a reason for hiding this comment

Razican Oct 26, 2020

Choose a reason for hiding this comment

AngelOnFira Nov 4, 2020

Choose a reason for hiding this comment

AngelOnFira Nov 4, 2020

Choose a reason for hiding this comment

Razican commented Oct 28, 2020

croraf commented Oct 28, 2020 • edited Loading

AngelOnFira commented Oct 28, 2020

croraf commented Oct 28, 2020

AngelOnFira commented Nov 4, 2020

croraf commented Nov 4, 2020

AngelOnFira commented Nov 4, 2020

AngelOnFira commented Nov 4, 2020

AngelOnFira commented Nov 8, 2020

Razican left a comment

Choose a reason for hiding this comment

Test262 conformance changes:

croraf commented Nov 8, 2020

AngelOnFira commented Nov 8, 2020

codecov bot commented Oct 24, 2020 •

edited

Loading

Lan2u left a comment •

edited

Loading

croraf commented Oct 26, 2020 •

edited

Loading

Razican left a comment •

edited

Loading

croraf commented Oct 28, 2020 •

edited

Loading