Mitigate errors in reporting grammars that can cause the parser to run indefinetely #848

Tartasprint · 2023-04-26T23:13:50Z

The bugs up there are now fixed. But new were found.

The first change I made was fixing a typo at this line:

if ident == "soi" || ident == "eoi" { => if ident == "SOI" || ident == "EOI" {

Since the keywords are the uppercase ones, and that I don't think keywords are case insensitive, this was only a typo. But I'm not sure about that.

Since pest_meta::validator::is_non_failing and pest_meta::validator::is_non_progressing are very similar I fixed/analyzed both of them.
All the edge cases (I hope) in those two functions have been either treated or commented. Here is a summary of the cases that were only commented (marked either by BUG if there is an example of a bug or by WARNING otherwise) and should be addressed:

4 bugs due to POP_ALL, PEEK_ALL, PEEK_SLICE being both non failing and non consuming when the stack is empty (or too small for PEEK_SLICE)
1 bug in is_non_failing due to the negative lookahead: in case the inner expression always fail, the negative lookahead never fails.
1 bug in is_non_progressing: the rule r = @{PUSH("") ~ POP} is both non-progressing and non-failing but is not detected, and causes the parser to run indefinitely on the rule fe = @{(PUSH("")~POP)*} or the second choice of unreachable = @{(PUSH("")~POP)|"other"}. This bug is not commented in the code.
1 warning in is_non_progressing: the lookahead expressions don't consume the input, but they might consume the stack if instructions such as POP or DROP are present inside. This might be seen as progression: the stack length decreases strictly, converging towards the end of parsing. The problem is that the parser does not follow this behavior.
2 warnings relative to recursion. Those are harder to prove correct or incorrect and need more checking.
1 warning about handling a rule that is undefined in is_non_failing. The rules have been checked to be defined earlier (with validate_undefined), but since this souldn't happen we might want to panic or to return an error.
1 warning about the current use of is_non_progressing and is_non_failing in the other validation steps, since all the assumptions the two functions make are not fullfiled

I commented and justified most of the choices because those two functions can be quite tricky (to me at least) and to allow verification of those choices by others.

There might be considerable performance enhancements by running validators (or at least these two validation steps) on all the rules, while updating meta data (e.g. flags for is_non_failing and is_non_progressing). This way we avoid checking multiple times the same expressions. I can propose/give more detail on those changes later.

It seems strange to propose changes knowing that there are bugs, but since many other are being fixed, and that those bugs were already there (unnoticed), I hope they will be accepted.

… including checking their usage

tomtau

Thanks and congrats on your first PR here! 🎆 Can you add new tests for the bugs that were fixed in this PR (if they aren't covered by existing tests)?

Tartasprint · 2023-04-27T20:20:16Z

The is_non_failing and is_non_progressing functions have now most of the cases tested.
I did not write tests for the bugs/warnings listed up there since I don't have a way for fixing them right now.

Tartasprint · 2023-04-27T22:49:42Z

I found a new bug, I added it in the list up there

1 bug in is_non_progressing: the rule r = @{PUSH("") ~ POP} is both non-progressing and non-failing but is not detected, and causes the parser to run indefinitely on the rule fe = @{(PUSH("")~POP)*} or the second choice of unreachable = @{(PUSH("")~POP)|"other"}. This bug is not commented in the code.

tomtau

looking good -- just a few small fmt/linter fixes: https://github.com/pest-parser/pest/actions/runs/4824023698/jobs/8596311317?pr=848#step:4:10
can you run cargo fmt --all and cargo clippy --all --fix (that should automatically fix it, otherwise you can see the report and apply the suggestions)?

for the msrv error, it seems unrelated and I opened this PR: #850

Tartasprint added 3 commits April 26, 2023 15:31

fixed pest_meta::non_progressing

5c69ba2

deeper analysis of is_non_progressing

3ffe396

analysis of is_non_progressing and is_non_failing should be complete,…

a887b62

… including checking their usage

Tartasprint requested a review from a team as a code owner April 26, 2023 23:13

Tartasprint requested review from tomtau and removed request for a team April 26, 2023 23:13

tomtau requested changes Apr 26, 2023

View reviewed changes

tomtau requested review from dragostis and CAD97 April 26, 2023 23:58

written some tests and fixed some bugs

680bc6f

Tartasprint requested a review from tomtau April 27, 2023 20:21

tomtau requested changes Apr 28, 2023

View reviewed changes

formatting

0c253be

tomtau merged commit 7f8411a into pest-parser:master Apr 28, 2023

Tartasprint deleted the fix-non-progressing branch April 28, 2023 13:47

Tartasprint mentioned this pull request Apr 28, 2023

[BUG] Pest.rs hangs indefinitely with this grammar #571

Closed

tomtau mentioned this pull request Jun 21, 2023

Trivial left recursion not detected in presence of tag #875

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Mitigate errors in reporting grammars that can cause the parser to run indefinetely #848

Mitigate errors in reporting grammars that can cause the parser to run indefinetely #848

Tartasprint commented Apr 26, 2023 •

edited

Loading

tomtau left a comment •

edited

Loading

Tartasprint commented Apr 27, 2023

Tartasprint commented Apr 27, 2023

tomtau left a comment

Mitigate errors in reporting grammars that can cause the parser to run indefinetely #848

Mitigate errors in reporting grammars that can cause the parser to run indefinetely #848

Conversation

Tartasprint commented Apr 26, 2023 • edited Loading

tomtau left a comment • edited Loading

Choose a reason for hiding this comment

Tartasprint commented Apr 27, 2023

Tartasprint commented Apr 27, 2023

tomtau left a comment

Choose a reason for hiding this comment

Tartasprint commented Apr 26, 2023 •

edited

Loading

tomtau left a comment •

edited

Loading