Mitigate errors in reporting grammars that can cause the parser to run indefinitely #848
Thanks and congrats on your first PR here! 🎆 Can you add new tests for the bugs that were fixed in this PR (if they aren't covered by existing tests)?
I found a new bug and added it to the list above.
looking good -- just a few small fmt/linter fixes: https://github.com/pest-parser/pest/actions/runs/4824023698/jobs/8596311317?pr=848#step:4:10
Can you run `cargo fmt --all` and `cargo clippy --all --fix`? That should fix them automatically; otherwise you can read the report and apply the suggestions.
The MSRV error seems unrelated; I opened this PR for it: #850
Fixes #830 #571
The bugs listed above are now fixed, but new ones were found.
The first change I made was fixing a typo at this line:
Since the keywords are the uppercase ones, and I don't think keywords are case-insensitive, I assume this was only a typo, but I'm not sure about that.
Since `pest_meta::validator::is_non_failing` and `pest_meta::validator::is_non_progressing` are very similar, I fixed/analyzed both of them.

All the edge cases (I hope) in those two functions have been either handled or commented. Here is a summary of the cases that were only commented (marked with `BUG` if there is an example of a bug, or with `WARNING` otherwise) and should be addressed:

- `POP_ALL`, `PEEK_ALL`, `PEEK_SLICE` being both non-failing and non-consuming when the stack is empty (or too small for `PEEK_SLICE`)
- `is_non_failing` due to the negative lookahead: if the inner expression always fails, the negative lookahead never fails.
- `is_non_progressing`: the rule `r = @{PUSH("") ~ POP}` is both non-progressing and non-failing but is not detected, and causes the parser to run indefinitely on the rule `fe = @{(PUSH("")~POP)*}` or to never reach the second choice of `unreachable = @{(PUSH("")~POP)|"other"}`. This bug is not commented in the code.
- `is_non_progressing`: the lookahead expressions don't consume the input, but they might consume the stack if instructions such as `POP` or `DROP` are present inside. This could be seen as progress: the stack length strictly decreases, converging towards the end of parsing. The problem is that the parser does not follow this behavior.
- `is_non_failing`. The rules have been checked to be defined earlier (with `validate_undefined`), but since this shouldn't happen we might want to panic or return an error.
- `is_non_progressing` and `is_non_failing` in the other validation steps, since not all the assumptions the two functions make are fulfilled

I commented and justified most of the choices because those two functions can be quite tricky (to me at least), and to allow others to verify those choices.
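To make the `PUSH("") ~ POP` case from the list above concrete, here is a minimal, self-contained sketch. It does not use `pest_meta`: the `Expr` type, `naive_non_progressing`, `run`, and `rule_r` are hypothetical names invented for this illustration, and the interpreter covers only the four constructs needed. It shows how a static check that conservatively treats stack operations as progressing can miss a rule that, at run time, succeeds without consuming any input.

```rust
// Toy model for illustration only; this is NOT pest_meta's real AST or validator.
enum Expr {
    Str(&'static str),         // literal match
    Seq(Box<Expr>, Box<Expr>), // a ~ b
    Push(Box<Expr>),           // PUSH(a)
    Pop,                       // POP
}

/// Naive static check (assumption: stack operations are treated as
/// progressing, because what POP matches is only known at run time).
fn naive_non_progressing(e: &Expr) -> bool {
    match e {
        Expr::Str(s) => s.is_empty(),
        Expr::Seq(a, b) => naive_non_progressing(a) && naive_non_progressing(b),
        Expr::Push(_) | Expr::Pop => false,
    }
}

/// Tiny interpreter: returns the new position on success, None on failure.
/// It does not restore the stack on failure, which is enough for this demo.
fn run<'i>(e: &Expr, input: &'i str, pos: usize, stack: &mut Vec<&'i str>) -> Option<usize> {
    match e {
        Expr::Str(s) => input[pos..].starts_with(*s).then(|| pos + s.len()),
        Expr::Seq(a, b) => {
            let mid = run(a, input, pos, stack)?;
            run(b, input, mid, stack)
        }
        Expr::Push(inner) => {
            let end = run(inner, input, pos, stack)?;
            stack.push(&input[pos..end]); // remember the matched slice
            Some(end)
        }
        Expr::Pop => {
            let top = stack.pop()?; // fails on an empty stack
            input[pos..].starts_with(top).then(|| pos + top.len())
        }
    }
}

/// Toy equivalent of `r = @{PUSH("") ~ POP}`.
fn rule_r() -> Expr {
    Expr::Seq(
        Box::new(Expr::Push(Box::new(Expr::Str("")))),
        Box::new(Expr::Pop),
    )
}

fn main() {
    let r = rule_r();
    let mut stack = Vec::new();
    // The static check does not flag the rule...
    println!("flagged as non-progressing: {}", naive_non_progressing(&r));
    // ...yet it succeeds at position 0 without moving forward.
    println!("match result: {:?}", run(&r, "abc", 0, &mut stack));
}
```

In this toy model, wrapping `rule_r()` in a repetition would loop forever unless the repetition itself checked for progress, which is the kind of behavior the validator is meant to rule out ahead of time.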
There might be considerable performance gains from running the validators (or at least these two validation steps) on all the rules while updating metadata (e.g. flags for `is_non_failing` and `is_non_progressing`). This way we avoid checking the same expressions multiple times. I can give more detail on those changes later.

It seems strange to propose changes knowing that there are bugs, but since many others are being fixed, and those bugs were already there (unnoticed), I hope they will be accepted.
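As a rough illustration of the metadata idea, the sketch below caches a per-rule `non_failing` flag so that each rule body is analyzed at most once. Everything in it is an assumption made for the example: the toy `Expr` type, the `Flags` struct, `sample_rules`, and the simplified non-failing logic are hypothetical and do not reflect how `pest_meta` is actually structured.

```rust
use std::collections::HashMap;

// Toy AST for illustration only; NOT pest_meta's real types.
enum Expr {
    Str(&'static str),
    Ident(&'static str), // reference to another rule
    Opt(Box<Expr>),
    Seq(Box<Expr>, Box<Expr>),
    Choice(Box<Expr>, Box<Expr>),
}

struct Flags<'a> {
    rules: &'a HashMap<&'static str, Expr>,
    non_failing: HashMap<&'static str, bool>, // cached flag per rule
    evaluations: usize,                       // how many rule bodies were analyzed
}

impl<'a> Flags<'a> {
    fn rule_non_failing(&mut self, name: &'static str) -> bool {
        if let Some(&known) = self.non_failing.get(name) {
            return known; // already computed: no re-analysis
        }
        // Pessimistic placeholder so that recursive rules terminate.
        self.non_failing.insert(name, false);
        self.evaluations += 1;
        let rules = self.rules; // copy the shared reference out of `self`
        let result = match rules.get(name) {
            Some(expr) => self.expr_non_failing(expr),
            None => false, // undefined rule: assumed to fail in this sketch
        };
        self.non_failing.insert(name, result);
        result
    }

    fn expr_non_failing(&mut self, e: &Expr) -> bool {
        match e {
            Expr::Str(s) => s.is_empty(),
            Expr::Ident(name) => self.rule_non_failing(*name),
            Expr::Opt(_) => true,
            Expr::Seq(a, b) => self.expr_non_failing(a) && self.expr_non_failing(b),
            Expr::Choice(a, b) => self.expr_non_failing(a) || self.expr_non_failing(b),
        }
    }
}

/// Three small rules: `b` refers to `a`, which refers to `ws` twice.
fn sample_rules() -> HashMap<&'static str, Expr> {
    let mut rules = HashMap::new();
    rules.insert("ws", Expr::Opt(Box::new(Expr::Str(" "))));
    rules.insert(
        "a",
        Expr::Seq(Box::new(Expr::Ident("ws")), Box::new(Expr::Ident("ws"))),
    );
    rules.insert(
        "b",
        Expr::Choice(Box::new(Expr::Str("x")), Box::new(Expr::Ident("a"))),
    );
    rules
}

fn main() {
    let rules = sample_rules();
    let mut flags = Flags { rules: &rules, non_failing: HashMap::new(), evaluations: 0 };
    for name in ["b", "a", "ws"] {
        println!("{name}: non-failing = {}", flags.rule_non_failing(name));
    }
    println!("rule bodies analyzed: {}", flags.evaluations);
}
```

Without the cache, analyzing `b`, `a`, and `ws` in turn would walk the body of `ws` five times; with it, each of the three bodies is analyzed once. The pessimistic placeholder is a simplification: a real implementation would need a more careful treatment of recursive rules.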