-
-
Notifications
You must be signed in to change notification settings - Fork 261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support left recursion syntax #533
base: master
Are you sure you want to change the base?
Conversation
There is still some problem about the loop, sometime it loops > 255 and gets panicked. |
Hi @dragostis, this PR is in concept-complete and in minimal viable state, despite–I'm sure–still not covering edge cases especially in cases' validations. But, since this PR is actually violating PEG principle "non-backtracking", I want to know what's your opinion as, I believe, this project's maintainer. |
Any chance for merge this? I want to try it, but I don't know how to configure I have imported the dependence pest = { version = "2.1", git = "https://github.com/Abdillah/pest.git" } expr = *{
"(" ~ expr ~ ")"
| expr ~ ("/"|"*") ~ expr
| expr ~ ("+"|"-") ~ expr
} error parsing
--> 40:9
|
40 | expr = *{␊
| ^---
|
= expected `{`, `_`, `@`, `$`, or `!`
--> 42:5
|
42 | | expr ~ ("*"|"/") ~ expr␊
| ^--^
|
= rule expr is left-recursive (expr -> expr); pest::prec_climber might be useful in this case
error: failed to run custom build command for `pest_meta v2.1.3 (https://github.com/Abdillah/pest.git?branch=leftrecursive#6390c304)` |
It seems that two branches cannot be both rec. program = {SOI ~ statement* ~ EOI}
statement = { expr ~ ";"? }
expr = *{
expr ~ "*" ~ expr
| expr ~ "+" ~ expr
| integer
}
integer = @{"0"|ASCII_NONZERO_DIGIT ~ ("_"? ~ ASCII_DIGIT)*}
WHITESPACE = ${WHITE_SPACE|NEWLINE} when input attempt to add with overflow
thread 'item' panicked at 'attempt to add with overflow', D:\Rust\pest\pest\src\parser_state.rs:509:17
stack backtrace:
0: std::panicking::begin_panic_handler
at /rustc/d7c97a02d1215e4ef26c31cb72dbaf16fd548b2c\/library\std\src\panicking.rs:517
1: core::panicking::panic_fmt
at /rustc/d7c97a02d1215e4ef26c31cb72dbaf16fd548b2c\/library\core\src\panicking.rs:100
2: core::panicking::panic
at /rustc/d7c97a02d1215e4ef26c31cb72dbaf16fd548b2c\/library\core\src\panicking.rs:50
3: pest::parser_state::ParserState<enum$<lists::Rule> >::recursive<enum$<lists::Rule>,lists::impl$0::parse::rules::visible::expr::closure$0::closure$0>
at D:\Rust\pest\pest\src\parser_state.rs:509
4: lists::impl$0::parse::rules::visible::expr::closure$0
at .\tests\lists.rs:15
5: pest::parser_state::ParserState<enum$<lists::Rule> >::rule<enum$<lists::Rule>,lists::impl$0::parse::rules::visible::expr::closure$0>
at D:\Rust\pest\pest\src\parser_state.rs:225
6: lists::impl$0::parse::rules::visible::expr
at .\tests\lists.rs:15
7: lists::impl$0::parse::rules::visible::statement::closure$0::closure$0
at .\tests\lists.rs:15
8: pest::parser_state::ParserState<enum$<lists::Rule> >::sequence<enum$<lists::Rule>,lists::impl$0::parse::rules::visible::statement::closure$0::closure$0>
at D:\Rust\pest\pest\src\parser_state.rs:376
9: lists::impl$0::parse::rules::visible::statement::closure$0
at .\tests\lists.rs:15
10: pest::parser_state::ParserState<enum$<lists::Rule> >::rule<enum$<lists::Rule>,lists::impl$0::parse::rules::visible::statement::closure$0>
at D:\Rust\pest\pest\src\parser_state.rs:225
11: lists::impl$0::parse::rules::visible::statement
at .\tests\lists.rs:15
12: lists::impl$0::parse::rules::visible::program::closure$0::closure$0::closure$1::closure$0::closure$0
at .\tests\lists.rs:15
13: pest::parser_state::ParserState<enum$<lists::Rule> >::optional<enum$<lists::Rule>,lists::impl$0::parse::rules::visible::program::closure$0::closure$0::closure$1::closure$0::closure$0>
at D:\Rust\pest\pest\src\parser_state.rs:463
14: lists::impl$0::parse::rules::visible::program::closure$0::closure$0::closure$1::closure$0
at .\tests\lists.rs:15
15: pest::parser_state::ParserState<enum$<lists::Rule> >::sequence<enum$<lists::Rule>,lists::impl$0::parse::rules::visible::program::closure$0::closure$0::closure$1::closure$0>
at D:\Rust\pest\pest\src\parser_state.rs:376
16: lists::impl$0::parse::rules::visible::program::closure$0::closure$0::closure$1
at .\tests\lists.rs:15
17: enum$<core::result::Result<alloc::boxed::Box<pest::parser_state::ParserState<enum$<lists::Rule> >,alloc::alloc::Global>,alloc::boxed::Box<pest::parser_state::ParserState<enum$<lists::Rule> >,alloc::alloc::Global> > >::and_then<alloc::boxed::Box<pest::pars
at /rustc/d7c97a02d1215e4ef26c31cb72dbaf16fd548b2c\library\core\src\result.rs:965
18: lists::impl$0::parse::rules::visible::program::closure$0::closure$0
at .\tests\lists.rs:15
19: pest::parser_state::ParserState<enum$<lists::Rule> >::sequence<enum$<lists::Rule>,lists::impl$0::parse::rules::visible::program::closure$0::closure$0>
at D:\Rust\pest\pest\src\parser_state.rs:376
20: lists::impl$0::parse::rules::visible::program::closure$0
at .\tests\lists.rs:15
21: pest::parser_state::ParserState<enum$<lists::Rule> >::rule<enum$<lists::Rule>,lists::impl$0::parse::rules::visible::program::closure$0>
at D:\Rust\pest\pest\src\parser_state.rs:225
22: lists::impl$0::parse::rules::visible::program
at .\tests\lists.rs:15
23: lists::impl$0::parse::closure$0
at .\tests\lists.rs:15
24: pest::parser_state::state<enum$<lists::Rule>,lists::impl$0::parse::closure$0>
at D:\Rust\pest\pest\src\parser_state.rs:87
25: lists::impl$0::parse
at .\tests\lists.rs:15
26: lists::item
at .\tests\lists.rs:22
27: lists::item::closure$0
at .\tests\lists.rs:20
28: core::ops::function::FnOnce::call_once<lists::item::closure$0,tuple$<> >
at /rustc/d7c97a02d1215e4ef26c31cb72dbaf16fd548b2c\library\core\src\ops\function.rs:227
29: core::ops::function::FnOnce::call_once
at /rustc/d7c97a02d1215e4ef26c31cb72dbaf16fd548b2c\library\core\src\ops\function.rs:227
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace. |
Well, it seems it needs some works. Thank you for testing. |
I wonder if using the algorithm described in this paper would help the looping and backtracking issues. |
Personally, I see PEG as a way to declaratively specify a parser, moreso than a grammar. This is because of its "first branch wins" behavior, which matches what you'd first write when writing a recursive descent parser. Any real use of pest as a parser generator is almost certainly going to want to post process the pest parse tree to construct an abstract syntax tree. (You can provide an AST view into a parse tree, but this is only realistically useful if you have a concrete syntax tree (basically: lossless AST), rather than a parse tree.) If we can support left recursion without the cost of extra looping/backtracking, as described in NoahTheDuke's linked paper or the Pegged wiki, I'm fine with merging support for left recursive rules, and removing the requirement that rules aren't left recursive. But if it requires backtracking, I'm against, even with colored rules. (Mainly, because how does it handle hidden/mutual/indirect left recursion?) |
I too would appreciate this kind of feature. What blocks support of this being added? |
Add support to left recursion PEG by adding marker syntax.
Inspired by this awesome paper https://www.sciencedirect.com/science/article/pii/S0167642314000288.
Background
Pest's PEG implementation not supporting left recursive syntax. So, anything that repeated on the left-most of a sequence will trigger the validation as semantic error. As a result, our syntax definition needs to be adjusted using many ways. But, there are cases where workaround is not proper at all (at least as far as I know). For example on language with first-class function.
When using pest, we need to adjust the syntax as to match,
which will cause the resulting AST far from expectation.
Changes
Supporting left-recursive syntax is explained in more details in the paper, but I want to iterate my implementation.
This PR adds recursion consume handler, rule modifier syntax, and code generation between them. The recursion consume handler is basically testing the input against the rule, if the rule checks out it will try more deeper into the recursion. If the rule at last failed, it will returns the last result from the trial. In addition, this handler also protects against infinite recursion.
Concern
The implemented recursion handler is a trial-and-error loop on the level of the recursion. So, it's violating PEG non-backtracking principle. In my opinion, because we mark the rule as recursive, this makes the user aware of this feature and maybe its consequences on performance (if any).
Checklist