-
Notifications
You must be signed in to change notification settings - Fork 79
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Use a non-regex parser #138
Labels
Comments
This comment has been minimized.
This comment has been minimized.
Closed
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This was referenced Nov 8, 2017
This comment has been minimized.
This comment has been minimized.
epage
added a commit
to epage/liquid-rust
that referenced
this issue
Jan 20, 2018
Fixes cobalt-org#105, mostly. You get enough context. This is also more universal (hard to track line numbers in every context). Any further improvements will be a part of cobalt-org#138. This locks mostly locks down the error API, minus `FilterError` which won't happen until filters get refactored. See cobalt-org#114.
This was referenced Jan 20, 2018
Looks like nom can take in some amount of state. How much we can "plugin" other parsers, no idea. |
Merged
Goncalerta
added a commit
to Goncalerta/liquid-rust
that referenced
this issue
Nov 30, 2018
Replace the old regex-based parser with a pest-based one. Introduce line/column context in some (syntax) errors (see cobalt-org#232). closes cobalt-org#138 fixes cobalt-org#145 fixes cobalt-org#226 fixes cobalt-org#227 fixes cobalt-org#242 BREAKING CHANGES Behavior - Expressions no longer support tags (they weren't supposed to) - More strictness in tokens accepted (Tags will raise an error if given a surplus of arguments. This is to alert the user for possible mistakes) API - Changed signature for tags and blocks - `compiler::parse` takes a `&str` directly as an argument, instead of requiring the midstep of `tokenize`
Re-openeing because I want to consider this topic some more. |
3 tasks
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
This is a possible enhancement that could make it easier to address some of our bugs and missing features.
The downside is that shopofy/liquid uses regex iirc so it'll be more difficult to follow their grammar. I have heard that there are some differences in the regex engines that already cause some problems.
Goals
{% include {{ page.my_variable }} %}
) to our Jekyll-styleinclude
#142)Background
There are really to main issues with the current parser
The parser right now is cobbled together functions that have too much logic in them, causing others to duplicate it. For example, the for-block has named arguments but not filters, because we don't have reusable argument parsing. The parser is also inconsistent on how it consumes tokens. A consistent, composible model would be a big help.
Solutions
Patch up the current approach.
The parser could be made more maintainable while keeping the current token stream if desired. We could make combinators for token processing, that return where they left off parsing (maybe even on failure like nom's IResult). We can then refactor all of our parsers into using these combinators so we can have an easier to maintain code base.
Parser comparison
Benchmarks
Good
Bad
Line numbers
Good:
nom_locate
Composability
Can plugins reuse the grammar without centralizing all token processing which also hurts correctness?
Good:
Bad:
Partial parsing
An idea to improve performance and API stability.
Good:
Maintenance
Bad
Brittleness
Bad
Ergonomics and Docs
Good
Mixed
The text was updated successfully, but these errors were encountered: