Changing lexer mode from parser fails #165
Comments
TL;DR: not a jison bug but a grammar bug.
Unless I'm completely mistaken, this is very strange, as it would mean that bison would perform deep semantic analysis of the action code -- last time I looked, it didn't. What you probably mean is that, if you want this classic 'lexer hack' (better known as 'lexical tie-in'; not to be mistaken with 'the lexer hack') to work in the bison/yacc family, you can place action code blocks inside the rule (a.k.a. 'mid-rule action') at a place where you know that no look-ahead is required to reduce the rule up to that point, i.e.
This way of coding the grammar would/should work, as now the grammar generator can run the action, which switches lexer mode, when the

Actions, by definition, can only be run when the matching rule has been completely matched (reduced) up to that point, which in this grammar's case means you require the lexer+parser to 'look-ahead' after START to match the

Unfortunately, jison doesn't support mid-rule actions like the example above, so you need to rewrite the grammar manually to provide a rule set that is LA(0) (i.e. 'no look-ahead required to reduce the rule') at the appropriate places. (You'll need to be careful anyway, because what you're doing is the equivalent of a lexical tie-in, which is also documented in the bison manual here: http://www.gnu.org/software/bison/manual/html_node/Lexical-Tie_002dins.html#Lexical-Tie_002dins and don't forget to read the next section in that manual as well: that should make you realize that even if jison had mid-rule action support, it doesn't necessarily make it easier for you.)

Here's a potential rewrite (untested, hence 'potential'), where the important rules have been marked

Also note that your sample grammar is 'odd' in the sense of 'human reader being able to discern what's legal from the grammar definition' as it /seems/ legal from only reading the grammar rules to feed it input

If your intent was to produce a grammar which munches tokens (numbers and text) delineated by
Hmm, I was referring to this section of the manual, in which it states that the lookahead does not happen every time before reduction. That's what was throwing me the most about my debugging session with jison, as it always grabbed the next token before reducing, and I thought it shouldn't. (This could be me misunderstanding what my grammar requires.) Last time I did this in Bison and ran the resulting parser in debug mode, I got output kind of like this:
I agree entirely with you though, this is a nasty little edge of these grammars, so if my example grammar is wrong it would not surprise me. In my actual grammar, not included due to its proprietary nature, I ended up shifting this complexity into the lexer. It used a temp var to append the contents of

Therefore, this ticket is more of a documentation note for myself and others in the future. As such, your examples and explanations are much appreciated.
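For readers wondering what "shifting this complexity into the lexer" might look like, here is a minimal sketch in plain JavaScript rather than jison lexer syntax. The proprietary grammar is not shown in this thread, so everything here is an assumption made for illustration: the `tokenize` function, the `START`/`END` delimiters, the `raw` state name, and the token types are all hypothetical. The idea is only that the lexer switches its own state when it sees the opening delimiter and collects everything up to the closing one in a temporary buffer, so the parser never has to change lexer state.

```javascript
// Hypothetical sketch: the lexer switches its own state and buffers the
// delimited content, so no parser action needs to call yy.lexer.begin().
// START/END and all token type names are assumptions, not from the thread.
function tokenize(input) {
  const tokens = [];
  let state = 'INITIAL';
  let buffer = []; // temp var collecting the words between START and END

  for (const word of input.split(/\s+/).filter(Boolean)) {
    if (state === 'INITIAL') {
      if (word === 'START') {
        state = 'raw';          // the lexer changes state by itself
        buffer = [];
        tokens.push({ type: 'START' });
      } else {
        const type = /^\d+$/.test(word) ? 'NUMBER' : 'TEXT';
        tokens.push({ type, value: word });
      }
    } else if (word === 'END') {
      // Emit the buffered content as one token, then leave the raw state.
      tokens.push({ type: 'RAW', value: buffer.join(' ') });
      tokens.push({ type: 'END' });
      state = 'INITIAL';
    } else {
      buffer.push(word);        // numbers are not special inside raw mode
    }
  }
  return tokens;
}

console.log(tokenize('42 START 1 two END foo').map((t) => t.type));
```

Because the state change happens before the next word is examined, the one-token offset described in this issue cannot occur; the cost is that the lexer now encodes part of the grammar.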
Also relevant is this issue: GerHobbelt#3. There, a bugfix for 'default action' state handling is discussed as part of another behaviour which requires the same parser generator ability as the yacc-style 'lexer hack' in this issue: the parser generator must be able to fetch look-ahead from the lexer as late as possible. Default action parser table rows are crucial for that, as they describe states where the parser does not need any look-ahead to know what to do next, after reducing the already matched rule.

WARNING: the referenced issue GerHobbelt#3 material is focused on the GerHobbelt fork; 'vanilla jison' there is a reference to the original zaach jison (i.e. this very repository)!
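To make the role of default action rows concrete, here is a toy parse loop in plain JavaScript. It is a sketch under stated assumptions, not jison's generated code: the tables are hand-written, the driver is a simplified state machine rather than a full LR parser, and `makeLexer`, `parse`, the `raw` state and the `WORD`/`RAW` tokens are all hypothetical names. State 1 has only one possible action (running the code that switches the lexer state), so a driver that consults the default action first can run it before asking the lexer for another token.

```javascript
// Toy lexer (hypothetical): in state 'raw' every word becomes a RAW token.
function makeLexer(input) {
  const words = input.split(' ');
  let pos = 0;
  let state = 'INITIAL';
  return {
    begin(newState) { state = newState; },
    lex() {
      if (pos >= words.length) return 'EOF';
      const w = words[pos++];
      if (state === 'raw') return 'RAW';
      return w === 'START' ? 'START' : 'WORD';
    },
  };
}

// Simplified driver. With useDefaultActions = true, a state that has a
// default action runs it WITHOUT fetching look-ahead first.
function parse(lexer, useDefaultActions) {
  // State 1 (just after START) can only run the state-switching action.
  const defaultActions = { 1: { action: () => lexer.begin('raw'), next: 2 } };
  // State 2 accepts either token so that both variants run to completion.
  const table = { 0: { START: 1 }, 2: { RAW: 3, WORD: 3 } };
  const seen = [];
  let state = 0;
  let lookahead = null;

  while (state !== 3) {
    const def = defaultActions[state];
    if (def && useDefaultActions) {
      def.action();             // no token pulled yet
      state = def.next;
      continue;
    }
    if (lookahead === null) lookahead = lexer.lex();
    if (def) {                  // eager variant: token was already pulled
      def.action();
      state = def.next;
      continue;
    }
    const next = table[state][lookahead];
    if (next === undefined) throw new Error('unexpected ' + lookahead);
    seen.push(lookahead);
    state = next;
    lookahead = null;
  }
  return seen;
}

console.log(parse(makeLexer('START hello'), true));  // START, RAW
console.log(parse(makeLexer('START hello'), false)); // START, WORD
```

The only difference between the two runs is whether the look-ahead is fetched before or after the default action, which is the ordering this comment is about.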
Hence, given GerHobbelt#3, my earlier comment dated 2013-03-28 is almost certainly WRONG: you have, with high probability, run into a jison bug discussed in GerHobbelt#3!
If your lexer has multiple states available to it, then these can be changed from within the parser rules by calling `yy.lexer.begin('state');`. However, the generated parser pulls the next token before running the code associated with a parse rule. This offsets when the state change occurs by one token. The parser may expect the correct token, but the lexer cannot return it, since it has not switched state yet. The same applies when switching out of a state.
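The off-by-one effect can be reproduced without jison at all. The following is a minimal simulation in plain JavaScript, not the attached grammar: `makeLexer`, `run`, the `raw` state and the `WORD`/`RAW` token names are hypothetical and exist only for this illustration. It tokenises the same input twice, once pulling the next token before the state-switching action runs and once after.

```javascript
// Toy lexer (hypothetical): in state 'raw' every word becomes a RAW token,
// otherwise a word is either the START keyword or a plain WORD.
function makeLexer(input) {
  const words = input.split(' ');
  let pos = 0;
  let state = 'INITIAL';
  return {
    begin(newState) { state = newState; },
    lex() {
      if (pos >= words.length) return 'EOF';
      const w = words[pos++];
      if (state === 'raw') return 'RAW';
      return w === 'START' ? 'START' : 'WORD';
    },
  };
}

// Read START, then one more token. The "rule action" for START switches
// the lexer into 'raw'; eagerLookahead decides whether that action runs
// before or after the next token has been pulled.
function run(eagerLookahead) {
  const lexer = makeLexer('START hello');
  const first = lexer.lex();
  let next;
  if (eagerLookahead) {
    next = lexer.lex();   // tokenised while still in the old state
    lexer.begin('raw');   // state change arrives one token too late
  } else {
    lexer.begin('raw');   // state change happens first
    next = lexer.lex();   // tokenised in the new state
  }
  return [first, next];
}

console.log(run(true));   // START, WORD
console.log(run(false));  // START, RAW
```

In the eager case the word after `START` has already been classified by the time the action runs, so the parser receives a token from the wrong lexer state.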
In bison, parse rules are run before the next token is pulled, which allows this to work. I've attached an example grammar, to demonstrate the problem.