GenericLexerExtension

Olivier Duhart edited this page Feb 21, 2018 · 15 revisions

Generic Lexer Extensions

For performance reasons, the generic lexer limits the lexeme definitions it supports. Nevertheless, an extension mechanism provides a way to add new lexeme patterns that rely on the FSM (finite-state machine) backing the generic lexer. To extend a lexer, we need to add transitions and nodes to the underlying FSM.
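To make "adding transitions and nodes" concrete, here is a minimal, self-contained sketch of the idea. It does not use the generic lexer's API: ToyFsm, AddState, AddTransition, MarkEnd and Match are hypothetical names invented for this illustration, and the date format (dd.mm.yyyy) is an assumption. The sketch first builds a small machine that recognizes doubles, then extends it by hanging new states off an existing one so that it also recognizes dates.

```csharp
using System;
using System.Collections.Generic;

// Toy state machine for illustration only. This is NOT the generic lexer API;
// every type and member name below is hypothetical.
public class ToyFsm
{
    private readonly List<(int From, Func<char, bool> Accepts, int To)> transitions =
        new List<(int From, Func<char, bool> Accepts, int To)>();
    private readonly Dictionary<int, string> endStates = new Dictionary<int, string>();
    private int nextState = 1; // state 0 is the start state

    public int AddState() => nextState++;

    public void AddTransition(int from, Func<char, bool> accepts, int to) =>
        transitions.Add((from, accepts, to));

    public void MarkEnd(int state, string tokenName) => endStates[state] = tokenName;

    // Returns the token name if the whole input stops on an end state, "" otherwise.
    public string Match(string input)
    {
        int state = 0;
        foreach (char c in input)
        {
            int next = -1;
            foreach (var t in transitions)
            {
                if (t.From == state && t.Accepts(c)) { next = t.To; break; }
            }
            if (next < 0) return "";
            state = next;
        }
        return endStates.TryGetValue(state, out var name) ? name : "";
    }
}

public static class Demo
{
    public static ToyFsm Build()
    {
        var fsm = new ToyFsm();

        // "Predefined" part: digits '.' digits is a DOUBLE.
        int intPart = fsm.AddState();
        int dot = fsm.AddState();
        int frac = fsm.AddState();
        fsm.AddTransition(0, char.IsDigit, intPart);
        fsm.AddTransition(intPart, char.IsDigit, intPart);
        fsm.AddTransition(intPart, c => c == '.', dot);
        fsm.AddTransition(dot, char.IsDigit, frac);
        fsm.AddTransition(frac, char.IsDigit, frac);
        fsm.MarkEnd(frac, "DOUBLE");

        // Extension: start from the existing frac state and add a second
        // '.' digits part, so that 21.02.2018 ends on a new DATE state.
        int secondDot = fsm.AddState();
        int year = fsm.AddState();
        fsm.AddTransition(frac, c => c == '.', secondDot);
        fsm.AddTransition(secondDot, char.IsDigit, year);
        fsm.AddTransition(year, char.IsDigit, year);
        fsm.MarkEnd(year, "DATE");

        return fsm;
    }

    public static void Main()
    {
        var fsm = Build();
        Console.WriteLine(fsm.Match("3.14"));       // prints DOUBLE
        Console.WriteLine(fsm.Match("21.02.2018")); // prints DATE
    }
}
```

The point of the sketch is that the extension reuses states that already exist (here the fractional part of a double) instead of building a separate recognizer, which is the same idea as extending the FSM that backs the generic lexer.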

New generic tokens are denoted with the specific GenericToken.Extension value. For instance, we can define a date lexeme like this:

public enum Extensions {
	[Lexeme(GenericToken.Extension)] 
	DATE,

	[Lexeme(GenericToken.Double)] 
	DOUBLE,
}

building extensions

The extensions are built in a callback function that is called for every token marked with the GenericToken.Extension value. The callback is a delegate with the following signature:

 
public delegate void BuildExtension<IN>(IN token, LexemeAttribute lexem, GenericLexer<IN> lexer) where IN : struct;

where:

  • IN is the enum type used for the lexer (Extensions above)
  • token is the token to build the extension for
  • lexem is the lexeme attribute; it gives access to optional parameters for the lexeme through lexem.GenericTokenParameters (string[])
  • lexer is the generic lexer being extended.

predefined FSM

Diagram representing the maximal predefined FSM. List of named nodes.

adding states and transitions

API goto transition* + precondition end callback

Building the extended lexer

BuildExtension<Extensions> extensionBuilder = (Extensions token, LexemeAttribute lexem, GenericLexer<Extensions> lexer) => {
	if (token == Extensions.DATE) {
		// do something interesting here
	}
};

var lexerRes = LexerBuilder.BuildLexer<Extensions>(new BuildResult<ILexer<Extensions>>(), extensionBuilder);