aldosa edited this page Sep 8, 2021 · 22 revisions

The project

This getting-started page walks through the implementation of a very simple parser: it parses additions and returns their sum.

For example

"1 + 2 + 3"
will be parsed to 
6

Install CSLY

Install from the NuGet gallery GUI or with the Package Manager Console using the following command:

Install-Package sly

or with the .NET CLI:

dotnet add package sly

Scan expressions and extract tokens

The first stage of a parser is the scanning step. From an input source the scanner extracts your language's tokens (words). In this example, our language will have a few simple words:

  • numbers: we will stay with integers
  • the "+" operator

We also need a skippable whitespace lexeme (WS) so that the lexer ignores trivia. Whitespace is not sent to the parser, which avoids cluttering the grammar with whitespace terminals.

CSLY encodes these token definitions as a plain C# enum annotated with C# attributes. Each token is annotated with a regular expression that matches it. Here is the full lexer:

using sly.lexer;

public enum ExpressionToken {

    // integers only
    [Lexeme("[0-9]+")]
    INT = 1,

    [Lexeme("\\+")]
    PLUS = 2,

    // marked isSkippable: the lexeme is discarded and never sent to the parser
    [Lexeme("[ \\t]+", isSkippable: true)]
    WS = 3

}
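To make the scanning step concrete, here is a minimal, self-contained sketch of what the three lexemes do to an input string. It is not how CSLY is implemented: `LexerSketch` and `Tokenize` are hypothetical names used only for illustration, and the sketch relies solely on `System.Text.RegularExpressions` with the same three patterns as the enum above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for a lexer; an illustration only, not CSLY code.
public static class LexerSketch
{
    // Same three patterns as the ExpressionToken lexemes.
    // \G anchors each match at the position where the search starts.
    private static readonly (string Name, Regex Pattern, bool Skip)[] Lexemes =
    {
        ("INT",  new Regex(@"\G[0-9]+"), false),
        ("PLUS", new Regex(@"\G\+"),     false),
        ("WS",   new Regex(@"\G[ \t]+"), true),
    };

    public static List<string> Tokenize(string source)
    {
        var tokens = new List<string>();
        var position = 0;
        while (position < source.Length)
        {
            var matched = false;
            foreach (var (name, pattern, skip) in Lexemes)
            {
                var match = pattern.Match(source, position);
                if (!match.Success) continue;

                // Skippable lexemes are consumed but never emitted.
                if (!skip) tokens.Add($"{name}({match.Value})");
                position += match.Length;
                matched = true;
                break;
            }
            if (!matched)
            {
                throw new FormatException(
                    $"unexpected character '{source[position]}' at position {position}");
            }
        }
        return tokens;
    }
}
```

Under these assumptions, `LexerSketch.Tokenize("1 + 2 + 3")` yields the five tokens INT(1), PLUS(+), INT(2), PLUS(+), INT(3); the whitespace is consumed but does not appear in the token list.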

Define the language grammar

Our grammar is quite simple (in BNF notation):

expression : INT 
expression : term PLUS expression
term : INT
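The grammar is right-recursive, so "1 + 2 + 3" groups as 1 + (2 + 3). The hand-written recursive-descent sketch below mirrors the three rules to show that grouping. It is only an illustration under stated assumptions: `GrammarSketch` and its methods are hypothetical names, the input is assumed to be already split into tokens, and this is not the code CSLY generates or runs.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical hand-written evaluator mirroring the grammar; not CSLY code.
public static class GrammarSketch
{
    public static int Evaluate(IReadOnlyList<string> tokens)
    {
        var position = 0;
        var value = Expression(tokens, ref position);
        if (position != tokens.Count)
        {
            throw new FormatException($"unexpected token '{tokens[position]}'");
        }
        return value;
    }

    // expression : INT
    // expression : term PLUS expression
    // Both alternatives start with an integer, so read it first,
    // then decide based on whether a PLUS follows.
    private static int Expression(IReadOnlyList<string> tokens, ref int position)
    {
        var left = Term(tokens, ref position);
        if (position < tokens.Count && tokens[position] == "+")
        {
            position++; // consume PLUS
            var right = Expression(tokens, ref position); // right recursion
            return left + right;
        }
        return left;
    }

    // term : INT
    private static int Term(IReadOnlyList<string> tokens, ref int position)
    {
        // NumberStyles.None accepts digits only (no sign, no whitespace).
        if (position >= tokens.Count ||
            !int.TryParse(tokens[position], NumberStyles.None,
                          CultureInfo.InvariantCulture, out var value))
        {
            throw new FormatException("integer expected");
        }
        position++;
        return value;
    }
}
```

For example, `GrammarSketch.Evaluate(new[] { "1", "+", "2", "+", "3" })` evaluates the innermost `expression` (3) first, then 2 + 3, then 1 + 5.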

CSLY uses BNF notation and attaches a visitor method to each rule. Visitor methods have the following properties:

  • the return type is the type of the parse result: int for our language,
  • the parameters match the clauses of the rule's right-hand side, one parameter per clause.

⚠️ Read the typing section carefully to write your visitor methods correctly.

Here is the visitor for the first rule:

    [Production("expression: INT")]
    public int IntExpr(Token<ExpressionToken> intToken)
    {
        // IntValue is the matched lexeme converted to an int
        return intToken.IntValue;
    }

The whole parser is then:

using sly.lexer;
using sly.parser.generator;

public class ExpressionParser
{
    [Production("expression: INT")]
    public int IntExpr(Token<ExpressionToken> intToken)
    {
        return intToken.IntValue;
    }

    // left comes from the term rule, right from the nested expression
    [Production("expression : term PLUS expression")]
    public int Expression(int left, Token<ExpressionToken> operatorToken, int right)
    {
        return left + right;
    }

    [Production("term : INT")]
    public int Term(Token<ExpressionToken> intToken)
    {
        return intToken.IntValue;
    }
}

Build and use the parser

Build using ParserBuilder

using sly.parser;
using sly.parser.generator;

public class SomeClass {

    public static Parser<ExpressionToken,int> GetParser() {

        var parserInstance = new ExpressionParser();
        var builder = new ParserBuilder<ExpressionToken, int>();
        // "expression" is the root rule of the grammar
        var parser = builder.BuildParser(parserInstance, ParserType.LL_RECURSIVE_DESCENT, "expression").Result;

        return parser;
    }

}

Parse an addition

using System;
using System.Linq;

public class SomeTest {

    public void TestCSLY() {

        string expression = "42  + 42";

        var parser = SomeClass.GetParser();
        var r = parser.Parse(expression);

        // no extra check on r.Result is needed: it is an int whenever IsError is false
        if (!r.IsError)
        {
            Console.WriteLine($"result of <{expression}>  is {r.Result}");
            // outputs: result of <42  + 42>  is 84
        }
        else if (r.Errors != null && r.Errors.Any())
        {
            // display errors
            r.Errors.ForEach(error => Console.WriteLine(error.ErrorMessage));
        }
    }

}

Going further

Next steps: