The parsergen/scannergen combo generates C++ source files for an LR1/GLR parser and its scanner from a set of annotated production rules, aka a grammar.
Both tools are self-hosting: parsergen and scannergen each use the combo to regenerate their own parser and scanner as they evolve.
The generated code must be built with -std=c++2a.
🧘 Most often you need the combo, but not always:
- Sometimes reusing an existing scanner with another parser is feasible and cheaper. (%IDDEF_SOURCE)
- Sometimes a standalone scanner suffices. (see CBrackets)

Installation

in ArchLinux

Make sure yay or another pacman wrapper is installed, then run `yay -S parsergen` to install.

Run `yay -Ql parsergen` to see the installed files:

parsergen /usr/
parsergen /usr/bin/
parsergen /usr/bin/grammarstrip
parsergen /usr/bin/parsergen
parsergen /usr/bin/scannergen
parsergen /usr/share/
parsergen /usr/share/licenses/
parsergen /usr/share/licenses/parsergen/
parsergen /usr/share/licenses/parsergen/LICENSE
parsergen /usr/share/parsergen/
parsergen /usr/share/parsergen/RE_Suite.txt

Three commands, grammarstrip, parsergen and scannergen, are now at your disposal.

from github on any Linux distro

Make sure you have installed cmake, make, gcc and git, or their equivalents.

git clone https://github.com/buck-yeh/parsergen.git
cd parsergen
cmake -D FETCH_DEPENDEES=1 -D DEPENDEE_ROOT=_deps .
make -j
PSGEN_DIR="/full/path/to/current/dir"

P.S. To build a tagged version, check out the tag instead of the main branch.

Three commands at your disposal:
- $PSGEN_DIR/ParserGen/grammarstrip
- $PSGEN_DIR/ParserGen/parsergen
- $PSGEN_DIR/ScannerGen/scannergen
🤔 But is it possible to just type grammarstrip, parsergen or scannergen to run them?
💡 Append the following lines to ~/.bashrc:
```
PSGEN_DIR="/full/path/to/parsergen/dir"
alias grammarstrip="$PSGEN_DIR/ParserGen/grammarstrip"
alias parsergen="$PSGEN_DIR/ParserGen/parsergen"
alias scannergen="$PSGEN_DIR/ScannerGen/scannergen"
```
And run the following line:
```
. ~/.bashrc
```
There you go! The aliases also take effect in console windows opened afterwards and persist across reboots.

A quick guide to `parsergen`/`scannergen` combo

When you need to quickly implement a parser for an improvised or deliberately designed DSL, prepare a grammar file in simple BNF rules with semantic annotations and then let the combo generate C++ code of parser & scanner.

Write grammar

example/CalcInt/grammar.txt defines a calculator for the basic arithmetic operations + - * / % on integer constants written in decimal, octal, or hexadecimal.

lexid   Spaces // (1)

//
//      Output Options (2)
//
%CONTEXT [[std::ostream &]]

%ON_ERROR [[
    $c <<"COL#" <<$pos.m_Col <<": " <<$message <<'\n';
]]

%EXTRA_TOKENS   [[dec_num|oct_num|hex_num|spaces]]
//%SHOW_UNDEFINED

//
//      Operator Precedence (3)
//
left   + -
left   * / %
right  ( )

//
//      Grammar with Reduction Code (4)
//
<@> ::= <Expr>  [[
    $r = $1;
]]

<Expr> ::= <Expr> + <Expr>  [[
    bux::unlex<int>($1) += bux::unlex<int>($3);
    $r = $1;
]]
<Expr> ::= <Expr> - <Expr>  [[
    bux::unlex<int>($1) -= bux::unlex<int>($3);
    $r = $1;
]]
<Expr> ::= <Expr> * <Expr>  [[
    bux::unlex<int>($1) *= bux::unlex<int>($3);
    $r = $1;
]]
<Expr> ::= <Expr> / <Expr>  [[
    bux::unlex<int>($1) /= bux::unlex<int>($3);
    $r = $1;
]]
<Expr> ::= <Expr> % <Expr>  [[
    bux::unlex<int>($1) %= bux::unlex<int>($3);
    $r = $1;
]]
<Expr> ::= ( <Expr> )       [[
    $r = $2;
]]
<Expr> ::= $Num             [[
    $r = bux::createLex(dynamic_cast<bux::C_IntegerLex&>(*$1).value<int>());
]]

(1) New lexid

(2) % Option

(3) Operator precedence

(4) Production rule
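To make the effect of the precedence lines and reduction rules above concrete, here is a minimal hand-written sketch of the same calculator. It is not what parsergen generates (the generated parser is table-driven) and it does not use the bux library; `C_Evaluator` and `evaluate` are hypothetical names. It assumes C-style literals (a `0x` prefix for hexadecimal, a leading `0` for octal), which the grammar itself does not spell out.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <utility>

// Binding strength of a binary operator; 0 means "not an operator".
// Same order as the `left` lines: + - bind weakest, * / % bind tighter.
inline int precedenceOf(char op)
{
    switch (op)
    {
    case '+': case '-':           return 1;
    case '*': case '/': case '%': return 2;
    default:                      return 0;
    }
}

inline int applyOp(char op, int lhs, int rhs)
{
    if ((op == '/' || op == '%') && rhs == 0)
        throw std::runtime_error("Division by zero");
    switch (op)
    {
    case '+': return lhs + rhs;
    case '-': return lhs - rhs;
    case '*': return lhs * rhs;
    case '/': return lhs / rhs;
    case '%': return lhs % rhs;
    default:
        assert(false && "applyOp() needs one of + - * / %");
        return 0;
    }
}

class C_Evaluator // hypothetical, for illustration only
{
public:
    explicit C_Evaluator(std::string text): m_text(std::move(text)) {}

    int run()
    {
        const int result = parseExpr(1);
        skipSpaces();
        if (m_pos < m_text.size())
            throw std::runtime_error("Unexpected trailing input");
        return result;
    }

private:
    std::string m_text;
    std::size_t m_pos = 0;

    void skipSpaces()
    {
        while (m_pos < m_text.size() && std::isspace(static_cast<unsigned char>(m_text[m_pos])))
            ++m_pos;
    }

    // Counterpart of  <Expr> ::= ( <Expr> )  and  <Expr> ::= $Num
    int parsePrimary()
    {
        skipSpaces();
        if (m_pos < m_text.size() && m_text[m_pos] == '(')
        {
            ++m_pos;
            const int inner = parseExpr(1);
            skipSpaces();
            if (m_pos >= m_text.size() || m_text[m_pos] != ')')
                throw std::runtime_error("Missing ')'");
            ++m_pos;
            return inner;
        }
        if (m_pos >= m_text.size() || !std::isdigit(static_cast<unsigned char>(m_text[m_pos])))
            throw std::runtime_error("Number expected");
        const char *const begin = m_text.c_str() + m_pos;
        char *end = nullptr;
        // ASSUMPTION: base 0 accepts 0x.. as hex, 0.. as octal, anything else as decimal.
        const long value = std::strtol(begin, &end, 0);
        m_pos += static_cast<std::size_t>(end - begin);
        return static_cast<int>(value);
    }

    // Counterpart of  <Expr> ::= <Expr> op <Expr>  for operators at least as tight as minPrec
    int parseExpr(int minPrec)
    {
        int lhs = parsePrimary();
        for (;;)
        {
            skipSpaces();
            const char op = m_pos < m_text.size() ? m_text[m_pos] : '\0';
            const int prec = precedenceOf(op);
            if (prec < minPrec) // also covers non-operators, since minPrec >= 1
                return lhs;
            ++m_pos;
            // Left associativity: the right operand may only absorb tighter operators.
            const int rhs = parseExpr(prec + 1);
            lhs = applyOp(op, lhs, rhs);
        }
    }
};

int evaluate(const std::string &text) { return C_Evaluator(text).run(); }
```

With this sketch, `evaluate("1+2*3")` groups as `1+(2*3)` because `*` is declared after `+`, and `evaluate("10-4-3")` groups as `(10-4)-3` because both lines are declared `left`. The grammar file states the same decisions declaratively and leaves the parsing algorithm to the generator.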

Generate C++ code of parser & scanner

When package `parsergen` is installed in ArchLinux

parsergen grammar.txt Parser tokens.txt && \
scannergen Scanner /usr/share/parsergen/RE_Suite.txt tokens.txt

When `parsergen` is built from github

parsergen grammar.txt Parser tokens.txt && \
scannergen Scanner "$PSGEN_DIR/ScannerGen/RE_Suite.txt" tokens.txt

where

| Parameter | Description |
|---|---|
| `grammar.txt` | Annotated BNF rules and other types of options. |
| `Parser` | Output file base: `parsergen` generates Parser.cpp, Parser.h and ParserIdDef.h |
| `Scanner` | Output file base: `scannergen` generates Scanner.cpp and Scanner.h |
| `tokens.txt` | Output of `parsergen` and input of `scannergen` |
| `RE_Suite.txt` | Recurring token definitions provided with `scannergen` and used by `tokens.txt` |

If target source files already exist

💡 Put the commands in a script called reparse for recurring uses.

ℹ️ parsergen will ask a (y/n) overwrite question three times and scannergen twice, once per generated file.

> ./reparse 
About to parse 'grammar.txt' ...
Total 1 lex-symbols 1 nonterms 9 literals
states = 30	shifts = 106
Spent 0.005232879"
38 out of 106 goto keys erased for redundancy.
ParserIdDef.h already exists. Overwrite it ?(y/n)y
Parser.h already exists. Overwrite it ?(y/n)y
Parser.cpp already exists. Overwrite it ?(y/n)y
Parser created
#pos_args = 4
About to parse '/usr/share/parsergen/RE_Suite.txt' ...
About to parse 'tokens.txt' ...
Scanner.h already exists. Overwrite it ?(y/n)y
Scanner.cpp already exists. Overwrite it ?(y/n)y
> _

Use the generated code

ℹ️ from example/CalcInt/main.cpp

Includes

#include "Parser.h"         // C_Parser
#include "ParserIdDef.h"    // TID_LEX_Spaces
#include "Scanner.h"        // C_Scanner

💡 ParserIdDef.h is included here only for TID_LEX_Spaces, which the screener uses to drop spaces. Including it may not be necessary when spaces can't be ignored.

Scanner|screener|parser piped to parse

C_Parser                            parser{/*args of context ctor*/};
bux::C_ScreenerNo<TID_LEX_Spaces>   screener{parser}; // (1)
C_Scanner                           scanner{screener};
bux::C_IMemStream                   in{line}; // or other std::istream derived
bux::scanFile(">", in, scanner);

// Check if parsing is ok
// ... (2)

// Acceptance
if (!parser.accepted())
{
   std::cerr <<"Incomplete expression!\n";
   continue; // or break or return
}

// Apply the result 
// parser.getFinalLex() ... (3)

(1) A screener sits between the scanner and the parser as a filter that can drop, change, or aggregate selected tokens. Leave it out if you don't need it:

C_Parser                            parser{/*args of context ctor*/};
C_Scanner                           scanner{parser};
bux::C_IMemStream                   in{line}; // or other std::istream derived
bux::scanFile(">", in, scanner);

(2) This is the time to check the integrity of your context state.

(3) parser.getFinalLex() returns a reference to the merged result, of type bux::LR1::C_LexInfo. In this example the expected result is an integral value of type int, which can be conveniently obtained by calling bux::unlex<T>():

bux::unlex<int>(parser.getFinalLex())

An alternative is to store the result in the user context instance through "production code" instead of calling parser.getFinalLex().
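The scanner|screener|parser chain from note (1) can be pictured with a small self-contained sketch. None of the names below (`Token`, `I_TokenSink`, `C_RecordingSink`, `C_DropScreener`, `scanLine`, the `TOK_*` ids) belong to the bux library; they are hypothetical stand-ins that only show the idea of each stage pushing tokens into the next one, with the screener dropping space tokens on the way.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// All names below are hypothetical stand-ins, not the bux API.
struct Token
{
    int         id;
    std::string text;
};
constexpr int TOK_SPACES = 1, TOK_NUM = 2, TOK_OTHER = 3;

// Anything that accepts tokens one at a time (a parser or a screener).
struct I_TokenSink
{
    virtual ~I_TokenSink() = default;
    virtual void put(const Token &token) = 0;
};

// Stand-in for the parser end of the chain: it just records what arrives.
struct C_RecordingSink: I_TokenSink
{
    std::vector<Token> received;
    void put(const Token &token) override { received.push_back(token); }
};

// Screener that drops every token whose id is DROP_ID and forwards the rest.
template<int DROP_ID>
class C_DropScreener: public I_TokenSink
{
public:
    explicit C_DropScreener(I_TokenSink &next): m_next(next) {}
    void put(const Token &token) override
    {
        if (token.id != DROP_ID)
            m_next.put(token);
    }
private:
    I_TokenSink &m_next;
};

// Stand-in for the scanner: cuts the line into runs of spaces, runs of digits,
// and single other characters, pushing each token downstream as soon as it is cut.
void scanLine(const std::string &line, I_TokenSink &out)
{
    const auto isSpace = [](char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; };
    const auto isDigit = [](char c) { return std::isdigit(static_cast<unsigned char>(c)) != 0; };
    for (std::size_t i = 0; i < line.size();)
    {
        std::size_t j = i + 1;
        int id = TOK_OTHER;
        if (isSpace(line[i]))
        {
            id = TOK_SPACES;
            while (j < line.size() && isSpace(line[j])) ++j;
        }
        else if (isDigit(line[i]))
        {
            id = TOK_NUM;
            while (j < line.size() && isDigit(line[j])) ++j;
        }
        assert(j > i); // every token consumes at least one character
        out.put(Token{id, line.substr(i, j - i)});
        i = j;
    }
}
```

Wiring it up mirrors the order used in the snippets above: construct the sink first, wrap it in `C_DropScreener<TOK_SPACES>`, then hand the screener to `scanLine`. Passing the sink directly to `scanLine` corresponds to the screener-less variant, in which case the sink also receives the space tokens.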
