Using a custom parser
Adam Brunnmeier edited this page Jan 18, 2016 · 4 revisions
See, for example, the waxeye parser generator. It lets you define a Parsing Expression Grammar (PEG). A grammar covering part of Markdown could look like the following:
Document <- *(:'\n' | Block )
### Block Elements
Block <= Header | Blockquote | Codeblock | Linknote | List | Table | Paragraph | Invalid_Block
Header <= Header_L2 | Header_L1
Header_L2 <- :'##' :?' ' *(!'\n' .)
Header_L1 <- :'#' :?' ' *(!'\n' .)
Blockquote <- +( :'>' *(!'\n' Unparsed) (&'\n' Unparsed) )
Codeblock <- :'```' Codelanguage :'\n' *(!'```' .) :'```'
Codelanguage <- *(!'\n' .)
Linknote <- :'[' Link_Text :']: ' Link_Url *:[ \t]
List <- +( :'- ' List_Item :'\n' )
List_Item <- +(!'\n' Span)
Paragraph <- +( +(!'\n' (Newline|Span)) ?'\n' )
Newline <- :' \n'
Table <- Table_Header Table_Body
Table_Header <- +(:'|' Table_Header_Item) :'\n'
Table_Header_Item <- *(!('|'|'\n') Span)
Table_Body <- *Table_Body_Row
Table_Body_Row <- !'\n' :?'|' Table_Body_Row_Item *(:'|' Table_Body_Row_Item) :'\n'
Table_Body_Row_Item <- *(!('|'|'\n') Span)
Invalid_Block <- +( +(!'\n' .) '\n')
### Span Elements
Span <= Link | Emphasis | Code | Image | .
Link <= Link_Inline | Link_Reference | Link_Auto
Link_Inline <- :'[' Link_Text :'](' Link_Url :')'
Link_Reference <- :'[' Link_Text :'][]'
Link_Auto <- :'<' +(!'>' .) :'>'
Link_Text <- *(!']' Span)
Link_Url <- *(![) \n] .)
Emphasis <= Emphasis_Bold | Emphasis_Italic
Emphasis_Bold <- :'**' +(!'**' Span) :'**'
Emphasis_Italic <- :'*' +(!'*' Span) :'*'
Code <- :'`' +(!'`' .) :'`'
Image <- :'![' Image_Alt :'](' Image_Url :')'
Image_Alt <- +(!']' Span)
Image_Url <- +(!')' .)
### Unparsed for multi-pass
# I don't know how to parse nested block quotes; they can be resolved over several iterations
Unparsed <- .
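The multi-pass idea behind the `Unparsed` rule can be sketched roughly as follows: each pass strips one level of `>` markers and leaves the rest of the quote as raw text, which is then fed back into the parser until no quote markers remain. This is only an illustration in Python. `parse_blocks` is a hypothetical toy stand-in for the generated parser, not waxeye output, and it also strips one optional space after `>`, which the grammar above does not do.

```python
def parse_blocks(text):
    """Toy stand-in for the generated parser (hypothetical, not waxeye output).

    Returns ("blockquote", raw_inner) and ("text", line) tuples and
    resolves only ONE level of quoting per call.
    """
    nodes = []
    quote_lines = []

    def flush():
        # Close the blockquote collected so far, if any.
        if quote_lines:
            nodes.append(("blockquote", "\n".join(quote_lines)))
            quote_lines.clear()

    for line in text.split("\n"):
        if line.startswith(">"):
            # Drop one '>' marker (and one optional space, an assumption);
            # everything after it stays unparsed for the next pass.
            inner = line[1:]
            if inner.startswith(" "):
                inner = inner[1:]
            quote_lines.append(inner)
        else:
            flush()
            if line:
                nodes.append(("text", line))
    flush()
    return nodes


def resolve(text):
    """Re-run the parser on every blockquote body until nothing is left unparsed."""
    resolved = []
    for kind, value in parse_blocks(text):
        if kind == "blockquote":
            resolved.append(("blockquote", resolve(value)))
        else:
            resolved.append((kind, value))
    return resolved
```

For the input `"> a\n> > b\nc"`, `resolve` yields a blockquote that contains the text `a` and a nested blockquote with `b`, followed by the top-level text `c`.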
The parsed tree can then be rendered to HTML, for example with a function that traverses the tree recursively.
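A minimal sketch of such a traversal, in Python, might look like this. It assumes that every tree node carries a rule name and a list of children, where each child is either another node or a single character. The `Node` class and the `TAGS` mapping are hypothetical stand-ins for illustration, not waxeye's runtime API. Rules without an entry in `TAGS` simply pass their children through.

```python
from dataclasses import dataclass, field
from html import escape


@dataclass
class Node:
    """Hypothetical AST node: a rule name plus child nodes or characters."""
    type: str
    children: list = field(default_factory=list)


# Assumed mapping from grammar rule names to HTML tags (illustrative only).
TAGS = {
    "Header_L1": "h1",
    "Header_L2": "h2",
    "Paragraph": "p",
    "List": "ul",
    "List_Item": "li",
    "Emphasis_Bold": "strong",
    "Emphasis_Italic": "em",
    "Code": "code",
}


def render(node):
    """Depth-first traversal that turns a tree into an HTML string."""
    if isinstance(node, str):
        # Leaf: a literal character from the input, escaped for HTML.
        return escape(node)
    inner = "".join(render(child) for child in node.children)
    tag = TAGS.get(node.type)
    # Rules without a tag (e.g. Document) only emit their children.
    return f"<{tag}>{inner}</{tag}>" if tag else inner
```

Rules such as `Link_Inline` or `Image` need attributes in addition to a tag name, so they would need dedicated branches in `render` rather than an entry in the table.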