Resurrect the %errorhandlertype directive for back-compat (#320)
Fixes #320.
sgraf812 committed Oct 21, 2024
1 parent 69e7bc1 commit 77a2fd4
Showing 18 changed files with 499 additions and 262 deletions.
10 changes: 10 additions & 0 deletions ChangeLog.md
@@ -1,10 +1,20 @@
# Revision history for Happy

## 2.1.1

This release fixes two breaking changes:

* Properly qualify all uses of Prelude functions, fixing #131
* Bring back the old `%errorhandlertype` directive, the use of which is
discouraged in favour of the "Reporting expected tokens" mechanism
in Happy 2.1, accessible via `%error.expected`.

## 2.1

* Added `--numeric-version` CLI flag.
* Documented and implemented the new feature "Resumptive parsing with ``catch``"
* Documented (and reimplemented) the "Reporting expected tokens" feature
(which turned out to cause a breaking change in this release: #320)

## 2.0.2

91 changes: 89 additions & 2 deletions doc/syntax.rst
@@ -273,10 +273,34 @@ Error declaration

%error { <identifier> }

%error { <identifier> } { <identifier> }

.. index:: ``%error``

Specifies the function to be called in the event of a parse error.
The type of ``<identifier>`` varies depending on the presence of ``%lexer`` (see :ref:`Summary <sec-monad-summary>`) and ``%errorhandlertype`` (see the following).
(optional)
Specifies the functions to be called in the event of a parse error.

The first, one-action form specifies a single function (often referred to as
``parseError``) that reports the error and aborts the parse (in the sense of
early return).
When ``%error`` is not specified, the function is assumed to be called ``happyError``.

The type of ``parseError`` varies depending on the presence of ``%lexer``
(see :ref:`Summary <sec-monad-summary>`) and
the :ref:`presence of ``%error.expected`` <sec-error-expected-directive>`.

The second, two-action form specifies a pair of functions ``abort`` and
``report`` which are necessary to handle multiple parse errors during
:ref:`resumptive parsing using the ``catch`` mechanism <sec-catch>`.
In this case, ``report`` is called for every parse error and additionally
receives a continuation for resuming the parse as the last argument.
When Happy is unable to resume the parse after a parse error, it calls
``abort``, which is *not* supposed to report an error as well.

To illustrate the correspondence between the two forms:
In a non-resumptive parser (i.e. one that does not use ``catch``),
the one-action form ``%error { \ tks -> report tks abort }`` is equivalent to
the two-action form ``%error { abort } { report }``.
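
The correspondence can be sketched in plain Haskell. This is only an illustration under stated assumptions: ``Token``, the stand-in result type ``P`` (a pair of reported errors and an optional result, not a real monad instance) and the message format are all hypothetical. Only the shapes of ``abort`` and ``report`` follow the description above.

```haskell
-- Hypothetical token type; a real grammar declares its own via %tokentype.
data Token = TokenSucc | TokenZero | TokenTest
  deriving (Show, Eq)

-- Hypothetical stand-in for the parse monad: the errors reported so far,
-- paired with a result that is Nothing once the parse was aborted.
type P a = ([String], Maybe a)

-- abort: give up on the parse without reporting anything itself.
abort :: [Token] -> P a
abort _ = ([], Nothing)

-- report: record one error, then hand control to the continuation
-- that it receives as its last argument.
report :: [Token] -> ([Token] -> P a) -> P a
report tks resume =
  let (errs, result) = resume tks
  in  (("parse error before " ++ show (take 1 tks)) : errs, result)

-- One-action handler assembled from the pair above: report once, then abort.
parseError :: [Token] -> P a
parseError tks = report tks abort
```

Loading this in GHCi and evaluating ``parseError [TokenTest, TokenZero] :: P Int`` yields one reported error and no result, which is the behaviour of a non-resumptive parser.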

.. _sec-errorhandlertype-directive:

@@ -289,6 +313,69 @@ Additional error information

.. index:: ``%errorhandlertype``

(deprecated)
Happy 2.1 overhauled and superseded this directive in favour of the simple,
optional flag directive ``%error.expected``. See :ref:`sec-error-expected-directive`.
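
For grammars that still carry a legacy handler, the difference in calling convention can be sketched as follows. The code generator in this commit applies the legacy handler to a single ``(tokens, expected)`` pair, whereas ``%error.expected`` passes the expected list as a separate argument. The names ``Token``, ``legacyHandler`` and ``newHandler``, and the use of ``Either String`` as the result type, are assumptions made for illustration.

```haskell
-- Hypothetical token type, for illustration only.
data Token = TokenSucc | TokenZero | TokenTest
  deriving (Show, Eq)

-- Legacy-style handler: tokens and expected list arrive as one pair.
legacyHandler :: ([Token], [String]) -> Either String a
legacyHandler (tks, expected) =
  Left ("unexpected " ++ show (take 1 tks)
        ++ ", expected one of " ++ show expected)

-- Handler in the shape used with %error.expected: the same logic, curried.
newHandler :: [Token] -> [String] -> Either String a
newHandler = curry legacyHandler
```

Under these assumptions, migrating amounts to currying the existing handler, so the error message logic does not need to change.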

.. _sec-error-expected-directive:

Reporting expected tokens
-------------------------

.. index:: ``%error.expected``

(optional)
Often, it is useful to present users with suggestions as to which kind of tokens
were expected at the site of a syntax error.
To this end, when the ``%error.expected`` directive is specified, happy assumes that
the error handling function (resp. ``report`` function when using the binary
form of the ``%error`` directive) takes a ``[String]`` argument (the argument
*after* the token stream, in case of a non-threaded lexer) listing all the
stringified tokens that could be shifted at the site of the syntax error.
The strings in this list are derived from the ``%token`` directive.

Here is an example, inspired by test case ``monaderror-explist``:

.. code-block:: none

   %tokentype { Token }

   %error { handleErrorExpList }
   %error.expected

   %monad { ParseM } { (>>=) } { return }

   %token
           'S'             { TokenSucc }
           'Z'             { TokenZero }
           'T'             { TokenTest }

   %%

   Exp : 'Z'         { 0 }
       | 'T' 'Z' Exp { $3 + 1 }
       | 'S' Exp     { $2 + 1 }

   %%

   type ParseM = ...

   handleErrorExpList :: [Token] -> [String] -> ParseM a
   handleErrorExpList ts explist = throwError $ ParseError $ explist

   ...

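
One hypothetical way to fill in the definitions elided in the example is sketched below. It assumes ``ParseM`` is ``Either ParseError`` and uses ``Left`` directly instead of ``throwError`` so that it needs nothing beyond ``base``. The ``ParseError`` wrapper and ``renderParseError`` are illustrative names and are not taken from the test case.

```haskell
import Data.List (intercalate)

-- Hypothetical token type matching the %token declarations above.
data Token = TokenSucc | TokenZero | TokenTest
  deriving (Show, Eq)

-- Assumed error type: just the list of expected tokens.
newtype ParseError = ParseError [String]
  deriving (Show, Eq)

-- Assumed parse monad.
type ParseM = Either ParseError

-- Handler in the shape required by %error.expected with a non-threaded
-- lexer: the token stream first, then the expected-token strings.
handleErrorExpList :: [Token] -> [String] -> ParseM a
handleErrorExpList _ts explist = Left (ParseError explist)

-- Turn the collected expectations into a user-facing message.
renderParseError :: ParseError -> String
renderParseError (ParseError [])   = "parse error"
renderParseError (ParseError toks) =
  "parse error: expected one of " ++ intercalate ", " toks
```

Keeping the raw list in ``ParseError`` and formatting it separately leaves the caller free to present the suggestions however it likes.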
Additional error information
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

::

%error.expected

.. index:: ``%error.expected``

Deprecated in favour of the simple, optional flag directive ``%error.expected``.

(optional)
The user-supplied error handler can be given additional information, which changes its expected type.
By default, no information is added, for compatibility with previous versions.
57 changes: 3 additions & 54 deletions doc/using.rst
@@ -1053,7 +1053,8 @@ simple non-threaded lexer):
...
Note the use of ``catch`` in the second ``Exp`` rule and
the use of the binary form of the ``%error`` directive.
the use of the two-action form of the ``%error`` directive
(see :ref:`the documentation for ``%error`` <sec-error-directive>`).
The directive specifies a pair of functions ``abort`` and ``report``
which are necessary to handle multiple parse errors.

@@ -1110,15 +1111,9 @@ A couple of notes:
Similarly, ``abort`` must always throw an exception and cannot return a
syntax tree at all. It should *not* report a parse error as well.

To illustrate how the new binary ``%error`` decomposition corresponds to
the regular unary one, consider the definition
``myError tks = report tks abort``.
This definition could be used in ``%error { myError }``; in this case, the
parser would always abort after the first error.

* Whether or not the ``abort`` and ``report`` functions get passed the
list of tokens is subject to the :ref:`same decision logic as for ``parseError`` <sec-monad-summary>`.
When using :ref:`the ``%error.expected`` directive <sec-expected-list>`,
When using :ref:`the ``%error.expected`` directive <sec-error-expected-directive>`,
the list of expected tokens is passed to ``report`` only, between ``tks``
and ``resume``.

@@ -1127,52 +1122,6 @@ to the user of happy; the example above simply emitted the string ``catch``
whenever it stands in for an erroneous AST node.
A more reasonable implementation would be similar to typed holes in GHC.

.. _sec-expected-list:

Reporting expected tokens
-------------------------

.. index:: expected tokens

Often, it is useful to present users with suggestions as to which kind of tokens
were expected at the site of a syntax error.
To this end, when the ``%error.expected`` directive is specified, happy assumes that
the error handling function (resp. ``report`` function when using the binary
form of the ``%error`` directive) takes a ``[String]`` argument (the argument
*after* the token stream, in case of a non-threaded lexer) listing all the
stringified tokens that were expected at the site of the syntax error.
The strings in this list are derived from the ``%token`` directive.

Here is an example, inspired by test case ``monaderror-explist``:

.. code-block:: none

   %tokentype { Token }

   %error { handleErrorExpList }
   %error.expected

   %monad { ParseM } { (>>=) } { return }

   %token
           'S'             { TokenSucc }
           'Z'             { TokenZero }
           'T'             { TokenTest }

   %%

   Exp : 'Z'         { 0 }
       | 'T' 'Z' Exp { $3 + 1 }
       | 'S' Exp     { $2 + 1 }

   %%

   type ParseM = ...

   handleErrorExpList :: [Token] -> [String] -> ParseM a
   handleErrorExpList ts explist = throwError $ ParseError $ explist

   ...

.. _sec-multiple-parsers:

Generating Multiple Parsers From a Single Grammar
4 changes: 2 additions & 2 deletions happy.cabal
@@ -1,5 +1,5 @@
name: happy
version: 2.1
version: 2.1.1
license: BSD2
license-file: LICENSE
copyright: (c) Andy Gill, Simon Marlow
@@ -139,7 +139,7 @@ executable happy
array,
containers >= 0.4.2,
mtl >= 2.2.1,
happy-lib == 2.1
happy-lib == 2.1.1

default-language: Haskell98
default-extensions: CPP, MagicHash, FlexibleContexts, NamedFieldPuns
8 changes: 4 additions & 4 deletions lib/backend-glr/src/Happy/Backend/GLR/ProduceCode.lhs
@@ -91,7 +91,7 @@ the driver and data strs (large template).
> -> Maybe String -- User-defined stuff (token DT, lexer etc.)
> -> (DebugMode,Options) -- selecting code-gen style
> -> Grammar String -- Happy Grammar
> -> Pragmas -- Pragmas in the .y-file
> -> Directives -- Directives in the .y-file
> -> (String -- data
> ,String) -- parser
>
@@ -372,7 +372,7 @@ Do the same with the Happy goto table.
%-----------------------------------------------------------------------------
Create the 'GSymbol' ADT for the symbols in the grammar

> mkGSymbols :: Grammar String -> Pragmas -> ShowS
> mkGSymbols :: Grammar String -> Directives -> ShowS
> mkGSymbols g pragmas
> = str dec
> . str eof
@@ -423,7 +423,7 @@ Creating a type for storing semantic rules
> type SemInfo
> = [(String, String, [Int], [((Int, Int), ([(Int, TokenSpec)], String), [Int])])]

> mkGSemType :: Options -> Grammar String -> Pragmas -> (ShowS, SemInfo)
> mkGSemType :: Options -> Grammar String -> Directives -> (ShowS, SemInfo)
> mkGSemType (TreeDecode,_,_) g pragmas
> = (def, map snd syms)
> where
@@ -673,7 +673,7 @@ only unpacked when needed. Using classes here to manage the unpacking.
This selects the info used for monadic parser generation

> type MonadInfo = Maybe (String,String,String)
> monad_sub :: Pragmas -> MonadInfo
> monad_sub :: Directives -> MonadInfo
> monad_sub pragmas
> = case monad pragmas of
> (True, _, ty,bd,ret) -> Just (ty,bd,ret)
46 changes: 21 additions & 25 deletions lib/backend-lalr/src/Happy/Backend/LALR/ProduceCode.lhs
@@ -30,7 +30,7 @@ Produce the complete output file.

> produceParser :: Grammar String -- grammar info
> -> Maybe AttributeGrammarExtras
> -> Pragmas -- pragmas supplied in the .y-file
> -> Directives -- directives supplied in the .y-file
> -> ActionTable -- action table
> -> GotoTable -- goto table
> -> [String] -- language extensions
@@ -53,7 +53,7 @@ Produce the complete output file.
> , starts = starts'
> })
> mAg
> (Pragmas
> (Directives
> { lexer = lexer'
> , imported_identity = imported_identity'
> , monad = (use_monad,monad_context,monad_tycon,monad_then,monad_return)
@@ -365,19 +365,16 @@ happyMonadReduce to get polymorphic recursion. Sigh.
The token conversion function.

> produceTokenConverter
> = case lexer' of {
>
> Nothing ->
> str "happyTerminalToTok term = case term of {\n" . indent
> = str "happyTerminalToTok term = case term of {\n" . indent
> . (case lexer' of Just (_, eof') -> str eof' . str " -> " . eofTok . str ";\n" . indent; _ -> id)
> . interleave (";\n" ++ indentStr) (map doToken token_rep)
> . str "_ -> -1#;\n" . indent . str "}\n" -- -1# signals an invalid token
> . str "_ -> -1#;\n" . indent . str "}\n" -- token number -1# (INVALID_TOK) signals an invalid token
> . str "{-# NOINLINE happyTerminalToTok #-}\n"
> . str "\n"
> . str "happyLex kend _kmore [] = kend notHappyAtAll []\n"
> . str "happyLex _kend kmore (tk:tks)\n"
> . str " | Happy_GHC_Exts.tagToEnum# (i Happy_GHC_Exts.==# -1#) = happyReport' (tk:tks) [] happyAbort\n" -- invalid token (-1#); lexer error.
> . str " | Prelude.otherwise = kmore i tk tks\n"
> . str " where i = happyTerminalToTok tk\n"
> . str "\n" .
> (case lexer' of {
> Nothing ->
> str "happyLex kend _kmore [] = kend notHappyAtAll []\n"
> . str "happyLex _kend kmore (tk:tks) = kmore (happyTerminalToTok tk) tk tks\n"
> . str "{-# INLINE happyLex #-}\n"
> . str "\n"
> . str "happyNewToken action sts stk = happyLex (\\tk -> " . eofAction "notHappyAtAll" . str ") ("
@@ -390,13 +387,7 @@ The token conversion function.
> . str "\n";

> Just (lexer'',eof') ->
> str "happyTerminalToTok term = case term of {\n" . indent
> . str eof' . str " -> " . eofTok . str ";\n" . indent
> . interleave (";\n" ++ indentStr) (map doToken token_rep)
> . str "_ -> Prelude.error \"Encountered a token that was not declared to happy\"\n" . indent . str "}\n"
> . str "{-# NOINLINE happyTerminalToTok #-}\n"
> . str "\n"
> . str "happyLex kend kmore = " . str lexer'' . str " (\\tk -> case tk of {\n" . indent
> str "happyLex kend kmore = " . str lexer'' . str " (\\tk -> case tk of {\n" . indent
> . str eof' . str " -> kend tk;\n" . indent
> . str "_ -> kmore (happyTerminalToTok tk) tk })\n"
> . str "{-# INLINE happyLex #-}\n"
@@ -409,7 +400,7 @@ The token conversion function.
> -- superfluous pattern match needed to force happyReport to
> -- have the correct type.
> . str "\n";
> }
> })

> where

@@ -729,16 +720,21 @@ in the presence of the %error.expected directive.
The last argument is the "resumption", a continuation that tries to find
an item on the stack taking a @catch@ terminal where parsing may resume,
in the presence of the two-argument form of the %error directive.
In order to support the legacy %errorhandlertype directive, we retain
a special code path for `OldExpected`.

> callReportError = -- this one wraps around report_error_handler to expose a unified interface
> str "(\\tokens expected resume -> " .
> (if use_monad then str ""
> else str "HappyIdentity Prelude.$ ") .
> report_error_handler .
> (case (error_handler', lexer') of (DefaultErrorHandler, Just _) -> id
> _ -> str " tokens") .
> (if error_expected' then str " expected"
> else id) .
> (case error_expected' of
> OldExpected -> str " (tokens, expected)" -- back-compat for %errorhandlertype
> _ ->
> (case (error_handler', lexer') of (DefaultErrorHandler, Just _) -> id
> _ -> str " tokens") .
> (case error_expected' of NewExpected -> str " expected"
> NoExpected -> id)) .
> (case error_handler' of ResumptiveErrorHandler{} -> str " resume"
> _ -> id) .
> str ")"
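
To make the three cases easier to compare, here is a simplified, standalone model of the wrapper text that ``callReportError`` assembles. It is a sketch rather than the generator itself: it builds a plain ``String`` instead of a ``ShowS``, and ``ExpectedMode``, ``renderWrapper`` and the boolean parameters are illustrative names. Only the constructor names and the emitted fragments are taken from the hunk above.

```haskell
-- How the list of expected tokens is handed to the user's handler.
data ExpectedMode = NoExpected | NewExpected | OldExpected

-- Render the lambda that adapts the user's handler to the unified
-- three-argument interface (tokens, expected, resume).
renderWrapper
  :: Bool          -- is a %monad in use?
  -> Bool          -- does the handler receive the token stream?
  -> Bool          -- is the two-action (resumptive) %error form in use?
  -> ExpectedMode
  -> String        -- name of the user's handler (hypothetical)
  -> String
renderWrapper useMonad passTokens resumptive mode handler =
  concat
    [ "(\\tokens expected resume -> "
    , if useMonad then "" else "HappyIdentity Prelude.$ "
    , handler
    , case mode of
        OldExpected -> " (tokens, expected)"   -- legacy: one tupled argument
        NewExpected -> tokensArg ++ " expected"
        NoExpected  -> tokensArg
    , if resumptive then " resume" else ""
    , ")"
    ]
  where
    tokensArg = if passTokens then " tokens" else ""
```

In this model the legacy path always passes the pair, regardless of whether the token stream would otherwise be supplied, which mirrors the ``OldExpected`` branch in the hunk.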
1 change: 1 addition & 0 deletions lib/data/HappyTemplate.hs
@@ -19,6 +19,7 @@
type Happy_Int = Happy_GHC_Exts.Int#
data Happy_IntList = HappyCons Happy_Int Happy_IntList

#define INVALID_TOK -1#
#define ERROR_TOK 0#
#define CATCH_TOK 1#

2 changes: 2 additions & 0 deletions lib/frontend/boot-src/Parser.ly
@@ -34,6 +34,7 @@ The parser.
> spec_expect { TokenKW TokSpecId_Expect }
> spec_error { TokenKW TokSpecId_Error }
> spec_errorexpected { TokenKW TokSpecId_ErrorExpected }
> spec_errorhandlertype { TokenKW TokSpecId_ErrorHandlerType }
> spec_attribute { TokenKW TokSpecId_Attribute }
> spec_attributetype { TokenKW TokSpecId_Attributetype }
> code { TokenInfo $$ TokCodeQuote }
@@ -125,6 +126,7 @@ The parser.
> | spec_expect int { TokenExpect $2 }
> | spec_error code optCode { TokenError $2 $3 }
> | spec_errorexpected { TokenErrorExpected }
> | spec_errorhandlertype id { TokenErrorHandlerType $2 }
> | spec_attributetype code { TokenAttributetype $2 }
> | spec_attribute id code { TokenAttribute $2 $3 }
