Skip to content

Latest commit

 

History

History
577 lines (450 loc) · 15.4 KB

example.md

File metadata and controls

577 lines (450 loc) · 15.4 KB

author: Sam Derbyshire, Well-Typed title: GHC's renamer (Part II) subtitle: GHC Contributors' Workshop date: June 7th, 2023

The problem

{-# COMPLETE U #-}
pattern U :: ()
pattern U = ()

foo :: Identity Char
foo = do
  U <- return ()
  return 'c'
error:
    * No instance for `MonadFail Identity'
        arising from a do statement
        with the failable pattern `U'
    * In a stmt of a 'do' block: U <- return ()

Context

This has been reported many times: #15681 #16470 #22004 #23458 (related: #16618).

Fundamental problem:

  • the renamer decides how to elaborate do-notation,
  • the pattern-match checker runs after type-checking (it needs types).

:::{.element: class="fragment"} However, in this situation, we don't need types to be able to see that the pattern match succeeds (this is different from situation which involve GADT pattern matches).

:::

Investigation

From the error message, we can see that we emit a MonadFail Identity Wanted constraint when typechecking foo.

Where? We can search for MonadFail to find out!

In the renamer

GHC.Rename.Expr:

rnStmt ctxt rnBody (L loc (BindStmt _ pat (L lb body))) thing_inside
  = do  { -- ...
        ; (fail_op, fvs2) <- monadFailOp pat ctxt
        ; -- ...
        }
monadFailOp :: LPat GhcPs
            -> HsStmtContext GhcRn
            -> RnM (FailOperator GhcRn, FreeVars)
monadFailOp pat ctxt = do
    dflags <- getDynFlags
    if | isIrrefutableHsPat dflags pat -> return (Nothing, emptyFVs)
       | not (isMonadStmtContext ctxt) -> return (Nothing, emptyFVs)
       | otherwise -> getMonadFailOp ctxt

In the typechecker

GHC.Tc.Gen.Match (Gen = constraint generator):

tcDoStmt ctxt (BindStmt xbsrn pat rhs) res_ty thing_inside
  = do  { -- ...
        ; fail_op' <- fmap join . forM (xbsrn_failOp xbsrn) $ \fail ->
            tcMonadFailOp (DoPatOrigin pat) pat' fail new_res_ty
        ; -- ...
        ; return (BindStmt xbstc pat' rhs', thing) }
tcMonadFailOp orig pat fail_op res_ty = do
    dflags <- getDynFlags
    if isIrrefutableHsPat dflags pat
    then return Nothing
    else Just . snd <$> (tcSyntaxOp orig fail_op [synKnownType stringTy]
                          (mkCheckExpType res_ty) $ \_ _ -> return ())

:::{.element: class="fragment"} tcSyntaxOp is going to instantiate fail :: MonadFail m => String -> m a, which will emit a MonadFail Identity Wanted constraint.

:::

Irrefutable patterns

:::{.element: class="fragment"}

isIrrefutableHsPat :: Bool -- ^ Is @-XStrict@ enabled?
                   -> Pat (GhcPass p) -> Bool
isIrrefutableHsPat is_strict = \case
  WildPat {} -> True
  VarPat {}  -> True
  -- ...
  ConPat
    { pat_con  = con
    , pat_args = details } ->
      case ghcPass @p of
        GhcPs -> False -- Conservative
        GhcRn -> False -- Conservative
        GhcTc -> case con of
          L _ (PatSynCon _pat)  -> False -- Conservative
          L _ (RealDataCon con) ->
            isJust (tyConSingleDataCon_maybe (dataConTyCon con))
            && all goL (hsConPatArgs details)

:::

Approach

Let's fix isIrrefutableHsPat for ConPat at GhcRn and GhcTc stages.

data Pat p
  = -- ...
  | ConPat {
        pat_con_ext :: XConPat p,
        pat_con     :: XRec p (ConLikeP p),
        pat_args    :: HsConPatDetails p
    }

:::{.element: class="fragment"}

type instance ConLikeP GhcPs = RdrName
type instance ConLikeP GhcRn = Name
type instance ConLikeP GhcTc = ConLike

type instance XConPat GhcPs = EpAnn [AddEpAnn]
type instance XConPat GhcRn = NoExtField
type instance XConPat GhcTc = ConPatTc

:::

What to change?

We need two things:

  • for a DataCon, whether it is irrefutable (tyConSingleDataCon_maybe);
  • for a PatSyn, whether it is the unique member of a COMPLETE set.

For PatSyns, we have the choice of storing whether the PatSyn is irrefutable in the PatSyn itself, or of threading through the COMPLETE pragmas to isIrrefutableHsPat.


:::{.element: class="fragment"} I've chosen to thread through COMPLETE pragmas, as that would also allow us to handle or patterns:

{-# COMPLETE P, Q #-}
do (one of {P; Q}) <- ...

:::

Using the extension field

Idea: let's store whether the constructor pattern is a DataCon or a PatSyn, and if it's a DataCon whether it's irrefutable, in XConPat GhcRn.

data ConInfoType
  = ConIsData { conIsSingleDataCon :: Bool }
  | ConIsPatSyn

(Alternatively we could change ConLikeP GhcRn to ConLikeName, but that wouldn't handle the tyConSingleDataCon situation.)


:::{.element: class="fragment"} This leaves the task of passing the collection of COMPLETE pragmas to isIrrefutableHsPat.

:::

COMPLETE pragmas

We want to pass COMPLETE sets to isIrrefutableHsPat.
So let's start off by figuring out how these are represented in the compiler.

:::{.element: class="fragment"}

-- In Language.Haskell.Syntax.Binds
data Sig pass
  = -- ...
  | CompleteMatchSig
    (XCompleteMatchSig pass) -- (NB: only exact-print info here)
    (XRec pass [LIdP pass])
    (Maybe (LIdP pass))
-- In GHC.Types.CompleteMatch
data CompleteMatch = CompleteMatch
  { cmConLikes :: UniqDSet ConLike, cmResultTyCon :: Maybe TyCon }

data TcGblEnv
  = TcGblEnv
  { -- ...
  , tcg_complete_matches :: !CompleteMatches }

:::

Information flow

Now let's look into passing COMPLETE sets to isIrrefutableHsPat.
Let's audit its calls; the HLS call hierarchy functionality is very useful for that.

  1. In the renamer:
    1. `stmtTreeToStmts .. (StmtTreeApplicative ..)`
    2. `monadFailOp` in `rnStmt .. (BindStmt ..)` and `rn_rec_stmt .. (BindStmt ..)`
  2. In the typechecker
    1. When creating a `PatSyn` in `tcPatSynMatcher` (from `tcPatSynDecl`).
    2. Calls to `tcMonadFailOp` in `tcApplicativeStmts`, `tcDoStmt`, `tcMcStmt`.

:::{.element: class="fragment"} This means that we are going to need COMPLETE pattern information in the renamer, and before we have actually typechecked PatSyns in the typechecker.

:::

Threading it through

Let's add a field to the typechecker environment (used for both the typechecker and the renamer):

data TcGblEnv
  = TcGblEnv { -- ...
             , tcg_complete_matches :: !CompleteMatches
   {- NEW -} , tcg_complete_matches_rn :: !CompleteMatchesRn
             , -- ...
             }

-- NEW
type CompleteMatchesRn = [CompleteMatchRn]
data CompleteMatchRn = CompleteMatchRn
  { cmConLikesRn :: NameSet -- ^ The set of `ConLike` values
  }

Populating the field

Let's see then how to thread through the correct information; we need to ensure that we first rename pattern synonym COMPLETE signatures and add that information to the TcGblEnv environment before we call isIrrefutableHsPat.

Call hierarchy of monadFailOp

We want to find what to change to pass on the correct COMPLETE information.
Let's use HLS to find the call hierarchy of GHC.Rename.Expr.monadFailOp.

:::{.element: class="fragment"}

  • monadFailOp
  • rnStmt
  • rnStmtsWithFreeVars
  • rnExpr
  • rnGRHS'
  • rnGHRS
  • rnGHRSs
  • rnBind
  • rnLBind
  • rnValBindsRHS

  • rnSrcDecls :::

Focus on rnValBindsRHS

rnValBindsRHS is interesting, as this is where we also rename signatures:

rnValBindsRHS :: HsSigCtxt
              -> HsValBindsLR GhcRn GhcPs
              -> RnM (HsValBinds GhcRn, DefUses)
rnValBindsRHS ctxt (ValBinds _ mbinds sigs)
  = do { (sigs', sig_fvs) <- renameSigs ctxt sigs
       ; binds_w_dus <- mapBagM (rnLBind (mkScopedTvFn sigs')) mbinds
       -- ...
       }

:::{.element: class="fragment"} Here we clearly see that we rename Sigs first (which includes COMPLETE pragmas), and then rename value bindings. So we can simply extend the TcGblEnv, e.g.:

do { (sigs', sig_fvs) <- renameSigs ctxt sigs
   ; binds_w_dus <-
      updGblEnv (add_complete_sigs_rn sigs') $
      mapBagM (rnLBind (mkScopedTvFn sigs')) mbinds
   -- ...
   }

:::

Typechecker

Now, this does the trick in the renamer. However, we also need to make sure to persist the information in tcg_complete_matches. So we should return the CompleteMatchesRn at the end of rnValBindsRHS, and make sure to add them to the final typechecker environment at the end of rnSrcDecls.

Using tcg_complete_matches

We then need to look this information up in isIrrefutableHsPat.

isIrrefutableHsPat :: IsPass p => LPat (GhcPass p) -> TcM Bool
isIrrefutableHsPat pat
  = do { strict <- xoptM LangExt.Strict
       ; comps <- get_complete_matches_rn
       ; return $ is_irrefutable_pat comps strict pat }

:::{.element: class="fragment"} However, we need to make sure we include all complete matches, not just those from the local module.

Let's find how GHC already does that, for tcg_complete_matches.

:::

Taking inspiration

Inspired by mkDsEnvsFromTcGbl and initDsWithModGuts in GHC.HsToCore.Monad:

get_complete_matches_rn :: TcM CompleteMatchesRn
get_complete_matches_rn
  = do { hsc_env <- getTopEnv
       ; tcg_env <- getGblEnv
       ; eps <- liftIO $ hscEPS hsc_env
       ; return $
          completeMatchesRn
            (   hptCompleteSigs hsc_env      -- from the home package
             ++ eps_complete_matches eps )   -- from imports
          ++ tcg_complete_matches_rn tcg_env -- from the current module
       }

completeMatchesRn throws away the COMPLETE pragmas with a result TyCon, as in the renamer we can't know whether they apply or not.

Updating isIrrefutableHsPat

In the worker function is_irrefutable_pat we can then make use of this information, in the ConPat case for a PatSyn. We want to check whether the PatSyn Name is the single member of one of the COMPLETE sets.

:::{.element: class="fragment"}

is_irrefutable_hs_pat complete_matches strict_enabled = \case
  -- ...
  ConPat
    { pat_con_ext = ext
    , pat_con     = con
    , pat_args    = details } ->
      con_irref && all go (hsConPatArgs details)
        where
          con_irref = case ghcPass @p of
            GhcPs -> False -- Conservative
            GhcRn -> case ext of
              ConIsData { conIsSingleDataCon = irref } -> irref
              ConIsPatSyn -> any (single_match con) complete_matches
            GhcTc -> -- ...

:::

Outcome

Now that everything is in place, we try some tests programs.

:::{.element: class="fragment"}

panic! (the 'impossible' happened)
  GHC version 9.7.20230607:
        missing fail op
  Pattern match: U is failable, and fail_expr was left unset

:::

:::{.element: class="fragment"} Debug this by working backwards: search for "missing fail op".

:::

dsHandleMonadicFailure

dsHandleMonadicFailure ctx pat match m_fail_op =
  case shareFailureHandler match of
    MR_Infallible body -> body
    MR_Fallible body ->
      case m_fail_op of
-- Note that (non-monadic) list comprehension, pattern guards, etc could
-- have fallible bindings without an explicit failure op, but this is
-- handled elsewhere. See Note [Failing pattern matches in Stmts] the
-- breakdown of regular and special binds.
        Nothing -> pprPanic "missing fail op" $
          text "Pattern match:" <+> ppr pat <+>
          text "is failable, and fail_expr was left unset"
        Just fail_op -> -- ...

:::{.element: class="fragment"} Where are we constructing MR_Fallible?

:::

mkCoSynCaseMatchResult

mkCoSynCaseMatchResult called in matchPatSyn looks relevant!

mkCoSynCaseMatchResult :: Id -> Type -> CaseAlt PatSyn -> MatchResult CoreExpr
mkCoSynCaseMatchResult var ty alt = MR_Fallible $ mkPatSynCase var ty alt

:::{.element: class="fragment"} Problem: the pattern synonym matcher always returns a fallible match result. This is actually correct in general, e.g.:

pattern Bogus :: Int
pattern Bogus = 3
{-# COMPLETE Bogus #-}

bogus :: Identity Bool
bogus = do
  Bogus <- return 4
  return False

:::

Handling monadic failure

We should accept the previous program, but crash at runtime, using a non-monadic failure operator which raises a pattern match error:

dsHandleMonadicFailure ctx pat match m_fail_op body_ty =
  case shareFailureHandler match of
    MR_Infallible body -> body
    MR_Fallible body -> do
      case m_fail_op of
        Nothing ->
          do error_expr <- mkErrorAppDs pAT_ERROR_ID body_ty (ppr pat)
             body error_expr
        Just fail_op -> -- ...

Note that we need to know the overall return type of the do expression to know the return type of the error; so we modify dsHandleMonadicFailure.

Confirmation

Let's make sure we properly handle the problematic program:

pattern Bogus :: Int
pattern Bogus = 3
{-# COMPLETE Bogus #-}

bogus :: Identity Bool
bogus = do
  Bogus <- return 4
  return False

:::{.element: class="fragment"} The program compiles successfully, and running it gives:

*** Exception: user error (Pattern match failure in 'do' block)

:::

:::{.element: class="fragment"} We can do better by having a custom error message for this situation, as opposed to re-using pAT_ERROR_ID.

:::

Finishing up

  • Tests.
  • Notes.

Other tickets

  • Simple error message improvements:

    • Error messages for instances of a non-class #22688 and #23462.
    • Improve import suggestions #20771.
  • Dependency order with view patterns #14293 #22406.

  • Refactoring of rebindable syntax; see wiki page.

  • Propagate more information about patterns in the renamer to help record update disambiguation #23032 #22746.