Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(compiler): Compiling Dcalc into Lcalc without using ∅ errors #158

Merged
merged 108 commits into from
Feb 24, 2022

Conversation

adelaett
Copy link
Collaborator

@adelaett adelaett commented Nov 22, 2021

Description

The goal of this PR is to compile Dcalc to Lcalc without using raise EmptyError nor try _ with EmptyError -> _.

To do so, we use the same technique as in rust or erlang to handle this kind of exceptions. Each raise EmptyError will be translated as None and each try e1 with EmtpyError -> e2 as match e1 with | None -> e2 | Some x -> x.

Other kinds of errors still exists in the generated Lcalc program, like ConflictError, NoValueProvided, and Crash, but they are not caught inside generated Lcalc program.

Technical details

Transformation pass

When doing this naively, this requires to add matches and Some constructor everywhere. We apply here an other technique where we generate what we call hoists. Hoists are expression whom could minimally raise EmptyError. For instance in let x = <e1, e2, ..., en| e_just :- e_cons> * 3 in x + 1, the sub-expression <e1, e2, ..., en| e_just :- e_cons> can produce an empty error. So we make a hoist with a new variable y linked to the Dcalc expression <e1, e2, ..., en| e_just :- e_cons>, and we return as the translated expression let x = y * 3 in x + 1.

Alternative AST for Dcalc

To implement the translation, we use an alternative representation of the Dcalc AST defined in Dcalc.binded_representation.ml. Indeed, the Dcalc normal AST contains some e: expr Bindlib.box and v: expr Bindlib.var at different places inside scope or program definitions. This makes comparison between variables impracticable, as when we unbox an expression inside Bindlib, each free variable (free according to Bindlib) is replaced by a fresh variable. Hence, If v is a variable that appear inside e, then after unboxing it is no longer possible to compare v with the occurrence of v inside e.

As we discussed one way to handle this would be to use two layers of nested bindlib boxes (expr Bindlib.box Bindlib.box). However this solution makes it quite hard to know if a variable inside an expression is linked to the first or the second box. The other way is to define an alternative AST of Dcalc, that binds directly the variable v inside e. This change is explained more in details in #190.

We could not make the change directly inside this PR because this change would conflict with other parts of the compiler in non-trivial ways.

@denismerigoux
Copy link
Contributor

Hi Alain, I'm still in the process of syncing your PR with master and fixing the bugs that stem from it. Then I'll review it proper; in the meantime can you change the PR description with a good summary of what you bring, the broad principles of your compilation pass, as well as a coarse-grained changelog? I like the PR descriptions to contain those things as a way to have a public record of the design choices in Catala.

@denismerigoux
Copy link
Contributor

denismerigoux commented Feb 15, 2022

I discovered a bug that is likely to be linked to your optimizations in Lcalc. Indeed, when I compile the allocations_familiales example and test it with CATALA_OPTS="--avoid_exceptions" make tests_ocaml, I get an error message saying that this variable raises an empty error:

let base_mensuelle_6737_ : money =
    match
      match
        handle_default_opt
          [|
            (match
               handle_default_opt
                 [|
                   (let (_ : unit) = () in
                    ENone ());
                 |]
                 (ESome false)
                 (let (_ : unit) = () in
                  ENone ())
             with
            | ENone _ -> ENone ()
            | ESome default_term_6741_ -> ESome default_term_6741_);
            (match
               handle_default_opt
                 [|
                   (let (_ : unit) = () in
                    ENone ());
                 |]
                 (ESome false)
                 (let (_ : unit) = () in
                  ENone ())
             with
            | ENone _ -> ENone ()
            | ESome default_term_6745_ -> ESome default_term_6745_);
            (match
               handle_default_opt
                 [|
                   (let (_ : unit) = () in
                    ENone ());
                 |]
                 (ESome false)
                 (let (_ : unit) = () in
                  ENone ())
             with
            | ENone _ -> ENone ()
            | ESome default_term_6749_ -> ESome default_term_6749_);
          |]
          (ESome true)
          (let (_ : unit) = () in
           ENone ())
      with
      | ENone _ -> ENone ()
      | ESome default_term_6752_ -> ESome default_term_6752_
    with
    | ENone _ ->
        raise
          (NoValueProvided
             {
               filename = "tests/../prologue.catala_fr";
               start_line = 74;
               start_column = 10;
               end_line = 74;
               end_column = 24;
               law_headings = [ "Prologue" ];
             })
    | ESome non_empty_argument_6754_ -> non_empty_argument_6754_

However, if you remove modify the Makefile to remove the -O -t options here:

catala/Makefile

Lines 171 to 177 in 6fbff03

$(FRENCH_LAW_OCAML_LIB_DIR)/law_source/allocations_familiales.ml:
CATALA_OPTS="$(CATALA_OPTS) -O -t" $(MAKE) -C $(ALLOCATIONS_FAMILIALES_DIR) allocations_familiales.ml
cp -f $(ALLOCATIONS_FAMILIALES_DIR)/allocations_familiales.ml $@
$(FRENCH_LAW_OCAML_LIB_DIR)/law_source/unit_tests/tests_allocations_familiales.ml:
CATALA_OPTS="$(CATALA_OPTS) -O -t" $(MAKE) -s -C $(ALLOCATIONS_FAMILIALES_DIR) tests/tests_allocations_familiales.ml
cp -f $(ALLOCATIONS_FAMILIALES_DIR)/tests/tests_allocations_familiales.ml $@

Then CATALA_OPTS="--avoid_exceptions" make tests_ocaml passes, suggesting that your translation is correct but not the optimizations (I checked that it's not -t that triggers the fail, only -O).

For the record, the same faulty variable whose OCaml code is up here gives the following without -O enabled:

let base_mensuelle_5830_ : money =
    match
      match
        handle_default_opt
          [|
            (match
               handle_default_opt [||] (ESome true)
                 (match
                    handle_default_opt
                      [|
                        (match
                           handle_default_opt [||]
                             (ESome
                                (date_courante_5806_ >=@ date_of_numbers 2021 4 1
                                && date_courante_5806_ <@ date_of_numbers 2022 4 1))
                             (ESome (money_of_cents_string "41481"))
                         with
                        | ENone _ -> ENone ()
                        | ESome default_term_5832_ -> ESome default_term_5832_);
                      |]
                      (ESome false)
                      (match ENone () with
                      | ENone _ -> ENone ()
                      | ESome empty_litteral_5834_ -> ESome empty_litteral_5834_)
                  with
                 | ENone _ -> ENone ()
                 | ESome default_term_5836_ -> ESome default_term_5836_)
             with
            | ENone _ -> ENone ()
            | ESome default_term_5838_ -> ESome default_term_5838_);
            (match
               handle_default_opt [||] (ESome true)
                 (match
                    handle_default_opt
                      [|
                        (match
                           handle_default_opt [||]
                             (ESome
                                (date_courante_5806_ >=@ date_of_numbers 2020 4 1
                                && date_courante_5806_ <@ date_of_numbers 2021 4 1))
                             (ESome (money_of_cents_string "41404"))
                         with
                        | ENone _ -> ENone ()
                        | ESome default_term_5840_ -> ESome default_term_5840_);
                      |]
                      (ESome false)
                      (match ENone () with
                      | ENone _ -> ENone ()
                      | ESome empty_litteral_5842_ -> ESome empty_litteral_5842_)
                  with
                 | ENone _ -> ENone ()
                 | ESome default_term_5844_ -> ESome default_term_5844_)
             with
            | ENone _ -> ENone ()
            | ESome default_term_5846_ -> ESome default_term_5846_);
            (match
               handle_default_opt [||] (ESome true)
                 (match
                    handle_default_opt
                      [|
                        (match
                           handle_default_opt [||]
                             (ESome
                                (date_courante_5806_ >=@ date_of_numbers 2019 4 1
                                && date_courante_5806_ <@ date_of_numbers 2020 4 1))
                             (ESome (money_of_cents_string "41316"))
                         with
                        | ENone _ -> ENone ()
                        | ESome default_term_5848_ -> ESome default_term_5848_);
                      |]
                      (ESome false)
                      (match ENone () with
                      | ENone _ -> ENone ()
                      | ESome empty_litteral_5850_ -> ESome empty_litteral_5850_)
                  with
                 | ENone _ -> ENone ()
                 | ESome default_term_5852_ -> ESome default_term_5852_)
             with
            | ENone _ -> ENone ()
            | ESome default_term_5854_ -> ESome default_term_5854_);
          |]
          (ESome true)
          (match
             handle_default_opt
               [|
                 (match
                    handle_default_opt [||] (ESome false)
                      (match ENone () with
                      | ENone _ -> ENone ()
                      | ESome empty_litteral_5856_ -> ESome empty_litteral_5856_)
                  with
                 | ENone _ -> ENone ()
                 | ESome default_term_5858_ -> ESome default_term_5858_);
               |]
               (ESome false)
               (match ENone () with
               | ENone _ -> ENone ()
               | ESome empty_litteral_5860_ -> ESome empty_litteral_5860_)
           with
          | ENone _ -> ENone ()
          | ESome default_term_5862_ -> ESome default_term_5862_)
      with
      | ENone _ -> ENone ()
      | ESome default_term_5864_ -> ESome default_term_5864_
    with
    | ENone _ ->
        raise
          (NoValueProvided
             {
               filename = "tests/../prologue.catala_fr";
               start_line = 74;
               start_column = 10;
               end_line = 74;
               end_column = 24;
               law_headings = [ "Prologue" ];
             })
    | ESome non_empty_argument_5866_ -> non_empty_argument_5866_

This should be fixed before merging the PR :)

Copy link
Contributor

@denismerigoux denismerigoux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Alain! I left a bunch of minor comments, that PR needs a little cleaning pass and some debugging but we're nearly there :) Thanks very much for the quality of the documentation, it's appreciated.

compiler/dcalc/ast.ml Outdated Show resolved Hide resolved
compiler/dcalc/ast.ml Outdated Show resolved Hide resolved
compiler/dcalc/ast.mli Outdated Show resolved Hide resolved
compiler/dcalc/ast.mli Outdated Show resolved Hide resolved
compiler/dcalc/binded_representation.ml Outdated Show resolved Hide resolved
compiler/utils/dune Outdated Show resolved Hide resolved
compiler/utils/pos.ml Outdated Show resolved Hide resolved
compiler/lcalc/compile_without_exceptions.ml Show resolved Hide resolved
compiler/lcalc/compile_without_exceptions.ml Outdated Show resolved Hide resolved
compiler/lcalc/compile_without_exceptions.ml Outdated Show resolved Hide resolved
@adelaett
Copy link
Collaborator Author

This is a bug indeed. Inside the compilation_without_exceptions, I've interpreted the if e1 then e2 else e3 as a 'normal' operator such as addition or multiplication. This means that we first evaluate each of the branches e1, e2, and e3. If either returns an empty, then the whole result will be empty. Normally, users cannot build themself empty terms. But inside Dcalc.optimizations, one of the partial evaluation optimization translates an default term without exceptions ⟨| e_just ⊢ e_cons⟩ into an if e_just then e_cons else ∅.

I've attempted a few fixes, but I've found some issues when there is an interaction between function definition and this optimization. I believe it is both easier and more easy to understand to just deactivate the optimization.

@adelaett adelaett changed the title feat(compiler): add a new translation from default-calculus to lambda calculus without empty-errors feat(compiler): Compiling Dcalc into Lcalc without using ∅ errors Feb 18, 2022
Copy link
Contributor

@denismerigoux denismerigoux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congratulations!

@denismerigoux denismerigoux merged commit c4a7fe3 into CatalaLang:master Feb 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🔧 compiler Issue concerns the compiler ✨ enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants