Monads vs. constructors #31
Comments
Context: when writing up StateLang in my thesis last week, I did not understand what was going on with On a new branch:
Steps 1-3 can be started alone, but the remainder will require a collective effort to update.

@myreen points out a key subtlety: arrays. These store expressions in PureLang, and so must store thunks in ThunkLang onwards. The ThunkLang semantics must therefore check that only thunks are stored in arrays. We should also fix a current inefficiency: all array allocations store

A more minor subtlety could be projections. In particular, projections are forced in the transition to ThunkLang, but if projecting out of a monadic operation this could now produce a type error. However, as projections out of monadic operations should be forbidden except in the contextual equivalence proofs, this should be fine - but may require some changes to relations.
Initial work on this is on the
This is complete as of 2751040.
This issue suggests modifying the specification of monadic operations to be distinct from constructors from at least ThunkLang onwards.
Motivation
Currently, constructors are compiled to the call-by-value ThunkLang by `Delay`ing their arguments. This ensures the "deep" evaluation of constructor arguments in ThunkLang is compatible with the "shallow" weak-head evaluation in PureLang.

Arguments of monadic operations receive the same treatment because they are conflated with constructors - however, we don't morally compile monadic operations to call-by-value primitives until StateLang. In particular, we expect arguments to monadic operations to be `Delay`ed right until we compile away monadic operations in StateLang - otherwise, call-by-value Thunk/Env/StateLang would evaluate monadically sequenced code prematurely and semantics-preservation from PureLang would not hold.

But this has negative effects on specification, implementation, and verification from ThunkLang onwards. In particular:
- `Delay`ed.
- `Delay`ed by enforcing it in `exp_of`. The `thunk_to_env` pass removes the compiler expression `Delay`s, which are recovered on expansion to semantics expressions.
- `env_to_state` quite subtle - though it "compiles away" the monad, it must insert various `Delay` operations (both in monadic arguments and around effectful operations) to remain compatible with ThunkLang/EnvLang behaviour, which requires the monadic arguments to be forced.
- `env_to_state` also seems "inconsistent", in that `Ret`/`Raise` are treated quite differently to other monadic operations.
- `Delay`s appear in the compiler implementation, and so produce odd-looking (and perhaps poor quality) code. A particular example is that all array updates and allocations actually store `Delay`ed values into the array, slowing down array-based code significantly.
- `mk_delay` always `Delay`s monadic arguments).

This issue argues that we should separate constructors and monadic operations when their semantics starts to differ, i.e. in ThunkLang. Constructors can be deeply evaluated in ThunkLang as they are now, but monadic operations should remain suspended (i.e. shallowly/weak-head evaluated) until we actually want to compile them away in StateLang.
In other words, in PureLang conflating the two is fine because they have the same weak-head semantics, but in ThunkLang constructors have call-by-value semantics and we want monadic operations to remain weak-head.
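To make the motivation concrete, here is a minimal Python sketch of a toy call-by-value evaluator. It is not the ThunkLang semantics: every class and function name (`Lit`, `Tick`, `Cons`, `Delay`, `eval_cbv`, ...) is hypothetical, and `Tick` only stands in for "work that should not happen yet". The sketch shows that a call-by-value constructor evaluates its arguments deeply unless each argument is wrapped in a `Delay`, which is why conflating `Ret` with an ordinary constructor forces its argument to be delayed.

```python
from dataclasses import dataclass

# Toy expressions; all names are illustrative assumptions, not PureCake definitions.
@dataclass
class Lit:
    value: int

@dataclass
class Tick:
    label: str    # recorded in the trace when the expression is evaluated
    body: object

@dataclass
class Cons:
    name: str
    args: list

@dataclass
class Delay:
    body: object

# Toy values.
@dataclass
class ConsV:
    name: str
    args: list

@dataclass
class Thunk:
    body: object  # an unevaluated expression


def eval_cbv(e, trace):
    """Call-by-value evaluation: constructor arguments are evaluated deeply."""
    if isinstance(e, Lit):
        return e.value
    if isinstance(e, Tick):
        trace.append(e.label)
        return eval_cbv(e.body, trace)
    if isinstance(e, Cons):
        return ConsV(e.name, [eval_cbv(a, trace) for a in e.args])
    if isinstance(e, Delay):
        return Thunk(e.body)  # suspend: the body is not evaluated
    raise TypeError(f"unknown expression: {e!r}")


arg = Tick("evaluated", Lit(1))

# "Ret" treated as an ordinary constructor: the argument runs prematurely.
eager_trace = []
eval_cbv(Cons("Ret", [arg]), eager_trace)

# The same argument under Delay stays suspended.
delayed_trace = []
eval_cbv(Cons("Ret", [Delay(arg)]), delayed_trace)

print(eager_trace, delayed_trace)
```

In this toy model the only way to keep the argument of `Ret` suspended is to insert the `Delay` explicitly, which is the invariant the current scheme has to carry through later passes.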
High-level implementation
Overall, we would need to implement the following:
- `monadic` operations always halt immediately to a weak-head style value, i.e. `eval (monadic mop es) = monadic mop es`
- `Delay`, e.g.: `compile (Prim (Cons "Ret") [e]) = monadic Return [compile e]`
- `monadic` operations straightforwardly through EnvLang: `compile (monadic mop es) = monadic mop (MAP compile es)`
- `force`, as it is no longer needed. ThunkLang semantics will still need to handle `MkTick`, but this is "optional" in the sense it is never required/expected to be anywhere, so shouldn't impose subtle invariants on code.
- `env_to_state` (proofs and implementation) to remove the various `Delay` operations, as these are no longer expected as input nor required on output.

For completeness, we could also bubble up the change to PureLang - it might be more elegant than using reserved constructor names. But this is not necessary and would require updating equational reasoning/parsing/typing/demands/... so can certainly wait.
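The equations `eval (monadic mop es) = monadic mop es` and `compile (Prim (Cons "Ret") [e]) = monadic Return [compile e]` can be illustrated with a minimal Python sketch. This is a toy model under stated assumptions, not the PureCake definitions: all class and function names (`Prim`, `Monadic`, `compile_prim`, `eval_cbv`, ...) are hypothetical, only the `"Ret"` to `Return` mapping is taken from the text above, and `Tick` merely records when an expression is evaluated.

```python
from dataclasses import dataclass

# Toy syntax; all names are illustrative assumptions.
@dataclass
class Lit:
    value: int

@dataclass
class Tick:
    label: str    # recorded in the trace when the expression is evaluated
    body: object

@dataclass
class Prim:       # source-level constructor application
    cons: str
    args: list

@dataclass
class Cons:       # target-level ordinary constructor
    name: str
    args: list

@dataclass
class Delay:
    body: object

@dataclass
class Monadic:    # target-level monadic operation, distinct from Cons
    mop: str
    args: list

@dataclass
class ConsV:
    name: str
    args: list

@dataclass
class Thunk:
    body: object

# Only "Ret" -> "Return" comes from the issue text; other entries would be assumptions.
MONADIC_OPS = {"Ret": "Return"}


def compile_prim(e):
    """Delay the arguments of ordinary constructors, but not of monadic operations."""
    if isinstance(e, Prim):
        args = [compile_prim(a) for a in e.args]
        if e.cons in MONADIC_OPS:
            return Monadic(MONADIC_OPS[e.cons], args)
        return Cons(e.cons, [Delay(a) for a in args])
    if isinstance(e, Tick):
        return Tick(e.label, compile_prim(e.body))
    return e  # literals are unchanged


def eval_cbv(e, trace):
    """Call-by-value evaluation in which monadic operations halt immediately."""
    if isinstance(e, Lit):
        return e.value
    if isinstance(e, Tick):
        trace.append(e.label)
        return eval_cbv(e.body, trace)
    if isinstance(e, Cons):
        return ConsV(e.name, [eval_cbv(a, trace) for a in e.args])
    if isinstance(e, Delay):
        return Thunk(e.body)
    if isinstance(e, Monadic):
        return e  # eval (monadic mop es) = monadic mop es; arguments stay unevaluated
    raise TypeError(f"unknown expression: {e!r}")


source = Prim("Ret", [Tick("evaluated", Lit(1))])
compiled = compile_prim(source)
trace = []
result = eval_cbv(compiled, trace)
print(compiled, trace)
```

Here the monadic node doubles as its own weak-head value and still contains unevaluated expressions, so no `Delay` is needed to keep its argument suspended, while an ordinary constructor such as `Prim("Pair", ...)` is still compiled with delayed arguments.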
Pros and cons
This could potentially fix the issues discussed above. In particular, ThunkLang/EnvLang could have more natural semantics, `env_to_state` could be much simpler and produce better code, and overall fewer invariants/special cases would need to be carried through compilation. Moreover, compilation of monadic operations is easier to specify, verify, and understand: they are passed down the compiler essentially unchanged, until StateLang when they are all compiled away. Currently we have a change in `pure_to_thunk` which is cumbersome in later languages, followed by the actual compilation in StateLang. Overall, it feels like monadic operations could have a much cleaner compilation story with this change.

Of course, this would be a significant change to the compiler. But in a sense, we are doing the work already, just more messily - we enforce `Delay`ed monadic arguments (modelling weak-head evaluation) until StateLang anyway with invariants and very careful compilation, but this change would effectively make it happen "for free" (in the sense that we don't need to assert or enforce it).

This new construct would require mutual recursion between expressions and values. Fortunately ThunkLang already has this - so EnvLang would be the biggest change, and no optimisation happens in EnvLang, minimising this burden.