-
Notifications
You must be signed in to change notification settings - Fork 0
Proposal for closure reform
This is more of a meta-proposal, in which I gather all the different ways we might fix the mess that is closures in the current system. There are a bunch of features we want that are sometimes at odds.
For an outline of a specific set of possible features, see [here] (https://github.com/mozilla/rust/wiki/Proposal-for-closure-reform-%28specific%29).
We want library functions to be able to express that an argument closure will be called at most once (#2549). This will let callers capture noncopyable values and move them out in the closure body (e.g., sending an ARC endpoint over a pipe to another task).
Closures behave differently / have different capabilities depending on the kind bounds satisfied by the stuff in their environment. Closures with Const
environments can't be used for the soundness exploit described in the next subsection. Closures with Owned
environments (that are furthermore allocated on the ~-heap) can be used to spawn tasks. Closures with Copy
(Clone
) environments can be duplicated and used many times (as in #2830).
Now that &mut T
borrowed pointers are noncopyable, there is a soundness bug as demonstrated here. The gist is that, since closure environments are "opaque", a noncopyable mutable borrow can be captured in a closure, and then the closure can be copied somewhere the borrow information is not known.
(aside: This was not a problem for accidentally copying other noncopyables, since they could never be captured by-value, only "implicitly borrowed", which is why it wasn't legal to move them out.)
The recent proposal for unsized types (#6308) intersects the above soundness issue in a perilous corner-case, as explained [here] (http://smallcultfollowing.com/babysteps/blog/2013/05/13/recurring-closures-and-dynamically-sized-types/). If &fn()
can be promoted to &T
, it can be copied unconditionally, which is a problem if &fn()
s can be called to mutate their upvars.
(That's a carefully-stated condition, because there are a number of ways to prevent this: &fn()
s can't be promoted, &fn()
s can't be called, &fn()
s can't mutably borrow their environments... each with different details and different degrees of feasibility.)
pcwalton has been thinking about custom smart pointers a lot recently. It would be nice if the language did not bless @
and ~
in any way in regards to ability to create environments.
There are a bunch of ways to clean up the closure situation. The way I see it, there are 4 major decisions to make, all orthogonal to each other. I list them in rough increasing order of controversiality.
-
Default bounds are enforced based on the type of the closure (as today). Stack closures can have lifetimes; heap closures can only close over
Owned
data; neither type of closure can ever beConst
(as today). -
Function types can list, and hence inherit, kind bounds, as in
~fn:Const+Owned()
(such a function could be put inside of an ARC). Lifetime bounds can also be written, which would be useful for combinator-/curry-style functions, as infn foo(x: &'a MyState) -> @fn:'a() -> int
. (#3569)
-
No oneshot closures, those passed to higher-order functions can be called arbitrarily many times, and captured upvars can't be moved-out (as today). To move-in/move-out, extra library functions would have to be written to take and pass an extra argument (
do rwarc.read(val) |val, state| { ... }
). The con here should be obvious. -
Oneshot functions are written
once fn()
. These are consumed when called, and of course noncopyable. The con is an extra keyword. -
Closures are always consumed when you call them. Non-oneshot closures are indicated by
fn:Copy()
(later,fn:Clone()
). This is "theoretically nicest" -- it just works from the rest of the theory with no extra rules -- but has an uglier "default scenario" than case 1 (that is, all iterator functions, etc, must write kind bounds on their functions).
-
Stack closures can implicitly borrow; heap closures can't (as today). To put a borrowed pointer in a heap closure (assuming option I-2), you have to make another variable that explicitly borrows.
-
Replace implicit borrows with explicit ones, akin to
ref
pattern-bindings, introducing the option to have a capture clause. This makes it clearer when the closure wants to take ownership of something and how long closure-borrows last, and eliminates the "can't move out of captured upvar" error. The downside is referenced upvars get more verbose when autoderef doesn't apply (e.g.,let mut x = 0; do something |ref x| { (*x)++; }
).
In all of these cases (except B1), &fn()
becomes noncopyable, unless it has a :Const
bound. @fn()
is a big problem if its environment is mutable, so any proposal that keeps @fn()
s (A4, B1, B2, B3, B4) would have to add a dynamic "recursion barrier" akin to the write barrier on @mut T
boxes (not needed on a @fn:Const()
).
A. Make closure types statically-sized.
In all of these proposals, heap closures can't be promoted to borrowed closures. I don't know of any demand for doing so, and in any case you can always just do the manually-currying thing of |args| { heap_closure(args) }
.
-
&fn()
becomesfn()
,@fn()
is killed, and~fn()
becomesproc()
(variant 2 of [this idea] (http://smallcultfollowing.com/babysteps/blog/2013/05/14/procedures/)). The con here is that it breaks the "no dynamic allocation unless you see a ~ or a @" rule. -
&fn()
becomesfn()
,@fn()
is killed, and~fn()
becomes~proc()
(variant 3 of [this idea] (http://smallcultfollowing.com/babysteps/blog/2013/05/14/procedures/)). Semantically this is no different from case 1, but the con is thatfn
is a first-class (sized) type, whileproc
is a second-class (unsized) type, which I think has a lot of confusingness potential. -
&fn()
becomesfn()
,@fn()
and~fn()
are killed and replaced by macros that use@Trait
or~Trait
to accomplish the same thing (especially in task spawning) ([this idea] (http://smallcultfollowing.com/babysteps/blog/2013/05/30/removing-procs/)). -
&fn()
becomesfn&()
(fn()
can be sugar for this),@fn()
becomesfn@()
, and~fn()
becomesfn~()
. This retains most expressivity, but the con is that "common wisdom" says pointer sigils after types confuse people. (I don't mind; to me it suggests "I'm not a pointer but I have one inside". It's just parameterising over pointer types.)
B. Make closure types dynamically-sized ("always have to be behind a pointer"), but prevent them from being copied.
-
Borrowed closures can't be called at all as
&fn()
; you have to use&mut fn()
. This has a horrible "default case", and also prevents oneshot closures from working (since you could just re-borrow the&mut
to call it twice in a row, even though Y-combinator-like recursion is prevented). -
Allow unsized promotion to
&T
, but restrict when generic&T
can be copied. This could be done with a new kind bound, such asfn copy_ref<T: NoMutableEnvironment>(x: &'a T) -> (&'a T, &'a T)
. This allows oneshot closures, but makes&T
's behaviour depend onT
. -
Don't allow closures to be promoted to
&T
(although vectors still can be). The con is the weird discrepancy between vectors and closures ("why can I do... but can't...?"). Note that this is almost identical to case 4, but the pointer sigils stay in front instead of behind thefn
(except this still lets~fn()
be promoted to&fn()
). -
Don't do dynamically-sized types at all. This also saves the pain of introducing
Sized
bounds into all data structure functions, but of course precludes the future possibilities discussed in #6308.
Ultimate preferences
I: 2 for sure.
II: prefer 3; 2 would be OK.
III: prefer 2; 1 would be OK.
IV: in order of preference: B3, A4, A3, B4, B2.
Scheduling for 1.0 vs post-1.0
I: For the most part, we could leave option 1 and add option 2 post-1.0. However, for backwards-compatibility, we have to make sure that the "default bounds" in option 1 are "no bounds at all" (e.g., if v1.0's ~fn()
means ~fn():Owned
with post-1.0-option-2, then some v1.0 code that uses ~fn()
s might assume Owned
capabilities and break later). Since task body closures need to own their environment, we can't make no default bounds on ~fn()
s, so either:
- We decide between option 1 or 2 pre-1.0 and stick with it forever
- We get rid of all heap closures (Niko's idea, IV-A-3), and can then make all default environment bounds empty.
However, note they add a piece of expressivity that we don't yet have (we can't put bounds on a Trait's existential data yet either).
II: Options 1 (nothing) and 2 (once
) are future-proof (that is, option 2 can be safely added post-1.0 if we reserve once
). Option 3 (making fn:Clone
mean not-once) is not future-proof, because it removes capabilities from plain fn()
s, so it requires both itself and option I-2 to happen pre-1.0.
III: Changing from either option to the other is not future-proof, no way, no how.
IV: Niko's idea, A3, can happen for 1.0 and later be upgraded to A1, A2, A4, or B3. Option B4 can be upgraded later to option B3 (that is, deferring dynamically-sized types to post-1.0). The other options are stickier.
Recommended plan:
- For bounds and oneshots: Because of my preference for II-3, and the note about expressivity, I want I-2 and II-3 to happen pre-1.0. If II-2 outvotes II-3, then (assuming IV-A3) I-2 and II-2 can both happen independently either pre- or post-1.0.
- For environment borrowing: I think III-2 is theoretically prettier, but it is a lot of churn, so I won't be offended if we do III-1.
- For fixing the soundness bug: IV-A3 is clearly the most flexible, both for rushing to 1.0 and for getting a nice final product.
- Point-by-point comments:
- Once-ness
- It would be useful to come up with a list of functions where
once-ness is applicable.
option::each
is certainly a prime example, though you can so easily usematch
that it's not especially persuasive to me.
- It would be useful to come up with a list of functions where
once-ness is applicable.
- Soundness with
&mut
- The problem is not that
&mut
is captured by value whereas other types are not. The problem is that they are captured by reference, which means that the type of an upvar is really something like&&mut
, but we don't recognize that in the type system. Of course, if we did, closures would be nigh unusable, hence the need to change how we treat closures to track them more linearly.
- The problem is not that
- Soundness with first-class DST
- Sounds about right, though I wouldn't say that
&fn
is "promoted" to&T
, more thatT
can be bound tofn
- Sounds about right, though I wouldn't say that
- Environment bounds
- Option 1 is incompatible with DST
- Onceness
- Option 3 was considered a long time ago and rejected because it
makes calling functions repeatedly very expensive, and this is
not exactly uncommon. Moreover, there is now a very large
category of functions that can be safely called multiple times
but cannot safely be copied (e.g., any closure that references
&mut
). - A type like
once fn
only works if fns are values. In other words, it is incompatible with DST, and also... largely incompatible with the notation of&once fn
, unless we are really stretching what&
means. This is something I didn't fully realize at first. Therefore, if you wanted to keep the@/~/&fn
scheme we have today,once
would effectively becomea sigil of its own. But that is not adequate, since you must differentiate "borrowed" once closures, where the memory is on the stack, from "owned" once closures. So you want something likeonce fn&
. Or something.
- Option 3 was considered a long time ago and rejected because it
makes calling functions repeatedly very expensive, and this is
not exactly uncommon. Moreover, there is now a very large
category of functions that can be safely called multiple times
but cannot safely be copied (e.g., any closure that references
- Implicit vs expicit borrows in the environment
- Though this doesn't really relate to point 1 or 2, you raise an
interesting point that when we consider what a type like
fn:K
means, I think the boundK
needs to consider the "true type" of the closed over variables. The reason is that we want to say thatfn:K
has kindK
, essentially. In other words, if a variable of typeT
is mutated, the "true type" of the upvar would be&mut T
, notT
. This implies that a function likefn:Owned
could never be used, since herefn
is always a closure with borrowed pointers to the creating environment, and thus will never contain onlyOwned
values. There is still a use forfn:Share
orfn:Freeze
, which is roughly the equivalent of the oldpure fn
(minus the parametricity).
- Though this doesn't really relate to point 1 or 2, you raise an
interesting point that when we consider what a type like
- IV. Fix the soundness bug
-
@fn
is not really a problem. Basically the rule would be something like that "to call a fn, you first borrow it to&mut fn
". This would just not be possible for&fn
or@fn
. As you point out, you could potentially loosen this rule and say "if the fn has boundFreeze
, you first borrow it to&fn
". Then it would fine for@fn
and&fn
.@mut fn
would impose dynamic checks as before.However, if we permit recursion for
fn:Freeze
, we would have to ensure that afn:Freeze
could not move from values in any way. I had in mind permitting functions to move from a value as long as they replace it before returning; this is useful when writing things fold patterns. -
A. Options 1--4 sound about right.
-
B. Options 1--4 sound about right.
-
- Once-ness
Instead of going for one of the listed options, I instead propose that functions become objects which implement an trait rather than being a distinct type. The full proposal is here.