From cf061d2cac7e0cfb12b1416dd4c2f3cb59c76d37 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Hornick=C3=BD?= Date: Thu, 10 Oct 2019 22:07:55 +0200 Subject: [PATCH 01/15] Create 0000-unified_coroutines.md --- text/0000-unified_coroutines.md | 411 ++++++++++++++++++++++++++++++++ 1 file changed, 411 insertions(+) create mode 100644 text/0000-unified_coroutines.md diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md new file mode 100644 index 00000000000..a544ab91c8c --- /dev/null +++ b/text/0000-unified_coroutines.md @@ -0,0 +1,411 @@ +- Feature Name: `unified_coroutines` +- Start Date: 2019-10-09 +- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) +- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000) + +# Summary +[summary]: #summary + +Unify the `Generator` traits with the `Fn*` family of traits. Add a way to pass to pass new arguments upon each resuming of the generator. + +# Motivation +[motivation]: #motivation + +The generators/coroutines are extremely powerful concept, but its implementation in Rust is severely limited, and its usage requires workarounds to achieve useful patterns. The current view of generators is also extremely disconnected from similar concepts, which already exist in the language and the standard library; + + +# Guide-level explanation +[guide-level-explanation]: #guide-level-explanation + +Consider a function. In the olden days, also called a subroutine. This concept is at the core of every programming language, since it is extremely useful. The subroutine has a single entrypoint and a single exit point. Both of these points provide +an interface, which is used to transfer data. Upon entering, caller can pass data into the subroutine and upon exiting, the subroutine can return data back to the caller. + +Many programming languages(including Rust) also adopted more general concept, callled coroutine. The coroutine , or `Generator` in Rust terms differs from function/subroutine in a single way. It allows the coroutine to `suspend` itself, by storing its state, and `yield`ing control back to the caller, along with data. The caller can then repeatedly pass more data back into the coroutine, and `resume` it. This back-and forth communication is an extremely useful tool for solving a wide class of problems. + +Except this is not the truth for Rust's coroutines. The `Generator`, as it was introduced in order to provide a tool for implementing `async-await`, does not provide this functionality. In order to implement async-await feature, the generators were implemented in their most basic form. And in this form, Generators can't accept arguments, and are connected to closures only in their syntax, and not their representation in the type system. + +These issues severely lessen the usability of the generator feature, and are not difficult to solve. + +Current generators take the form of a closure, which contains at least a single yield statement. + +```rust +let name = "World"; +let gen = || { + yield "Hello"; + yield name; +}; +``` +And are used by calling the `resume` method on the generator. +```rust +println!("{}", gen.resume()); +println!("{}", gen.resume()); +``` + +Which results in +```rust +Yielded("Hello") +Yielded("World") +``` + +This RFC proposes the ability of a generator to take arguments with a syntax used by closures. +```rust +let gen = |name : &'static str| { + yield "Hello"; + yield name; +} +``` + +Then, we propose a way to pass the arguments to the generator in the form of a tuple. +```rust +println!("{}", gen.resume(("Not used")); +println!("{}", gen.resume(("World")); +``` +Which would also result in: +```rust +Yielded("Hello") +Yielded("World") +``` +Notice that the argument to first resume call was unused, and the generator yielded only only the value which was passed to the second resume. This behavior is radically different from the first example, in which the name variable from outer scope was captured by generator and yielded with the second `yield` statment; + +The behavior, in which a generator consumes a different value upon each resume is currently not possible without introducing some kind of side channel, like storing the expected value in thread local storage, which is what the current implementation of async-await does. + +There are possible other syntaxes to denote the fact that the value assigned to `name` is different after each `yield`, but we believe that the simplest syntax, which is used in the example above, is in this case the best. + +### Alternative syntaxes +Assigning into the name to denote that the value is changed +```rust +let gen = |name :&static str| { + name = yield "hello"; + name = yield name; +} +``` +Or +```rust +let gen = |name :&static str| { + let (name, ) = yield "hello"; + let (name, ) = yield name; +} +``` +We think that both of these approaches are wrong. In the first approach, we are unable to denote assignment to multiple values at the same time, and the second example denotes shadowing of the `name` binding, which also is incorrect behavior. Another issue with these approaches is, that they require progmmer, to write additional code to get the default behavior. In other words: What happens when user does not perform the required assignment ? Simply said, this code is permitted, but nonsensical: +```rust +let gen = |name :&static str| { + yield "hello"; + let (name, ) = yield name; +} +``` +The example looks like we aren't assigning to the `name` binding, and therefore upon the second yield we should return the value which was passed into the first resume function, but the implementation of such behavior would be extremely complex and probably would not correspond to what user wanted to do in the first place. Another problem is that this design would require making `yield` an expression, which would remove the correspondence of `yield` statement with the `return` statement. + +The design we propose, in which the generator arguments are mentioned only at the start of the generator most closely resembles what is hapenning. And the user can't make a mistake by not assigning to the argument bindings from the yield statement. Only drawback of this approach is, the 'magic'. Since the value of the `name` is magically changed after each `yield`. But we pose that this is very similar to a closure being 'magically' transformed into a generator if it contains a `yield` statement and as such is acceptable amount of 'magic' behavior for this feature. + +![magic](https://media2.giphy.com/media/12NUbkX6p4xOO4/giphy.gif) + +Nonetheless, the introduction of this implicit behavior will require additional cognitive load for new users when learning this feature. However, the behavior of Generators without arguments is unchanged, and therefore this change does not impose this cost upfront, making it possible to introduce the more complex behavior in progressively more complex examples. + +This change would result in following generator trait. + +```rust +pub trait Generator { + type Yield; + type Return; + fn resume(self: Pin<&mut Self>, args: Args) -> GeneratorState; +} +``` + +While the RFC does not deal with the lifetimes of the arguments, the similarity of the modified `Generator` trait with the existing `Fn*` traits suggests that rules which currently apply to closures will also apply to generators. [More info later](theoretical-basis). + +### Use cases: +1. Futures generated by async-await. The current implementation of async futures requires the use of thread-local storage in order to pass the `task::Context` argument into underlying futures. This imposes small, but not zero overhead, which would be removed by this RFC. + +2. Protocol state machines - When a user wants to implement a state machine in order to correctly represent a network protocol, +the ususal approach is to create a `State` enum, and upon every state machine transition `mem::replace`the current state with a default one, and perform a match, to possibly generate a new state, which is then stored into the place of original state. + +Example: +Consider an implementation of following state machine: +``` + a b + +-----+ +----+ + | | | | + | +--v--+ b +-----+ | + | | +-----> | | + +--+ F | | S <-+ + | <-----+ | + +-----+ a +-----+ + +``` +How the similar state machines are implemented today: +```rust +enum State { Empty, First, Second } + +enum Event { A, B } + +fn machine(state: &mut State, event: Event) -> &'static str { + match (mem::replace(state, State::Empty), event) { + (State::First, Event::A) => { + *state = State::First; + return "Action First(A)"; + } + (State::First, Event::B) => { + *state = State::Second; + return "Action First(B)"; + } + (State::Second, Event::A) => { + *state = State::First; + return "Action Second(A)"; + } + (State::Second, Event::B) => { + *state = State::Second; + return "Action Second(B)"; + } + } +} +``` +How we could implement similar state machines after this RFC is accepted: +```rust +let machine = |action| { + loop { + // First state + while action == Action::A { + yield "Action First(A)"; + } + // Second state + yield "Action First(B)"; + while action == Action::B { + yield "Action Second(B)"; + } + yield "Action Second(A)"; + } +}; + ``` +To see how would the generated code change, check out [This sample](addendum-samples) + +# Reference-level explanation +[reference-level-explanation]: #reference-level-explanation +The proposed design modifies the Generator trait, and the MIR which is generated for generators. + +The proposed changes to generator trait are pretty straightforward and do not change its properties significantly. However, they allow us to unify the language view of a closure with a that of a generator. + +The implementation of MIR generation will be more complex, and the author of this RFC is unable to properly gauge the amount of work that will be required. + +### Theoretical basis +[theoretical-basis]: #theoretical-basis +The goal of this RFC is to unify the Rust's implementation of Generators, and the theoretical concept of 'Coroutine' as a generalization of the 'Subroutine/Function', and with this unification also comes the unification of rusts Generators and Functions for free. + +The current implementaion servers its purpose (at least for async-await), but is sevely limited and disjointed from the rest of the language. By introducing the arguments in the same way that they are represented in proposed `Fn` traits, we can bring these 2 concepts more closely together. Additinal info about this unification can be found in [future-possibilities], but if the generator arguments are introduced in proposed form, the future modifications will be just syntax improvements, and will not change the semantics in a significant way. + +Example of current `Fn*` traits: +```rust +pub trait FnOnce { + type Output; + fn call_once(self, args: Args) -> Self::Output; +} +pub trait FnMut : FnOnce { + fn call_mut(&mut self, args: Args) -> Self::Output; +} +``` +And a proposed `Generator` trait: +```rust +pub trait Generator { + type Yield; + type Return; + + fn resume(self: Pin<&mut Self>, args: Args) -> GeneratorState; +} +``` +Considering the closeness of these 2 concepts, it might be beneficial to utilize a different form of the `Generator` trait. +```rust +pub trait FnGen : FnOnce> { + type Yield; + type Return; + + fn call_resume(self: Pin<&mut Self>, args: Args) -> Self::Output; +} +``` +It might also be beneficial to introduce a new trait for denoting closures which must be pinned between invocations: +```rust +pub trait FnPin : FnOnce { + fn call_pin(self: Pin<&mut Self>, args: Args) -> Self::Output; +} +``` +And either utilize the `FnPin` as a `FnGen/Generator` supertrait, or disregard the `FnGen/Generator` trait completely +and utlize generators as a trait alias for a `FnPin>` + +# Drawbacks +[drawbacks]: #drawbacks + +1. Increased complexity of implementation of the Generator feature. + +2. If we only implement necessary parts of this RFC, users will need to pass empty tuple into the `resume` function for most common case, which could be solved by introducing a a trivial trait +```rust +trait NoArgGenerator : Generator<()> { + fn resume(self : Pun<&mut Self>) -> GeneratorState { + self.resume_with_args(()) + } +} +``` +If we introduce this trait, and rename the original `Generator::resume` method to `Generator::resume_with_args`, the existing behavior will not change. But we think this approach is **WRONG**, and the Generator trait should stay with the changes proposed above (`resume`/`call_resume` accepting a tuple). The rationale for this decision is provided in [future-possibilities] section. + +3. Need to teach the special interaction between generator arguments and the yield statement. + +# Rationale and alternatives +[rationale-and-alternatives]: #rationale-and-alternatives + +Rationale: +- This introduces the missing piece into the generators, rounding out an unfinished feature. +- Unifies 2 parts of the language, improving laungage coherence +- Allows implementations of new patterns, such as complex state machines, which are an extremely useful tool in several areas, eg. network protocol implementations. + +Alternatives: +- Only current alternative is storing required information inside side channels such as `Rc>` +or a thread local storage, which introduces runtime overhead and requires `std`. Proposed design is `no_std` compatible. +- The proposed syntax could be changed, but from the explored options, we pose that the simplest syntax is best, even though it introduces new semantics. +- Leave the generator as is, leaving it disconnected from the rest of the language. +- Remove generators completely + +# Prior art +[prior-art]: #prior-art + +- Pre-RFC published by different author on rust-internals forum [link](https://internals.rust-lang.org/t/pre-rfc-generator-resume-args/10011/5) + + Explored the design space and proposed a basic design of the modified `Generator trait`. The described approach solved only the 'Generator resume arguments' part of this RFC, and did not attempt to unify generators with closures, which resulted in unnecessarily complex design, which kept the generators as a separate concept, and even introduced another layer of complexity in form of a trait alias for generators with/without arguments. But nonetheless, reading this discussion was an invaluable source of ideas in this design space. + +- Work on implementing futures which wouldn't require TLS: [link](https://github.com/rust-lang/rust/issues/62918) + +- Python & Lua coroutines - They can be resumed with arguments, with yield expression returning these values [usage](https://www.tutorialspoint.com/lua/lua_coroutines.htm). + + These are interesting, since they both adopt a syntax, in which the yield expression returns values passed to resume. We think that this approach is right one for dynamic languages like Python or lua but the wrong one for Rust. The reason is, these languages are dynamically typed, and allow passing of multiple values into the coroutine. The design proposed here is static, and allows passing only a single argument into the coroutine, a tuple. The argument tuple is treated the same way as in the `Fn*` family of traits. + +# Unresolved questions +[unresolved-questions]: #unresolved-questions + +- Proposed syntax: Do we somehow require assignemnt from yield expression(As outlined by [different pre-rfc](https://internals.rust-lang.org/t/pre-rfc-generator-resume-args/10011)), or we do we specify arguments only at the start of the coroutine, and require +explanation of the different behavior in combination with the `yield` keyword explanation ? + +- Do we unpack the coroutine arguments, unifying the behavior with closures, or do we force only a single argument and encourage the use of tuples ? + +- Do we allow non `'static` coroutine arguments ? How would they interact with the lifetime of the generator, if the generator moved the values passed into `resume` into its local state ? + +- Do we adopt the `FnGen` form of the generator trait and include it into the `Fn*` trait hierarchy making it first class citizen in the type system of closures ? + +- Do we introduce the `FnPin` trait into the `Fn*` hierarchy and make `FnGen/Generator` just an alias ? + +# Future possibilities +[future-possibilities]: #future-possibilities +One of the areas of improvement, is interaction with generators. Currently the generator takes the form of a closure, but the resume method is called like a trait method. One of the future improvements would be making generators callable like closures, since this RFC recomments their unification with closures. + +So, current syntax with 2 arguments looks like: +```rust +let mut gen = Pin::box(|name| { + yield ("Hello", name); + yield ("Bye", name); +}); +gen.resume(("unused", "arg")); +gen.resume(("world", "!")); +``` +Would become +```rust +let mut gen = Pin::box(|name| { + yield ("Hello", name); + yield ("Bye", name); +}); +let _ = gen("unused", "arg"); +let second = gen("world", "!"); +``` + +And this is would fully unify the interface provided by generators with the one provided by closures, but is intertwined with other issues, like [Fn traits](https://github.com/rust-lang/rust/issues/29625) and thus would block the current RFC. Therefore we propose generator syntax accepting the multiple arguments: +```rust +let gen = |a,b,c| { + yield a; + yield b; + yield c; +} +``` +But the `Generator::resume` method accepting tuple of arguments, which are unpacked by the compiler. +```rust +let a = gen.resume(("Why", "Hello", "There !")); +let b = gen.resume(("Why", "Hello", "There !")); +let c = gen.resume(("Why", "Hello", "There !")); +``` +Since this approach most closely resembles current approach to Function traits, and since we consider the concept of a Coroutine/Generator a generalization of the Function concept we aim to unify these 2 concepts. Then, the further work that deals with closures could be transparently applied to generators. + +However, the main goal of this RFC is to provide a basis for these decisions and discussions after the `FnGen/Generator` trait is introduced, and the ability of generators to accept arguments is implemented. + +# Addendum: samples +[addendum-samples]: #addendum-samples + +The Generator concept is transformed into state machine on the MIR level, which is contained inside single function. The current implementation is transformed to something like this: + +```rust +let captured_string = "Hello"; +let mut generator = { + enum __Generator { + Start(&'static str), + Yield1(&'static str), + Done, + } + + impl Generator for __Generator { + type Yield = i32; + type Return = &'static str; + + fn resume(mut self: Pin<&mut Self>) -> GeneratorState { + use std::mem; + match mem::replace(&mut *self, __Generator::Done) { + __Generator::Start(s) => { + *self = __Generator::Yield1(s); + GeneratorState::Yielded("Hello") + } + + __Generator::Yield1(s) => { + *self = __Generator::Done; + GeneratorState::Complete(s) + } + + __Generator::Done => { + panic!("generator resumed after completion") + } + } + } + } + + __Generator::Start(captured_string) +}; +``` + +After implementing changes in this RFC, the generated code could be approximated by this: + +```rust +let captured_string = "Hello" +let mut generator = { + enum __Generator { + Start(&'static str), + Yield1(&'static str), + Done, + } + + impl Generator<(&'static str,)> for __Generator { + type Yield = i32; + type Return = &'static str; + + fn resume(mut self: Pin<&mut Self>, (name,) : (&'static str,)) -> GeneratorState { + use std::mem; + match mem::replace(&mut *self, __Generator::Done) { + __Generator::Start(s) => { + *self = __Generator::Yield1(s); + GeneratorState::Yielded("Hello") + } + + __Generator::Yield1(s) => { + *self = __Generator::Done; + GeneratorState::Complete(name) + } + + __Generator::Done => { + panic!("generator resumed after completion") + } + } + } + } + + __Generator::Start(captured_string) +}; +``` From ad0e062036d82279e7c8e2d384d2b5b0431e4372 Mon Sep 17 00:00:00 2001 From: Dustin Bensing Date: Thu, 10 Oct 2019 23:41:03 +0200 Subject: [PATCH 02/15] Update 0000-unified_coroutines.md fixed small typos and formatting --- text/0000-unified_coroutines.md | 42 ++++++++++++++++----------------- 1 file changed, 21 insertions(+), 21 deletions(-) diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md index a544ab91c8c..faffd3bfd62 100644 --- a/text/0000-unified_coroutines.md +++ b/text/0000-unified_coroutines.md @@ -6,7 +6,7 @@ # Summary [summary]: #summary -Unify the `Generator` traits with the `Fn*` family of traits. Add a way to pass to pass new arguments upon each resuming of the generator. +Unify the `Generator` traits with the `Fn*` family of traits. Add a way to pass new arguments upon each resuming of the generator. # Motivation [motivation]: #motivation @@ -49,7 +49,7 @@ Yielded("World") This RFC proposes the ability of a generator to take arguments with a syntax used by closures. ```rust -let gen = |name : &'static str| { +let gen = |name: &'static str| { yield "Hello"; yield name; } @@ -57,15 +57,15 @@ let gen = |name : &'static str| { Then, we propose a way to pass the arguments to the generator in the form of a tuple. ```rust -println!("{}", gen.resume(("Not used")); -println!("{}", gen.resume(("World")); +println!("{}", gen.resume(("Not used"))); +println!("{}", gen.resume(("World"))); ``` Which would also result in: ```rust Yielded("Hello") Yielded("World") ``` -Notice that the argument to first resume call was unused, and the generator yielded only only the value which was passed to the second resume. This behavior is radically different from the first example, in which the name variable from outer scope was captured by generator and yielded with the second `yield` statment; +Notice that the argument to first resume call was unused, and the generator yielded only the value which was passed to the second resume. This behavior is radically different from the first example, in which the name variable from outer scope was captured by generator and yielded with the second `yield` statment; The behavior, in which a generator consumes a different value upon each resume is currently not possible without introducing some kind of side channel, like storing the expected value in thread local storage, which is what the current implementation of async-await does. @@ -74,28 +74,28 @@ There are possible other syntaxes to denote the fact that the value assigned to ### Alternative syntaxes Assigning into the name to denote that the value is changed ```rust -let gen = |name :&static str| { +let gen = |name: &static str| { name = yield "hello"; name = yield name; } ``` Or ```rust -let gen = |name :&static str| { +let gen = |name: &static str| { let (name, ) = yield "hello"; let (name, ) = yield name; } ``` -We think that both of these approaches are wrong. In the first approach, we are unable to denote assignment to multiple values at the same time, and the second example denotes shadowing of the `name` binding, which also is incorrect behavior. Another issue with these approaches is, that they require progmmer, to write additional code to get the default behavior. In other words: What happens when user does not perform the required assignment ? Simply said, this code is permitted, but nonsensical: +We think that both of these approaches are wrong. In the first approach, we are unable to denote assignment to multiple values at the same time, and the second example denotes shadowing of the `name` binding, which also is incorrect behavior. Another issue with these approaches is, that they require the programmer, to write additional code to get the default behavior. In other words: What happens when the user does not perform the required assignment? Simply said, this code is permitted, but nonsensical: ```rust -let gen = |name :&static str| { +let gen = |name: &static str| { yield "hello"; let (name, ) = yield name; } ``` -The example looks like we aren't assigning to the `name` binding, and therefore upon the second yield we should return the value which was passed into the first resume function, but the implementation of such behavior would be extremely complex and probably would not correspond to what user wanted to do in the first place. Another problem is that this design would require making `yield` an expression, which would remove the correspondence of `yield` statement with the `return` statement. +The example looks like we aren't assigning to the `name` binding, and therefore upon the second yield we should return the value which was passed into the first resume function, but the implementation of such behavior would be extremely complex and probably would not correspond to what the user wanted to do in the first place. Another problem is that this design would require making `yield` an expression, which would remove the correspondence of `yield` statement with the `return` statement. -The design we propose, in which the generator arguments are mentioned only at the start of the generator most closely resembles what is hapenning. And the user can't make a mistake by not assigning to the argument bindings from the yield statement. Only drawback of this approach is, the 'magic'. Since the value of the `name` is magically changed after each `yield`. But we pose that this is very similar to a closure being 'magically' transformed into a generator if it contains a `yield` statement and as such is acceptable amount of 'magic' behavior for this feature. +The design we propose, in which the generator arguments are mentioned only at the start of the generator most closely resembles what is hapenning. And the user can't make a mistake by not assigning to the argument bindings from the yield statement. Only drawback of this approach is, the 'magic'. Since the value of the `name` is magically changed after each `yield`. But we pose that this is very similar to a closure being 'magically' transformed into a generator if it contains a `yield` statement and as such is an acceptable amount of 'magic' behavior for this feature. ![magic](https://media2.giphy.com/media/12NUbkX6p4xOO4/giphy.gif) @@ -234,10 +234,10 @@ and utlize generators as a trait alias for a `FnPin { - fn resume(self : Pun<&mut Self>) -> GeneratorState { + fn resume(self: Pun<&mut Self>) -> GeneratorState { self.resume_with_args(()) } } @@ -272,21 +272,21 @@ or a thread local storage, which introduces runtime overhead and requires `std`. - Python & Lua coroutines - They can be resumed with arguments, with yield expression returning these values [usage](https://www.tutorialspoint.com/lua/lua_coroutines.htm). - These are interesting, since they both adopt a syntax, in which the yield expression returns values passed to resume. We think that this approach is right one for dynamic languages like Python or lua but the wrong one for Rust. The reason is, these languages are dynamically typed, and allow passing of multiple values into the coroutine. The design proposed here is static, and allows passing only a single argument into the coroutine, a tuple. The argument tuple is treated the same way as in the `Fn*` family of traits. + These are interesting, since they both adopt a syntax, in which the yield expression returns values passed to resume. We think that this approach is the right one for dynamic languages like Python or lua but the wrong one for Rust. The reason is, these languages are dynamically typed, and allow passing of multiple values into the coroutine. The design proposed here is static, and allows passing only a single argument into the coroutine, a tuple. The argument tuple is treated the same way as in the `Fn*` family of traits. # Unresolved questions [unresolved-questions]: #unresolved-questions - Proposed syntax: Do we somehow require assignemnt from yield expression(As outlined by [different pre-rfc](https://internals.rust-lang.org/t/pre-rfc-generator-resume-args/10011)), or we do we specify arguments only at the start of the coroutine, and require -explanation of the different behavior in combination with the `yield` keyword explanation ? +explanation of the different behavior in combination with the `yield` keyword explanation? -- Do we unpack the coroutine arguments, unifying the behavior with closures, or do we force only a single argument and encourage the use of tuples ? +- Do we unpack the coroutine arguments, unifying the behavior with closures, or do we force only a single argument and encourage the use of tuples? -- Do we allow non `'static` coroutine arguments ? How would they interact with the lifetime of the generator, if the generator moved the values passed into `resume` into its local state ? +- Do we allow non `'static` coroutine arguments? How would they interact with the lifetime of the generator, if the generator moved the values passed into `resume` into its local state? -- Do we adopt the `FnGen` form of the generator trait and include it into the `Fn*` trait hierarchy making it first class citizen in the type system of closures ? +- Do we adopt the `FnGen` form of the generator trait and include it into the `Fn*` trait hierarchy making it first class citizen in the type system of closures? -- Do we introduce the `FnPin` trait into the `Fn*` hierarchy and make `FnGen/Generator` just an alias ? +- Do we introduce the `FnPin` trait into the `Fn*` hierarchy and make `FnGen/Generator` just an alias? # Future possibilities [future-possibilities]: #future-possibilities @@ -332,7 +332,7 @@ However, the main goal of this RFC is to provide a basis for these decisions and # Addendum: samples [addendum-samples]: #addendum-samples -The Generator concept is transformed into state machine on the MIR level, which is contained inside single function. The current implementation is transformed to something like this: +The Generator concept is transformed into a state machine on the MIR level, which is contained inside a single function. The current implementation is transformed to something like this: ```rust let captured_string = "Hello"; @@ -371,7 +371,7 @@ let mut generator = { }; ``` -After implementing changes in this RFC, the generated code could be approximated by this: +After implementing the changes in this RFC, the generated code could be approximated by this: ```rust let captured_string = "Hello" From 7f7fd40eb77816e675d08282cd4d8a68eed29503 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Hornick=C3=BD?= Date: Thu, 10 Oct 2019 23:51:17 +0200 Subject: [PATCH 03/15] Reformat syntax options, fix mistakes --- text/0000-unified_coroutines.md | 60 +++++++++++++++++++++++++++++---- 1 file changed, 54 insertions(+), 6 deletions(-) diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md index a544ab91c8c..f1441a26932 100644 --- a/text/0000-unified_coroutines.md +++ b/text/0000-unified_coroutines.md @@ -72,28 +72,56 @@ The behavior, in which a generator consumes a different value upon each resume i There are possible other syntaxes to denote the fact that the value assigned to `name` is different after each `yield`, but we believe that the simplest syntax, which is used in the example above, is in this case the best. ### Alternative syntaxes -Assigning into the name to denote that the value is changed +There are several other possible syntaxes, but each one has its problems. Several are outlined here + +1. Assigning into the name to denote that the value is changed ```rust -let gen = |name :&static str| { +let gen = |name :&'static str| { name = yield "hello"; name = yield name; } ``` -Or +We are unable to denote assignment to multiple values at the same time. + +2. Creating a new binding upon each yield ```rust -let gen = |name :&static str| { +let gen = |name :&'static str| { let (name, ) = yield "hello"; let (name, ) = yield name; } ``` -We think that both of these approaches are wrong. In the first approach, we are unable to denote assignment to multiple values at the same time, and the second example denotes shadowing of the `name` binding, which also is incorrect behavior. Another issue with these approaches is, that they require progmmer, to write additional code to get the default behavior. In other words: What happens when user does not perform the required assignment ? Simply said, this code is permitted, but nonsensical: +We are creating a new binding upon each yield point, and therefore are shadowing earlier bindings. This would mean that by default, the generator stores all of the arguments passed into it through the resumes. Another issue with these approaches is, that they require progmmer, to write additional code to get the default behavior. In other words: What happens when user does not perform the required assignment ? Simply said, this code is permitted, but nonsensical: ```rust let gen = |name :&static str| { yield "hello"; let (name, ) = yield name; } ``` -The example looks like we aren't assigning to the `name` binding, and therefore upon the second yield we should return the value which was passed into the first resume function, but the implementation of such behavior would be extremely complex and probably would not correspond to what user wanted to do in the first place. Another problem is that this design would require making `yield` an expression, which would remove the correspondence of `yield` statement with the `return` statement. +Another issue is, how does this work with loops ? What is the value assigned to `third` in following example ? +```rust +let gen = |a| { + loop { + println!("a : {:?}", a); + let (a,) = yield 1; + println!("b : {:?}", a); + let (a,) = yield 2; + } +} +let first = gen.resume(("0")); +let sec = gen.resume(("1")); +let third gen.resume(("2")); +``` + + +3. Introducing a 'parametrized' yield; +```rust +let gen = | name: &'static str| { + yield(name) "hello"; + yield(name) name; +} +``` +Introduces a new concept of a parametrized statement, which is not used anywhere else in the language, and makes the default behavior store the passed argument inside the generator, making the easiest choice the wrong one on many cases. + The design we propose, in which the generator arguments are mentioned only at the start of the generator most closely resembles what is hapenning. And the user can't make a mistake by not assigning to the argument bindings from the yield statement. Only drawback of this approach is, the 'magic'. Since the value of the `name` is magically changed after each `yield`. But we pose that this is very similar to a closure being 'magically' transformed into a generator if it contains a `yield` statement and as such is acceptable amount of 'magic' behavior for this feature. @@ -101,6 +129,25 @@ The design we propose, in which the generator arguments are mentioned only at th Nonetheless, the introduction of this implicit behavior will require additional cognitive load for new users when learning this feature. However, the behavior of Generators without arguments is unchanged, and therefore this change does not impose this cost upfront, making it possible to introduce the more complex behavior in progressively more complex examples. +Another issue posed by our approach is lifetimes of the generator arguments. +```rust +let gen = |a| { + loop { + println!("a : {:?}", a); + yield 1; + println!("b : {:?}", a); + yield 2; + } +} +let first = gen.resume(("0")); +let sec = gen.resume(("1")); +let third gen.resume(("2")); +``` +In the loop example, the lifetime of `a` is different upon each resuming of the generator, and in the case of generator resuming from the second yield point, the lifetime starts at the end of the generator, and ends at the beginning, which is not expected. +However, if we take into consideration the form generators take when they are transformed into MIR, in this representation the lifetimes the arguments are no different than they would be in a manual `match` based implementation [Seee addendum](addendum-samples) + +### Standard library changes + This change would result in following generator trait. ```rust @@ -258,6 +305,7 @@ Alternatives: - Only current alternative is storing required information inside side channels such as `Rc>` or a thread local storage, which introduces runtime overhead and requires `std`. Proposed design is `no_std` compatible. - The proposed syntax could be changed, but from the explored options, we pose that the simplest syntax is best, even though it introduces new semantics. +- Implement radically new syntax just for generators - Leave the generator as is, leaving it disconnected from the rest of the language. - Remove generators completely From fbf7de15415f58ee9801adcf2f2864840d1e8374 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Hornick=C3=BD?= Date: Fri, 11 Oct 2019 10:24:26 +0200 Subject: [PATCH 04/15] Update 0000-unified_coroutines.md --- text/0000-unified_coroutines.md | 28 ++++++++++++++++++++++++---- 1 file changed, 24 insertions(+), 4 deletions(-) diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md index cbeffdc0d66..c57c51ae983 100644 --- a/text/0000-unified_coroutines.md +++ b/text/0000-unified_coroutines.md @@ -81,8 +81,26 @@ let gen = |name :&'static str| { name = yield name; } ``` -We are unable to denote assignment to multiple values at the same time, and would therefore have to revert to using a tuple and possibly some kind of -destructuring assignment. +We are unable to denote assignment to multiple values at the same time, and would therefore have to revert to using a tuple and possibly some kind of destructuring assignment. The problem is the non-coherence between receiving arguments for the first time, +and upon multiple resumes. This however is only a syntactic inconvenience, and as such we think that this approach is a very good possible choice. + +If we could perform tuple destructuring when assigning: +```rust +let gen = |name :&'static str, val : i32| { + name, val = yield "hello"; + or + (name, val, ) = yield name; +} +``` +Or if we could 'pack' the arguments into tuple: +```rust +let gen = |..args| { + args = yield "hello"; + or + args = yield name; +} +``` +This syntactic choice would probably be the better one. However, we do not want to introduce a behavior, which would separate teh generators from closures. 2. Creating a new binding upon each yield ```rust @@ -129,7 +147,9 @@ The design we propose, in which the generator arguments are mentioned only at th ![magic](https://media2.giphy.com/media/12NUbkX6p4xOO4/giphy.gif) -Nonetheless, the introduction of this implicit behavior will require additional cognitive load for new users when learning this feature. However, the behavior of Generators without arguments is unchanged, and therefore this change does not impose this cost upfront, making it possible to introduce the more complex behavior in progressively more complex examples. +But, like shia, this point is controversial, and main issue that prevented us from adding generator arguments to the language in the first place. + +The introduction of this implicit behavior will require additional cognitive load for new users when learning this feature. However, the behavior of Generators without arguments is unchanged, and therefore this change does not impose this cost upfront, making it possible to introduce the more complex behavior in progressively more complex examples. Another issue posed by our approach is lifetimes of the generator arguments. ```rust @@ -239,7 +259,7 @@ The implementation of MIR generation will be more complex, and the author of thi [theoretical-basis]: #theoretical-basis The goal of this RFC is to unify the Rust's implementation of Generators, and the theoretical concept of 'Coroutine' as a generalization of the 'Subroutine/Function', and with this unification also comes the unification of rusts Generators and Functions for free. -The current implementaion servers its purpose (at least for async-await), but is sevely limited and disjointed from the rest of the language. By introducing the arguments in the same way that they are represented in proposed `Fn` traits, we can bring these 2 concepts more closely together. Additinal info about this unification can be found in [future-possibilities], but if the generator arguments are introduced in proposed form, the future modifications will be just syntax improvements, and will not change the semantics in a significant way. +The current implementation serves its purpose (at least for async-await), but is sevely limited and disjointed from the rest of the language. By introducing the arguments in the same way that they are represented in proposed `Fn` traits, we can bring these 2 concepts more closely together. Additinal info about this unification can be found in [future-possibilities], but if the generator arguments are introduced in proposed form, the future modifications will be just syntax improvements, and will not change the semantics in a significant way. Example of current `Fn*` traits: ```rust From 3aa99ef5ab2392b9514e93003eedbade30bfa921 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Hornick=C3=BD?= Date: Fri, 11 Oct 2019 10:25:06 +0200 Subject: [PATCH 05/15] Update 0000-unified_coroutines.md --- text/0000-unified_coroutines.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md index c57c51ae983..5bf73f028bb 100644 --- a/text/0000-unified_coroutines.md +++ b/text/0000-unified_coroutines.md @@ -100,7 +100,7 @@ let gen = |..args| { args = yield name; } ``` -This syntactic choice would probably be the better one. However, we do not want to introduce a behavior, which would separate teh generators from closures. +This syntactic choice would probably be the better one. However, we do not want to introduce a behavior, which would further separate generators from closures. 2. Creating a new binding upon each yield ```rust From 60302a69767073f2d5225dfd52676c718eb3cf10 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Hornick=C3=BD?= Date: Fri, 11 Oct 2019 13:27:32 +0200 Subject: [PATCH 06/15] Update 0000-unified_coroutines.md --- text/0000-unified_coroutines.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md index 5bf73f028bb..4b36c0b5cd0 100644 --- a/text/0000-unified_coroutines.md +++ b/text/0000-unified_coroutines.md @@ -96,11 +96,10 @@ Or if we could 'pack' the arguments into tuple: ```rust let gen = |..args| { args = yield "hello"; - or - args = yield name; + args = yield args.name; } ``` -This syntactic choice would probably be the better one. However, we do not want to introduce a behavior, which would further separate generators from closures. +This syntactic choice would probably be the better one. Making the change of the `name` explicit. However, we do not want to introduce a behavior, which would further separate generators from closures. 2. Creating a new binding upon each yield ```rust @@ -143,11 +142,11 @@ let gen = | name: &'static str| { Introduces a new concept of a parametrized statement, which is not used anywhere else in the language, and makes the default behavior store the passed argument inside the generator, making the easiest choice the wrong one on many cases. -The design we propose, in which the generator arguments are mentioned only at the start of the generator most closely resembles what is hapenning. And the user can't make a mistake by not assigning to the argument bindings from the yield statement. Only drawback of this approach is, the 'magic'. Since the value of the `name` is magically changed after each `yield`. But we pose that this is very similar to a closure being 'magically' transformed into a generator if it contains a `yield` statement and as such is an acceptable amount of 'magic' behavior for this feature. +The design we propose, in which the generator arguments are mentioned only at the start of the generator most closely resembles what is hapenning. And the user can't make a mistake by not assigning to the argument bindings from the yield statement. Only drawback of this approach is, the 'magic'. Since the value of the `name` is magically changed after each `yield`. But we pose that this is very similar to a closure being 'magically' transformed into a generator if it contains a `yield` statement. ![magic](https://media2.giphy.com/media/12NUbkX6p4xOO4/giphy.gif) -But, like shia, this point is controversial, and main issue that prevented us from adding generator arguments to the language in the first place. +But, like shia himself, this point is controversial, and is the main issue that prevented us from adding generator arguments to the language in the first place. The introduction of this implicit behavior will require additional cognitive load for new users when learning this feature. However, the behavior of Generators without arguments is unchanged, and therefore this change does not impose this cost upfront, making it possible to introduce the more complex behavior in progressively more complex examples. From 3d3688c56ce1d4f9c1625a48b01130130bda220c Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Hornick=C3=BD?= Date: Fri, 11 Oct 2019 15:47:15 +0200 Subject: [PATCH 07/15] Grammar mistakes --- text/0000-unified_coroutines.md | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md index 4b36c0b5cd0..cc3a91cca67 100644 --- a/text/0000-unified_coroutines.md +++ b/text/0000-unified_coroutines.md @@ -30,21 +30,25 @@ Current generators take the form of a closure, which contains at least a single ```rust let name = "World"; +let done = "Done"; let gen = || { yield "Hello"; yield name; + return done; }; ``` And are used by calling the `resume` method on the generator. ```rust println!("{}", gen.resume()); println!("{}", gen.resume()); +println!("{}", gen.resume()); ``` Which results in ```rust Yielded("Hello") Yielded("World") +Finished("Done") ``` This RFC proposes the ability of a generator to take arguments with a syntax used by closures. @@ -52,6 +56,7 @@ This RFC proposes the ability of a generator to take arguments with a syntax use let gen = |name: &'static str| { yield "Hello"; yield name; + return name; } ``` @@ -59,14 +64,18 @@ Then, we propose a way to pass the arguments to the generator in the form of a t ```rust println!("{}", gen.resume(("Not used"))); println!("{}", gen.resume(("World"))); +println!("{}", gen.resume(("Done"))); ``` Which would also result in: ```rust Yielded("Hello") Yielded("World") +Finished("Done") ``` Notice that the argument to first resume call was unused, and the generator yielded only the value which was passed to the second resume. This behavior is radically different from the first example, in which the name variable from outer scope was captured by generator and yielded with the second `yield` statment; +The value of `a` in previous example is `"Hello"` between its start, and first `yield`, and `"World"` between first and second `yield`. And assumes the value of `"Done"` between second yield and a the `return` statement + The behavior, in which a generator consumes a different value upon each resume is currently not possible without introducing some kind of side channel, like storing the expected value in thread local storage, which is what the current implementation of async-await does. There are possible other syntaxes to denote the fact that the value assigned to `name` is different after each `yield`, but we believe that the simplest syntax, which is used in the example above, is in this case the best. @@ -128,7 +137,7 @@ let gen = |a| { } let first = gen.resume(("0")); let sec = gen.resume(("1")); -let third gen.resume(("2")); +let third = gen.resume(("2")); ``` @@ -165,7 +174,7 @@ let sec = gen.resume(("1")); let third gen.resume(("2")); ``` In the loop example, the lifetime of `a` is different upon each resuming of the generator, and in the case of generator resuming from the second yield point, the lifetime starts at the end of the generator, and ends at the beginning, which is not expected. -However, if we take into consideration the form generators take when they are transformed into MIR, in this representation the lifetimes the arguments are no different than they would be in a manual `match` based implementation [Seee addendum](addendum-samples) +However, if we take into consideration the form generators take when they are transformed into MIR, in this representation the lifetimes the arguments are no different than they would be in a manual `match` based implementation [See addendum](addendum-samples) ### Standard library changes @@ -305,7 +314,7 @@ and utlize generators as a trait alias for a `FnPin { - fn resume(self: Pun<&mut Self>) -> GeneratorState { + fn resume(self: Pin<&mut Self>) -> GeneratorState { self.resume_with_args(()) } } From 26ce466ba7527d73ae2b5e416e4ac29562306048 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Hornick=C3=BD?= Date: Fri, 11 Oct 2019 15:59:10 +0200 Subject: [PATCH 08/15] Update 0000-unified_coroutines.md --- text/0000-unified_coroutines.md | 17 ++++++++++++++--- 1 file changed, 14 insertions(+), 3 deletions(-) diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md index cc3a91cca67..e3509c6bf36 100644 --- a/text/0000-unified_coroutines.md +++ b/text/0000-unified_coroutines.md @@ -54,7 +54,7 @@ Finished("Done") This RFC proposes the ability of a generator to take arguments with a syntax used by closures. ```rust let gen = |name: &'static str| { - yield "Hello"; + yield "Hello"; yield name; return name; } @@ -72,9 +72,20 @@ Yielded("Hello") Yielded("World") Finished("Done") ``` -Notice that the argument to first resume call was unused, and the generator yielded only the value which was passed to the second resume. This behavior is radically different from the first example, in which the name variable from outer scope was captured by generator and yielded with the second `yield` statment; +Or expanded with values in between: +```rust +let gen = |name: &'static str| { + // name = "Not used" + yield "Hello"; + // name = "World + yield name; + // name = "Done" + return name; +} +``` +Notice that in this example the argument to first resume call was unused, and the generator yielded only the value which was passed to the second resume. This behavior is radically different from the first example, in which the name variable from outer scope was captured by generator and yielded with the second `yield` statment; -The value of `a` in previous example is `"Hello"` between its start, and first `yield`, and `"World"` between first and second `yield`. And assumes the value of `"Done"` between second yield and a the `return` statement +The value of `name` in previous example is `"Hello"` between its start, and first `yield`, and `"World"` between first and second `yield`. And assumes the value of `"Done"` between second yield and a the `return` statement The behavior, in which a generator consumes a different value upon each resume is currently not possible without introducing some kind of side channel, like storing the expected value in thread local storage, which is what the current implementation of async-await does. From 03f034c19384cafb8b714e59dca9c344b67ac4bd Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Hornick=C3=BD?= Date: Fri, 11 Oct 2019 17:31:51 +0200 Subject: [PATCH 09/15] Update 0000-unified_coroutines.md --- text/0000-unified_coroutines.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md index e3509c6bf36..0d6f79a97b4 100644 --- a/text/0000-unified_coroutines.md +++ b/text/0000-unified_coroutines.md @@ -343,9 +343,8 @@ Rationale: - Allows implementations of new patterns, such as complex state machines, which are an extremely useful tool in several areas, eg. network protocol implementations. Alternatives: -- Only current alternative is storing required information inside side channels such as `Rc>` -or a thread local storage, which introduces runtime overhead and requires `std`. Proposed design is `no_std` compatible. - The proposed syntax could be changed, but from the explored options, we pose that the simplest syntax is best, even though it introduces new semantics. +- Explore async generators to replace `coroutine`-like generator use cases and find another way to solve the `task::Context` argument problem faced by async-await. - Implement radically new syntax just for generators - Leave the generator as is, leaving it disconnected from the rest of the language. - Remove generators completely From 3f693bd4be438f921850f11bb9e8b24870f2c2eb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Hornick=C3=BD?= Date: Fri, 11 Oct 2019 22:17:13 +0200 Subject: [PATCH 10/15] Update 0000-unified_coroutines.md --- text/0000-unified_coroutines.md | 186 ++++++++++++++------------------ 1 file changed, 80 insertions(+), 106 deletions(-) diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md index 0d6f79a97b4..7ded34a626c 100644 --- a/text/0000-unified_coroutines.md +++ b/text/0000-unified_coroutines.md @@ -11,7 +11,7 @@ Unify the `Generator` traits with the `Fn*` family of traits. Add a way to pass # Motivation [motivation]: #motivation -The generators/coroutines are extremely powerful concept, but its implementation in Rust is severely limited, and its usage requires workarounds to achieve useful patterns. The current view of generators is also extremely disconnected from similar concepts, which already exist in the language and the standard library; +The generators/coroutines are extremely powerful concept, but their implementation in Rust is severely limited, and usage requires workarounds in order to achieve useful patterns. The current view of generators is also extremely disconnected from similar concepts, which already exist in the language and the standard library. # Guide-level explanation @@ -22,7 +22,7 @@ an interface, which is used to transfer data. Upon entering, caller can pass dat Many programming languages(including Rust) also adopted more general concept, callled coroutine. The coroutine , or `Generator` in Rust terms differs from function/subroutine in a single way. It allows the coroutine to `suspend` itself, by storing its state, and `yield`ing control back to the caller, along with data. The caller can then repeatedly pass more data back into the coroutine, and `resume` it. This back-and forth communication is an extremely useful tool for solving a wide class of problems. -Except this is not the truth for Rust's coroutines. The `Generator`, as it was introduced in order to provide a tool for implementing `async-await`, does not provide this functionality. In order to implement async-await feature, the generators were implemented in their most basic form. And in this form, Generators can't accept arguments, and are connected to closures only in their syntax, and not their representation in the type system. +Except this is not the truth for Rust's coroutines. The `Generator`, as it was introduced in order to provide a tool for implementing `async-await`, does not provide this functionality. In order to implement async-await feature, the generators were implemented in their most basic form. And in this form, Generators can't accept arguments, and require invocation of a method, in order to These issues severely lessen the usability of the generator feature, and are not difficult to solve. @@ -76,11 +76,11 @@ Or expanded with values in between: ```rust let gen = |name: &'static str| { // name = "Not used" - yield "Hello"; + yield "Hello"; // name is also dropped here // name = "World - yield name; + yield name; // name not dropped, since it is returned from yield // name = "Done" - return name; + return name; // name not dropped } ``` Notice that in this example the argument to first resume call was unused, and the generator yielded only the value which was passed to the second resume. This behavior is radically different from the first example, in which the name variable from outer scope was captured by generator and yielded with the second `yield` statment; @@ -89,104 +89,20 @@ The value of `name` in previous example is `"Hello"` between its start, and firs The behavior, in which a generator consumes a different value upon each resume is currently not possible without introducing some kind of side channel, like storing the expected value in thread local storage, which is what the current implementation of async-await does. -There are possible other syntaxes to denote the fact that the value assigned to `name` is different after each `yield`, but we believe that the simplest syntax, which is used in the example above, is in this case the best. +The design we propose, in which the generator arguments are mentioned only at the start of the generator most closely resembles what is hapenning. By default, every time an argument is passed into the generator, it is then dropped before the next yield. +And could be stored inside the generator if the user wants to by assigning the value into a new bindinging which is used accros yield points. -### Alternative syntaxes -There are several other possible syntaxes, but each one has its problems. Several are outlined here -1. Assigning into the name to denote that the value is changed -```rust -let gen = |name :&'static str| { - name = yield "hello"; - name = yield name; -} -``` -We are unable to denote assignment to multiple values at the same time, and would therefore have to revert to using a tuple and possibly some kind of destructuring assignment. The problem is the non-coherence between receiving arguments for the first time, -and upon multiple resumes. This however is only a syntactic inconvenience, and as such we think that this approach is a very good possible choice. +##### Drawbacks -If we could perform tuple destructuring when assigning: -```rust -let gen = |name :&'static str, val : i32| { - name, val = yield "hello"; - or - (name, val, ) = yield name; -} -``` -Or if we could 'pack' the arguments into tuple: -```rust -let gen = |..args| { - args = yield "hello"; - args = yield args.name; -} -``` -This syntactic choice would probably be the better one. Making the change of the `name` explicit. However, we do not want to introduce a behavior, which would further separate generators from closures. - -2. Creating a new binding upon each yield -```rust -let gen = |name :&'static str| { - - let (name, ) = yield "hello"; - let (name, ) = yield name; -} -``` -We are creating a new binding upon each yield point, and therefore are shadowing earlier bindings. This would mean that by default, the generator stores all of the arguments passed into it through the resumes. Another issue with these approaches is, that they require progmmer, to write additional code to get the default behavior. In other words: What happens when user does not perform the required assignment ? Simply said, this code is permitted, but nonsensical: -```rust -let gen = |name: &static str| { - yield "hello"; - let (name, ) = yield name; -} -``` -Another issue is, how does this work with loops ? What is the value assigned to `third` in following example ? -```rust -let gen = |a| { - loop { - println!("a : {:?}", a); - let (a,) = yield 1; - println!("b : {:?}", a); - let (a,) = yield 2; - } -} -let first = gen.resume(("0")); -let sec = gen.resume(("1")); -let third = gen.resume(("2")); -``` - - -3. Introducing a 'parametrized' yield; -```rust -let gen = | name: &'static str| { - yield(name) "hello"; - yield(name) name; -} -``` -Introduces a new concept of a parametrized statement, which is not used anywhere else in the language, and makes the default behavior store the passed argument inside the generator, making the easiest choice the wrong one on many cases. - - -The design we propose, in which the generator arguments are mentioned only at the start of the generator most closely resembles what is hapenning. And the user can't make a mistake by not assigning to the argument bindings from the yield statement. Only drawback of this approach is, the 'magic'. Since the value of the `name` is magically changed after each `yield`. But we pose that this is very similar to a closure being 'magically' transformed into a generator if it contains a `yield` statement. +Drawback of this approach is, the 'magic'. Since the value of the `name` is magically changed after each `yield`. But we pose that this is very similar to a closure being 'magically' transformed into a generator if it contains a `yield` statement. ![magic](https://media2.giphy.com/media/12NUbkX6p4xOO4/giphy.gif) -But, like shia himself, this point is controversial, and is the main issue that prevented us from adding generator arguments to the language in the first place. +But, like shia himself, this point is controversial, and is the main issue that prevented us from adding generator arguments to the language in the first place. There are possible other syntaxes to denote the fact that the value assigned to `name` is different after each `yield`, but we believe that the simplest syntax, which is used in the example above, is in this case the best. Additional examples are described [later](alternative-syntaxes) The introduction of this implicit behavior will require additional cognitive load for new users when learning this feature. However, the behavior of Generators without arguments is unchanged, and therefore this change does not impose this cost upfront, making it possible to introduce the more complex behavior in progressively more complex examples. -Another issue posed by our approach is lifetimes of the generator arguments. -```rust -let gen = |a| { - loop { - println!("a : {:?}", a); - yield 1; - println!("b : {:?}", a); - yield 2; - } -} -let first = gen.resume(("0")); -let sec = gen.resume(("1")); -let third gen.resume(("2")); -``` -In the loop example, the lifetime of `a` is different upon each resuming of the generator, and in the case of generator resuming from the second yield point, the lifetime starts at the end of the generator, and ends at the beginning, which is not expected. -However, if we take into consideration the form generators take when they are transformed into MIR, in this representation the lifetimes the arguments are no different than they would be in a manual `match` based implementation [See addendum](addendum-samples) - ### Standard library changes This change would result in following generator trait. @@ -202,6 +118,7 @@ pub trait Generator { While the RFC does not deal with the lifetimes of the arguments, the similarity of the modified `Generator` trait with the existing `Fn*` traits suggests that rules which currently apply to closures will also apply to generators. [More info later](theoretical-basis). ### Use cases: + 1. Futures generated by async-await. The current implementation of async futures requires the use of thread-local storage in order to pass the `task::Context` argument into underlying futures. This imposes small, but not zero overhead, which would be removed by this RFC. 2. Protocol state machines - When a user wants to implement a state machine in order to correctly represent a network protocol, @@ -274,7 +191,7 @@ The proposed changes to generator trait are pretty straightforward and do not ch The implementation of MIR generation will be more complex, and the author of this RFC is unable to properly gauge the amount of work that will be required. -### Theoretical basis +## Theoretical basis [theoretical-basis]: #theoretical-basis The goal of this RFC is to unify the Rust's implementation of Generators, and the theoretical concept of 'Coroutine' as a generalization of the 'Subroutine/Function', and with this unification also comes the unification of rusts Generators and Functions for free. @@ -284,10 +201,10 @@ Example of current `Fn*` traits: ```rust pub trait FnOnce { type Output; - fn call_once(self, args: Args) -> Self::Output; + extern "rust-call" fn call_once(self, args: Args) -> Self::Output; } pub trait FnMut : FnOnce { - fn call_mut(&mut self, args: Args) -> Self::Output; + extern "rust-call" fn call_mut(&mut self, args: Args) -> Self::Output; } ``` And a proposed `Generator` trait: @@ -317,20 +234,16 @@ pub trait FnPin : FnOnce { And either utilize the `FnPin` as a `FnGen/Generator` supertrait, or disregard the `FnGen/Generator` trait completely and utlize generators as a trait alias for a `FnPin>` +But, contrary to this point, we might not want to conflate the `Generator` trait with the `Fn*` trait hierarchy, +because of future compatilibity with possible formalizations of newly added rust features. +See work on effect systems by (Russel Johnston)[https://gist.github.com/rpjohnst/a68de4c52d9b0b0f6ddf54ca293cceee] + # Drawbacks [drawbacks]: #drawbacks 1. Increased complexity of implementation of the Generator feature. -2. If we only implement necessary parts of this RFC, users will need to pass empty tuple into the `resume` function for most common case, which could be solved by introducing a trivial trait -```rust -trait NoArgGenerator : Generator<()> { - fn resume(self: Pin<&mut Self>) -> GeneratorState { - self.resume_with_args(()) - } -} -``` -If we introduce this trait, and rename the original `Generator::resume` method to `Generator::resume_with_args`, the existing behavior will not change. But we think this approach is **WRONG**, and the Generator trait should stay with the changes proposed above (`resume`/`call_resume` accepting a tuple). The rationale for this decision is provided in [future-possibilities] section. +2. If we only implement necessary parts of this RFC, users will need to pass empty tuple into the `resume` function for most common case. This could be then solved by introducing a `TODO: Desugared like function calls of closures` The rationale for this decision is provided in [future-possibilities] section. 3. Need to teach the special interaction between generator arguments and the yield statement. @@ -349,6 +262,67 @@ Alternatives: - Leave the generator as is, leaving it disconnected from the rest of the language. - Remove generators completely + +### Alternative syntaxes +[alternative-syntaaxex]: #alternative-syntaxes +There are several other possible syntaxes, to denote that the value of generator arguments is different after each yield. Several are outlined here: + +1. Assigning into the name to denote that the value is changed +```rust +let gen = |name: &'static str| { + (name, ) = yield "hello"; + args = yield name; + name = args.0 +} +``` +We are unable to denote assignment to multiple values at the same time, and would therefore have to revert to using a tuple and possibly some kind of destructuring assignment. The problem is the non-coherence between receiving arguments for the first time, +and upon multiple resumes. This however is only a syntactic inconvenience, and as such we think that this approach is a very good possible choice. + +If we could perform tuple destructuring when assigning: +```rust +let gen = |name: &'static str, val: i32| { + name, val = yield "hello"; + or + (name, val, ) = yield name; +} +``` +Or if we could 'pack' the arguments into tuple: +```rust +let gen = |..args| { + args = yield "hello"; + args = yield args.name; +} +``` +This syntactic choice would probably be the better one. Making the change of the `name` explicit. However, we do not want to introduce a behavior, which would further separate generators from closures. + +3. Introducing a 'parametrized' yield; +```rust +let gen = | name: &'static str| { + yield(name) "hello"; + yield(name) name; +} +``` +Introduces a new concept of a parametrized statement, which is not used anywhere else in the language, and makes the default behavior store the passed argument inside the generator, making the easiest choice the wrong one on many cases. + + +Another issue posed by our approach is lifetimes of the generator arguments. +```rust +let gen = |a| { + loop { + println!("a : {:?}", a); + yield 1; + println!("b : {:?}", a); + yield 2; + } +} +let first = gen.resume(("0")); +let sec = gen.resume(("1")); +let third gen.resume(("2")); +``` +In the loop example, the lifetime of `a` is different upon each resuming of the generator, and in the case of generator resuming from the second yield point, the lifetime starts at the end of the generator, and ends at the beginning, which is not expected. +However, if we take into consideration the form generators take when they are transformed into MIR, in this representation the lifetimes of the arguments are no different than they would be in a manual `match` based implementation [See addendum](addendum-samples) + + # Prior art [prior-art]: #prior-art From 02d21ba5faeca9a5caf711d741b72a5586d0bc96 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Hornick=C3=BD?= Date: Fri, 11 Oct 2019 22:22:19 +0200 Subject: [PATCH 11/15] Update 0000-unified_coroutines.md --- text/0000-unified_coroutines.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md index 7ded34a626c..59e4954748b 100644 --- a/text/0000-unified_coroutines.md +++ b/text/0000-unified_coroutines.md @@ -99,7 +99,7 @@ Drawback of this approach is, the 'magic'. Since the value of the `name` is magi ![magic](https://media2.giphy.com/media/12NUbkX6p4xOO4/giphy.gif) -But, like shia himself, this point is controversial, and is the main issue that prevented us from adding generator arguments to the language in the first place. There are possible other syntaxes to denote the fact that the value assigned to `name` is different after each `yield`, but we believe that the simplest syntax, which is used in the example above, is in this case the best. Additional examples are described [later](alternative-syntaxes) +But, like Shia himself, this point is controversial, and is the main issue that prevented us from adding generator arguments to the language in the first place. There are possible other syntaxes to denote the fact that the value assigned to `name` is different after each `yield`, but we believe that the simplest syntax, which is used in the example above, is in this case the best. Additional examples are described [later](alternative-syntaxes) The introduction of this implicit behavior will require additional cognitive load for new users when learning this feature. However, the behavior of Generators without arguments is unchanged, and therefore this change does not impose this cost upfront, making it possible to introduce the more complex behavior in progressively more complex examples. @@ -236,7 +236,7 @@ and utlize generators as a trait alias for a `FnPin Date: Fri, 11 Oct 2019 22:23:41 +0200 Subject: [PATCH 12/15] Update 0000-unified_coroutines.md --- text/0000-unified_coroutines.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md index 59e4954748b..8fc7b0215aa 100644 --- a/text/0000-unified_coroutines.md +++ b/text/0000-unified_coroutines.md @@ -243,7 +243,7 @@ See work on effect systems by [Russel Johnston](https://gist.github.com/rpjohnst 1. Increased complexity of implementation of the Generator feature. -2. If we only implement necessary parts of this RFC, users will need to pass empty tuple into the `resume` function for most common case. This could be then solved by introducing a `TODO: Desugared like function calls of closures` The rationale for this decision is provided in [future-possibilities] section. +2. If we only implement necessary parts of this RFC, users will need to pass empty tuple into the `resume` function for most common case. This could be then solved by introducing a similar desugating mechanism as is used for closures today. More info in [future-possibilities] section. 3. Need to teach the special interaction between generator arguments and the yield statement. From 204cadbb81ffa6f96af815b7f7697e924b8cca02 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Hornick=C3=BD?= Date: Sat, 12 Oct 2019 11:25:02 +0200 Subject: [PATCH 13/15] Update 0000-unified_coroutines.md --- text/0000-unified_coroutines.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md index 8fc7b0215aa..58f49d2f168 100644 --- a/text/0000-unified_coroutines.md +++ b/text/0000-unified_coroutines.md @@ -20,9 +20,9 @@ The generators/coroutines are extremely powerful concept, but their implementati Consider a function. In the olden days, also called a subroutine. This concept is at the core of every programming language, since it is extremely useful. The subroutine has a single entrypoint and a single exit point. Both of these points provide an interface, which is used to transfer data. Upon entering, caller can pass data into the subroutine and upon exiting, the subroutine can return data back to the caller. -Many programming languages(including Rust) also adopted more general concept, callled coroutine. The coroutine , or `Generator` in Rust terms differs from function/subroutine in a single way. It allows the coroutine to `suspend` itself, by storing its state, and `yield`ing control back to the caller, along with data. The caller can then repeatedly pass more data back into the coroutine, and `resume` it. This back-and forth communication is an extremely useful tool for solving a wide class of problems. +Many programming languages(including Rust) also adopted more general concept, callled coroutine. The coroutine , or `Generator` in Rust terms differs from function/subroutine in a single way. It allows the coroutine to `suspend` itself, by storing its state, and `yield`ing control back to the caller, along with data. The caller can then repeatedly pass more data back into the coroutine, and `resume` it. This back-and-forth communication is an extremely useful tool for solving a wide class of problems. -Except this is not the truth for Rust's coroutines. The `Generator`, as it was introduced in order to provide a tool for implementing `async-await`, does not provide this functionality. In order to implement async-await feature, the generators were implemented in their most basic form. And in this form, Generators can't accept arguments, and require invocation of a method, in order to +Except this is not the truth for Rust's coroutines. The `Generator`, as it was introduced in order to provide a tool for implementing `async-await`, does not provide this functionality. In order to implement async-await feature, the generators were implemented in their most basic form. And in this form, Generators can't accept arguments. These issues severely lessen the usability of the generator feature, and are not difficult to solve. @@ -39,9 +39,9 @@ let gen = || { ``` And are used by calling the `resume` method on the generator. ```rust -println!("{}", gen.resume()); -println!("{}", gen.resume()); -println!("{}", gen.resume()); +println!("{:?}", gen.resume()); +println!("{:?}", gen.resume()); +println!("{:?}", gen.resume()); ``` Which results in @@ -62,9 +62,9 @@ let gen = |name: &'static str| { Then, we propose a way to pass the arguments to the generator in the form of a tuple. ```rust -println!("{}", gen.resume(("Not used"))); -println!("{}", gen.resume(("World"))); -println!("{}", gen.resume(("Done"))); +println!("{:?}", gen.resume(("Not used"))); +println!("{:?}", gen.resume(("World"))); +println!("{:?}", gen.resume(("Done"))); ``` Which would also result in: ```rust @@ -216,7 +216,7 @@ pub trait Generator { fn resume(self: Pin<&mut Self>, args: Args) -> GeneratorState; } ``` -Considering the closeness of these 2 concepts, it might be beneficial to utilize a different form of the `Generator` trait. +Considering the similarity of these 2 traits, a following trait might be hierarchy might be desirable: ```rust pub trait FnGen : FnOnce> { type Yield; From d866737668fb05085bcc630fa78ad7ec70213085 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Hornick=C3=BD?= Date: Sat, 12 Oct 2019 17:55:14 +0200 Subject: [PATCH 14/15] Update 0000-unified_coroutines.md --- text/0000-unified_coroutines.md | 20 +++++++++----------- 1 file changed, 9 insertions(+), 11 deletions(-) diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md index 58f49d2f168..38f641c6b10 100644 --- a/text/0000-unified_coroutines.md +++ b/text/0000-unified_coroutines.md @@ -83,15 +83,14 @@ let gen = |name: &'static str| { return name; // name not dropped } ``` -Notice that in this example the argument to first resume call was unused, and the generator yielded only the value which was passed to the second resume. This behavior is radically different from the first example, in which the name variable from outer scope was captured by generator and yielded with the second `yield` statment; +Notice that in this example the argument to first resume call was unused, but was still available from the start of the generator until the first `yield` point. The generator has then yielded the value which was passed to the second resume. This behavior is radically different from the first example, in which the name variable from outer scope was captured by generator and yielded with the second `yield` statment, and as such is not representable in the current form of generators. The value of `name` in previous example is `"Hello"` between its start, and first `yield`, and `"World"` between first and second `yield`. And assumes the value of `"Done"` between second yield and a the `return` statement The behavior, in which a generator consumes a different value upon each resume is currently not possible without introducing some kind of side channel, like storing the expected value in thread local storage, which is what the current implementation of async-await does. The design we propose, in which the generator arguments are mentioned only at the start of the generator most closely resembles what is hapenning. By default, every time an argument is passed into the generator, it is then dropped before the next yield. -And could be stored inside the generator if the user wants to by assigning the value into a new bindinging which is used accros yield points. - +And could be stored inside the generator if the user wants. To store a resume argument inside the generator, all the user has to do, is to assign it to a binding, which is used acroos `yield` points. ##### Drawbacks @@ -99,7 +98,7 @@ Drawback of this approach is, the 'magic'. Since the value of the `name` is magi ![magic](https://media2.giphy.com/media/12NUbkX6p4xOO4/giphy.gif) -But, like Shia himself, this point is controversial, and is the main issue that prevented us from adding generator arguments to the language in the first place. There are possible other syntaxes to denote the fact that the value assigned to `name` is different after each `yield`, but we believe that the simplest syntax, which is used in the example above, is in this case the best. Additional examples are described [later](alternative-syntaxes) +But, like Shia himself, this point is controversial, and is the main issue that prevented us from adding generators with arguments to the language in the first place. There are possible other syntaxes to denote the fact that the value assigned to `name` is different after each `yield`, but we believe that the simplest syntax, which is used in the example above, is in this case the best. Additional examples are described [later](alternative-syntaxes) The introduction of this implicit behavior will require additional cognitive load for new users when learning this feature. However, the behavior of Generators without arguments is unchanged, and therefore this change does not impose this cost upfront, making it possible to introduce the more complex behavior in progressively more complex examples. @@ -243,7 +242,7 @@ See work on effect systems by [Russel Johnston](https://gist.github.com/rpjohnst 1. Increased complexity of implementation of the Generator feature. -2. If we only implement necessary parts of this RFC, users will need to pass empty tuple into the `resume` function for most common case. This could be then solved by introducing a similar desugating mechanism as is used for closures today. More info in [future-possibilities] section. +2. If we only implement necessary parts of this RFC, users will need to pass empty tuple into the `resume` function for most common case. This could be then solved by introducing a similar desugating mechanism as is used for calling closures today. More info in [future-possibilities] section. 3. Need to teach the special interaction between generator arguments and the yield statement. @@ -295,7 +294,7 @@ let gen = |..args| { ``` This syntactic choice would probably be the better one. Making the change of the `name` explicit. However, we do not want to introduce a behavior, which would further separate generators from closures. -3. Introducing a 'parametrized' yield; +2. Introducing a 'parametrized' yield; ```rust let gen = | name: &'static str| { yield(name) "hello"; @@ -304,7 +303,6 @@ let gen = | name: &'static str| { ``` Introduces a new concept of a parametrized statement, which is not used anywhere else in the language, and makes the default behavior store the passed argument inside the generator, making the easiest choice the wrong one on many cases. - Another issue posed by our approach is lifetimes of the generator arguments. ```rust let gen = |a| { @@ -352,11 +350,11 @@ explanation of the different behavior in combination with the `yield` keyword ex # Future possibilities [future-possibilities]: #future-possibilities -One of the areas of improvement, is interaction with generators. Currently the generator takes the form of a closure, but the resume method is called like a trait method. One of the future improvements would be making generators callable like closures, through the `extern "rust-call"` interface, since this RFC recomments their unification with closures. +One of the areas of improvement, is interaction with generators. Currently the generator takes the form of a closure, but the resume method is called like a trait method. One of the future improvements would be making generators callable like closures, through the `extern "rust-call"` interface, since this RFC recommends their unification with closures. So, current syntax with 2 arguments looks like: ```rust -let mut gen = Pin::box(|name| { +let mut gen = Pin::box(|name, arg2| { yield ("Hello", name); yield ("Bye", name); }); @@ -365,7 +363,7 @@ gen.resume(("world", "!")); ``` Would become ```rust -let mut gen = Pin::box(|name| { +let mut gen = Pin::box(|name, arg2| { yield ("Hello", name); yield ("Bye", name); }); @@ -375,7 +373,7 @@ let second = gen("world", "!"); And this is would fully unify the interface provided by generators with the one provided by closures, but is intertwined with other issues, like [Fn traits](https://github.com/rust-lang/rust/issues/29625) and thus would block the current RFC. Therefore we propose generator syntax accepting the multiple arguments: ```rust -let gen = |a,b,c| { +let gen = |a, b, c| { yield a; yield b; yield c; From b8d4c9aab56fd2bee0338d0225c4c03e3aecdfdc Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Michal=20Hornick=C3=BD?= Date: Sat, 12 Oct 2019 18:17:53 +0200 Subject: [PATCH 15/15] Update 0000-unified_coroutines.md --- text/0000-unified_coroutines.md | 31 +++++++++++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/text/0000-unified_coroutines.md b/text/0000-unified_coroutines.md index 38f641c6b10..616037f9fda 100644 --- a/text/0000-unified_coroutines.md +++ b/text/0000-unified_coroutines.md @@ -321,6 +321,35 @@ In the loop example, the lifetime of `a` is different upon each resuming of the However, if we take into consideration the form generators take when they are transformed into MIR, in this representation the lifetimes of the arguments are no different than they would be in a manual `match` based implementation [See addendum](addendum-samples) +### Alternative designs +[alternative-designs]: #alternative-designs +[Source](https://internals.rust-lang.org/t/crazy-idea-coroutine-closures/1576) +This design presumes that `yield` will be an expression that resolves into a list of arguments that which were passed into `resume`. + +In order to solve the disconnect between receving an argument list at the start of the generator, and a tuple when the generator, this proposal from 2015 advocates for 2 different methods on the `Generator` trait, `start` and `resume`. In this chapter we will present a modified view. + +Make following syntax return a closure,which returns a generator: +```rust +let makegen : impl FnOnce() -> impl Generator<(String,String)> = || { + println!("running gen"); + let (name, value) = yield 0; + let (name, value) = yield 1; + let (name, value) = yield 2; +}; + +or + +let makegen : impl FnOnce() -> impl Generator<(String,String)> = || gen/coro { + println!("running gen"); + let (name, value) = yield 0; + let (name, value) = yield 1; + let (name, value) = yield 2; +}; +``` +This aproach wold fix the syntactic disconnect between receiving arguments for the first `resume` of the generator and the next ones, but the issue of arguments to previous yields being implcitly available and therefore potentially storable is still there. + + + # Prior art [prior-art]: #prior-art @@ -334,6 +363,8 @@ However, if we take into consideration the form generators take when they are tr These are interesting, since they both adopt a syntax, in which the yield expression returns values passed to resume. We think that this approach is the right one for dynamic languages like Python or lua but the wrong one for Rust. The reason is, these languages are dynamically typed, and allow passing of multiple values into the coroutine. The design proposed here is static, and allows passing only a single argument into the coroutine, a tuple. The argument tuple is treated the same way as in the `Fn*` family of traits. +- Alternative design of a `FnOnce` closure which returns a generator, and therefore fixes th syntactic disconnect between receiving arguments between the `start` and `resume` of the generator [Link](https://internals.rust-lang.org/t/crazy-idea-coroutine-closures/1576) + # Unresolved questions [unresolved-questions]: #unresolved-questions