-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[RFC] Static Function Argument Unpacking #3723
base: master
Are you sure you want to change the base?
Conversation
|
||
### Type Coercions of Collections | ||
|
||
If the collection being unpacked is a reference for the collection type, whether argument unpacking works, depends on if accessing it directly with the field access expression (`.idx`, or `[idx]`) would work at compile time. If it does, then argument unpacking works. (For the reference, see [`std::ops::Deref`](https://doc.rust-lang.org/std/ops/trait.Deref.html) and [type coercions](https://doc.rust-lang.org/reference/type-coercions.html).) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Earlier it is stated that "Tuples, tuple structs, and fixed-size arrays can be unpacked.", but this seems to indicate that anything implementing Index<Int>
could be unpacked. I think tuples definitely make sense and fixed-length [T; N]
arrays could make sense, but anything involving Index
should be out of scope for now because it turns the complexity up quite a bit.
It is pretty easy to turn slicelike collections into fixed-length arrays anyway, the user can handle that if they need this support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a similar note, it is probably worth mentioning type inference. For example:
fn foo(a: u8, b: u8, c: u8, d: u8, e: u8) { /* ... */ }
fn bar(buf: &[u8]) {
foo(...buf.try_into().unwrap());
}
Is that expected to work because &[T; 5]
has TryFrom<&[T]>
and u8
is Copy
? What about
fn bar(buf: &[u8], other: &[u8]) {
foo(1, ...buf.try_into().unwrap(), ...other.try_into().unwrap());
}
That probably needs to be forbidden as ambiguous, but that is something that should be spelled out here with some T-Types input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch – I seem to have written this part about accessing the fields in an ambiguous and self-contradictory way. I'll need to make some changes and also study the problem a bit more.
On the whole, I'm aiming at a design that's infallible during compilation. But I think I'll need to look into if that claim can actually be made after all, i.e. if some seemingly infallible cases wouldn't be totally infallible, but only just as infallible as they currently are. (Leading into a runtime panic upon failure?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a similar note, it is probably worth mentioning type inference. For example:
fn foo(a: u8, b: u8, c: u8, d: u8, e: u8) { /* ... */ } fn bar(buf: &[u8]) { foo(...buf.try_into().unwrap()); }Is that expected to work because
&[T; 5]
hasTryFrom<&[T]>
andu8
isCopy
? What aboutfn bar(buf: &[u8], other: &[u8]) { foo(1, ...buf.try_into().unwrap(), ...other.try_into().unwrap()); }That probably needs to be forbidden as ambiguous, but that is something that should be spelled out here with some T-Types input.
I'll definitely need to add a subchapter on type inference next to the referred "Type Coercions of Collections" subchapter. 👍🏼
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Earlier it is stated that "Tuples, tuple structs, and fixed-size arrays can be unpacked.", but this seems to indicate that anything implementing
Index<Int>
could be unpacked. I think tuples definitely make sense and fixed-length[T; N]
arrays could make sense, but anything involvingIndex
should be out of scope for now because it turns the complexity up quite a bit.It is pretty easy to turn slicelike collections into fixed-length arrays anyway, the user can handle that if they need this support.
I've edited this subchapter (now named "Type Coercions and Conversions of Collections") in commit fa28f69 to address this. Basically, I think the original writing was erroneous, contradicting with other parts of the text, so thanks for bringing this up. This is also one of the sections (and topics) I'm most unsure about, so please don't hesitate to point out any remaining mistakes. :)
4. All of the items inside the collection are unpacked. | ||
- For example, attempting to unpack a thousand-element array just to pass the first two elements as arguments to a function taking two parameters seems like a mistake that should be explicitly prevented. | ||
- Consequently, there must be at least as many unfilled parameter slots left in the function call as there are items inside the collection being unpacked. If there are *N* items in the collection being unpacked, the immediately next *N* parameter slots in the function call are filled with the collection's items as the arguments. | ||
5. Minimum of one element/field is required in the collection being unpacked. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this RFC is intended to help push variadic generics along, we need to allow unpacking zero-element things, since it is necessary for stuff like defining variadic Fn
:
struct CallLocked<F>(Mutex<F>);
// this should work for F: FnMut() -> u8 and that needs unpacking zero-element things
impl<F: FnMut(...Args) -> R, ...Args, R> Fn(...Args) -> R for CallLocked<F> {
fn call(&self, ...args: ...Args) -> R {
self.0.lock().unwrap()(...args)
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a big fan of not special-casing zero things in the language semantics. That seems especially important for being able to unpack arrays of const-generic length.
(We should probably lint on sketchy things like let x = [...v.clear()];
, like we lint on let x = v.clear();
-- or clippy::let_unit_value
does, at least -- but the language should allow it for all lengths.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noted, these arguments sound reasonable to me. Initially, I leaned towards starting with a strict design, since it's easier to relax rather than add constraints later on, but that's because I couldn't come up with a reason why zero-length collections could ever be unpacked. :)
In light of this, I'll make the following two changes to the RFC:
- Zero-length collection won't be errors anymore. Instead, in these cases, the argument unpacking syntax will just desugar into nothingness – i.e. no arguments are unpacked. This corner case will be explained in the text.
- The error diagnostic for this will be changed into a lint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the RFC based on the above comments in commit 395b6e4.
I think this design is now better and nicely in line with the guiding principles I had set, specifically "Compatibility with other features" and "Avoiding ambiguity with simple rules and by requiring explicit control by the user (developer)". :)
This isn't exactly varidics but it is related. Cc @Jules-Bertholet who I believe has done some design work in that area. (Some recent discussion in that area https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/Variadic.20generics.20experiment) |
|
||
The ellipsis symbol composed from three consecutive ASCII dot characters is used in the "et cetera" or "and so on" sense in many design documents and code examples. Giving it an actual syntactical meaning could lead to some confusion or readability issues. Preferring `…`, i.e. the Unicode character U+2026, Horizontal Ellipsis, in those places could help. | ||
|
||
# Rationale and Alternatives |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I'd like to see discussed is why this is limited to "functions, methods, and closures".
Why not also allow other things like array literals as well?
let a = [1, 2, 3];
let b = [7, 8, 9];
let c = [...a, ...b];
assert_eq!(c, [1, 2, 3, 7, 8, 9]);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or for extra fun,
let c = [...*b"hello world", b'\0'];
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a bit about this under "Future Possibilities" subchapter "Unpacking in Fixed-Size Array and Tuple Literals", but I should make it more general and expand it to cover additional cases as well – such as that char array example you gave.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ultimately, I'd love to see unpacking expanded thus.
The reason I left it out of scope for now is that, presumably, non-trivial additional work would be required to think over the possible interactions with other features and explore the design space thoroughly enough.
My preference would be to postpone expansion of the feature this way into a future RFC, but I'm not strongly opposed to including it in this one. I think postponing would also have the benefit, that – assuming this current RFC gets accepted – the RFC would be more lightweight. Either way, maybe I should elaborate on the reason for omission from the RFC.
What's your view on this – include now or have a separate RFC?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added further elaboration on the subject of unpacking within collection literals in commit 5033917. There's an IRLO thread from the end of 2020 with a pre-RFC on array expansion syntax by nwn that pretty much covers this idea, but would need synchronization on the selected syntax. (I wonder if they'd like to continue on this by themselves or work together to write a new version?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One possibility for a way forward that just occurred to me is landing these connected features (with accepted RFCs) into nightly one by one, but stabilizing them in one batch. This way, users of stable Rust would benefit from the features without any of them feeling unfinished.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think not including tuple, array, and slice constructors in the MVP is a mistake. Users will be very surprised to learn that this syntax works with tuple structs and not normal tuples. Also, making this syntax exclusive to tuple structs will incentivize developers to use them even when a tuple or array would otherwise be better suited, including in public APIs, potentially distorting the entire library ecosystem. (I agree with all the other future-possibility relegations.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree on tuples and arrays. But I think that slices has enough additional complexity it is worth excluding from the MVP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I was only referring to unpacking into slices, not unpacking slices. But now that I think about it more, what that really means is unpacking into arrays that are immediately reborrowed as slices. So yes, tuples and arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Jules-Bertholet, I agree on your rationale for why having the feature of unpacking into arrays (and tuples) sooner rather than later would be preferred. I'll consider some options on how to make this happen. For starters, I've contacted the author of the pre-RFC linked above to ask if they'd like to rebase it on this RFC or continue the work in some other way. Anyway, I think the best way forward will be to defer FCP'ing this MR until text clearly proposing the addition of the feature exists.
A place where I found myself wanting something similar is when passing function pointers instead of closures: // Compiles
[10].into_iter().map(usize::count_ones);
// Does not
std::iter::zip([10], [11]).map(usize::min); The second iterator will error by stating that this |
when we get variadic generics, you could probably just have: pub trait Iterator {
// assumes variadic generics are built on tuples
pub fn splatted_map<F: FnMut(...Self::Item) -> R, R>(self, f: F) -> SplattedMap<Self, F>
where
Self::Item: Tuple,
{
todo!()
}
// ... all existing trait methods
} |
Co-authored-by: Pyry Kontio <[email protected]>
This papercut is definitely related, and I've collected some links related to the problem under the "Prior Art" subchapter "Using Tuples in Place of Argument Lists" (possibly the Zulip thread there was started by you?). Unfortunately, I couldn't come up with a nice design leveraging the syntax proposed here to address this. |
Are you suggesting this as a change to |
yeah, that zulip thread was me! |
|
That would sadly not work as a fix then, since this papercut is not |
I don't see why we can't add |
I'd call this a workaround, not a fix, because it doesn't really address the issue itself. All in all, this is a papercut, and on its own I don't think it is enough justification for adding new dedicated methods, especially when it could feasibly be addressed by the language itself/some other change later on. |
The same ellipsis syntax with a very similar meaning could be adopted to defining fixed-size arrays and tuple literals as well. For example: | ||
```rust | ||
const CDE: [char; 3] = ['C', 'D', 'E']; | ||
const ABCDEFG1: [char; 7] = ['A', 'B', ...CDE, 'F', 'G']; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is added, then I’d expect for the syntax to be supported in vec!
as well (eventually, if not at the same time as in arrays).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True. Using vec![]
is a good example I should add to the subchapter "Unpacking Arguments for Macro Invocations" as a synergy for unpacking in arrays (or the other way around).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've now added a mention of vec![]
use under "Unpacking Arguments for Macro Invocations" in commit e7f696c.
While it is not spelled out, I suppose unpacking Example 1 with variadic function call: use std::ffi::c_char;
unsafe extern "C" {
unsafe fn printf(fmt: *const c_char, ...);
}
fn main() {
let expr = (1, 2, 3);
unsafe {
printf(c"%d %d %d\n".as_ptr(), ...expr);
}
} Example 2 with overloaded unboxed closure: #![feature(fn_traits, unboxed_closures)]
#[derive(Copy, Clone)]
struct F;
impl FnOnce<(u8, u8)> for F {
type Output = ();
extern "rust-call" fn call_once(self, args: (u8, u8)) {
println!("2 args: {args:?}");
}
}
impl FnOnce<(u8, u8, u8)> for F {
type Output = ();
extern "rust-call" fn call_once(self, args: (u8, u8, u8)) {
println!("3 args: {args:?}");
}
}
fn main() {
let f = F;
f(1, 2);
f(3, 4, 5);
let expr = (6, 7);
f(...expr);
} Because I've seen proposal above suggesting // error[E0284]: type annotations needed
printf(c"%d %d %d\n".as_ptr(), ...buf.try_into().unwrap());
// error[E0284]: type annotations needed
f(...buf.try_into().unwrap()); |
Yes, I think so. I hadn't thought of this, but it makes sense. Thanks for the examples – I'll need to do some further thinking w.r.t. calling variadic functions and update the text accordingly! |
yes, I think we should avoid adding more ways that tuples can't be used like tuple structs, since we already have enough pain from that for macro authors (e.g. you can't write |
Thanks for the suggestion. Having more alternatives from Rust-specific prior art helps, and so I've listed this Table 1 in commit 857517a. It's actually a bit similar to Scala's syntax, which I added in commit e06e70a. |
Personally, I see zero technical impediments and am totally in favor of having these as well. There's some discussion about human reasons under this comment thread: #3723 (review) @tmccombs, @programmerjake, what's your view on these? |
I think that it should work in any reasonable future implementation of variadics to be able to create a tuple or array with syntax like |
Thanks for the critical feedback! Anecdotally, I ran into the need for argument unpacking myself in a situation very similar to the one under "Guide-Level Explanation", where functions I was using were in different crates and I was merely passing the returned tuple from one as the conventional arguments in another. This is actually what prompted me to write the RFC. :) I've added a bit of conjecture under the "Motivation" chapter in commit 31eb229, but you're right in that data-driven evidence would be great. Here are a couple of experiments I've had in mind that I could try to run if I have the time:
|
If the expression being unpacked is a reference to one of the allowed collection types (tuple, tuple struct, or fixed-size array), whether argument unpacking compiles depends on if accessing **all** its fields manually with the field access expression (`.idx`, or `[idx]`) would work at compile time. If it does, then argument unpacking works. (For the reference, see [`std::ops::Deref`](https://doc.rust-lang.org/std/ops/trait.Deref.html) and [type coercions](https://doc.rust-lang.org/reference/type-coercions.html).) Attempts to access nonexistent fields of tuples and fixed-size arrays or references thereof, using indices past the types' ranges, lead to compilation errors. With argument unpacking, only the fields known of during program compilation are unpacked. | ||
|
||
For example, the following compiles, since the alternative compiles: | ||
```rust | ||
fn consume(a: u8, b: u8, c: u8) { | ||
println!("{a}, {b}, {c}"); | ||
} | ||
|
||
fn main() { | ||
let tup = &(1, 2, 3); | ||
consume(...tup); | ||
// Alternative: consume(tup.0, tup.1, tup.2); | ||
} | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does ...foo
do if foo
is a tuple struct that also Deref
s to a different tuple struct? And does the answer change based on whether foo
’s fields are visible at the location?
More broadly, while this autoderef behavior may be a good idea, it needs justification. Why should this be in the MVP? Why is consume(...*tup);
inadequate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Initially, I was thinking that while these type coercions wouldn't be exactly trivial, they'd still be relatively straightforward, but you raise some good points that hadn't occurred to me. :)
I'm starting to think pushing these under Future Possibilities would be the best, especially considering consume(...*tup);
would be quite sufficient for an MVP, IMO.
## Syntax Change for Functional Record Updates | ||
|
||
Adopting the `...expr` syntax for argument unpacking means that it is now part of the general "take stuff from here and expand it" family of syntactic sugar. As Rust already uses the `..` syntax for *Functional Record Updates*, changing that to use an ellipsis instead would be congruent. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might also be worthwhile to switch from ..
to ...
in rest patterns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that would be nice since it'd let you write the pattern [a, ...b, c]
instead of having to write [a, b @ .., c]
since just ..b
is a range pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll include a note about that as well. Basically, in my view, having the two dot ..
syntax only for range-related functionality while dedicating the three dot ...
for "all the stuff here" seems consistent.
} | ||
``` | ||
|
||
Explicitly indicating varying degrees of (de)reference status or mutability on arguments being unpacked does not follow from the proposed syntax in any straightforward way. Thus, although it limits the usefulness of the feature, the design for such possibility is left out of the scope of this proposal. Consequently, the code will only compile if passing the arguments one by one with the corresponding field access expressions would compile. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that this should be left out of the MVP. But some discussion in the future possibilities section may be advisable.
Here is a possible syntax (which may have been proposed before):
let foo = (1, 2, 3);
func(&...foo);
// is equivalent to
func(&foo.0, &foo.1, &foo.2);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed – an example such as the one you gave would be good here. I'll add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This example does not convince me. My first thought is that the code is clearer if you define a struct to contain r
, g
, and b
. The unpacking feature makes it more convenient to use the code as written, which is actually a net negative because it removes pressure to refactor based on a struct.
The rusty approach to APIs tends to avoid long sequences of arguments of the same type because it's super hard to read that code and understand what is happening. This encourages long sequences of arguments of the same type.
I agree that encouraging long sequences of arguments is one possible result of this feature, and this concern is actually listed as the first point under the Drawbacks chapter. However, I'd argue that Rusty way is also relying on crates made by others, and refactoring those to exactly suit the function caller's needs is often not worth the effort. That said, I could try to come up with more convincing examples. One problem with those is that we don't yet have variadic functions which combine nicely with argument unpacking as shown upthread and also in the backlinked issue just recently. |
Summary
This RFC adds call-site unpacking of tuples, tuple structs, and fixed-size arrays, using
...expr
within the function call's parentheses as a shorthand for passing arguments. The full contents of these collections with known sizes are unpacked directly as the next arguments of a function call, desugaring into the corresponding element accesses during compilation.Rendered