diff --git a/docs/design/variadics.md b/docs/design/variadics.md new file mode 100644 index 0000000000000..0f9454ec92005 --- /dev/null +++ b/docs/design/variadics.md @@ -0,0 +1,950 @@ +# Variadics + + + + + +## Table of contents + +- [Basics](#basics) + - [Overview](#overview) + - [Packs and each-names](#packs-and-each-names) + - [Pack expansions](#pack-expansions) + - [Pack expansion expressions and statements](#pack-expansion-expressions-and-statements) + - [Pack expansion patterns](#pack-expansion-patterns) + - [Additional examples](#additional-examples) +- [Execution Semantics](#execution-semantics) + - [Expressions and statements](#expressions-and-statements) + - [Pattern matching](#pattern-matching) +- [Typechecking](#typechecking) + - [Tuples, packs, segments, and shapes](#tuples-packs-segments-and-shapes) + - [Iterative typechecking of pack expansions](#iterative-typechecking-of-pack-expansions) + - [Typechecking patterns](#typechecking-patterns) + - [Typechecking pattern matches](#typechecking-pattern-matches) +- [Appendix: Type system formalism](#appendix-type-system-formalism) + - [Explicit deduced arities](#explicit-deduced-arities) + - [Typing and shaping rules](#typing-and-shaping-rules) + - [Reduction rules](#reduction-rules) + - [Equivalence, equality, and convertibility](#equivalence-equality-and-convertibility) + - [Pattern match typechecking algorithm](#pattern-match-typechecking-algorithm) + - [Canonicalization algorithm](#canonicalization-algorithm) +- [Alternatives considered](#alternatives-considered) +- [References](#references) + + + +## Basics + +### Overview + +A "pack expansion" is a syntactic unit beginning with `...`, which is a kind of +compile-time loop over sequences called "packs". Packs are initialized and +referred to using "each-names", which are marked with the `each` keyword at the +point of declaration and the point of use. 
+ +The syntax and behavior of a pack expansion depends on its context, and in some +cases on a keyword following the `...`: + +- In a tuple literal expression (such as a function call argument list), `...` + iteratively evaluates its operand expression, and treats the values as + successive elements of the tuple. +- `...and` and `...or` iteratively evaluate a boolean expression, combining + the values using `and` and `or`. Normal short-circuiting behavior for the + resulting `and` and `or` operators applies at runtime. +- In a statement context, `...` iteratively executes a statement. +- In a tuple literal pattern (such as a function parameter list), `...` + iteratively matches the elements of the scrutinee tuple. In conjunction with + pack bindings, this enables functions to take an arbitrary number of + arguments. + +This example illustrates many of the key concepts: + +```carbon +// Takes an arbitrary number of vectors with arbitrary element types, and +// returns a vector of tuples where the i'th element of the vector is +// a tuple of the i'th elements of the input vectors. +fn Zip[... each ElementType:! type] + (... each vector: Vector(each ElementType)) + -> Vector((... each ElementType)) { + ... var each iter: auto = each vector.Begin(); + var result: Vector((... each ElementType)); + while (...and each iter != each vector.End()) { + result.push_back((... each iter)); + ... each iter++; + } + return result; +} +``` + +### Packs and each-names + +A _pack_ is a sequence of a fixed number of values called "elements", which may +be of different types. Packs are very similar to tuple values in many ways, but +they are not first-class values -- in particular, no run-time expression +evaluates to a pack. The _arity_ of a pack is a compile-time value representing +the number of values in the sequence. + +An _each-name_ consists of the keyword `each` followed by the name of a pack, +and can only occur inside a pack expansion. 
On the Nth iteration of the pack +expansion, an each-name refers to the Nth element of the named pack. As a +result, a binding pattern with an each-name, such as `each ElementType:! type`, +acts as a declaration of all the elements of the named pack, and thereby +implicitly acts as a declaration of the pack itself. + +Note that `each` is part of the name syntax, not an expression operator, so it +binds more tightly than any expression syntax. For example, the loop condition +`...and each iter != each vector.End()` in the implementation of `Zip` is +equivalent to `...and (each iter) != (each vector).End()`. + +### Pack expansions + +A _pack expansion_ is an instance of one of the following syntactic forms: + +- A statement of the form "`...` _statement_". +- A tuple expression element of the form "`...` _expression_", with the same + precedence as `,`. +- A tuple pattern element of the form "`...` _pattern_", with the same + precedence as `,`. +- An implicit parameter list element of the form "`...` _pattern_", with the + same precedence as `,`. +- An expression of the form "`...` `and` _expression_" or "`...` `or` + _expression_", with the same precedence as `and` and `or`. + +The statement, expression, or pattern following the `...` (and the `and`/`or`, +if present) is called the _body_ of the expansion. + +The `...` token can also occur in a tuple expression element of the form "`...` +`expand` _expression_", with the same precedence as `,`. However, that syntax is +not considered a pack expansion, and has its own semantics: _expression_ must +have a tuple type, and "`...` `expand` _expression_" evaluates _expression_ and +treats its elements as elements of the enclosing tuple literal. 
This is +especially useful for using non-literal tuple values as function call arguments: + +```carbon +fn F(x: i32, y: String); +fn MakeArgs() -> (i32, String); + +F(...expand MakeArgs()); +``` + +`...and`, `...or`, and `...expand` can be trivially distinguished with one token +of lookahead, and the other meanings of `...` can be distinguished from each +other by the context they appear in. As a corollary, if the nearest enclosing +delimiters around a `...` are parentheses, they will be interpreted as forming a +tuple rather than as grouping. Thus, expressions like `(... each ElementType)` +in the above example are tuple literals, even though they don't contain commas. + +By convention, `...` is always followed by whitespace, except that `...and`, +`...or`, and `...expand` are written with no whitespace between the two tokens. +This serves to emphasize that the keyword is not part of the expansion body, but +rather a modifier on the syntax and semantics of `...`. + +All each-names in a given expansion must refer to packs with the same arity, +which we will also refer to as the arity of the expansion. If an expansion +contains no each-names, it must be a pattern, or an expression in the type +position of a binding pattern, and its arity is deduced from the scrutinee. + +A pack expansion or `...expand` expression cannot contain another pack expansion +or `...expand` expression. + +An each-name cannot be used in the same pack expansion that declares it. In most +if not all cases, an each-name that violates this rule can be changed to an +ordinary name, because each-names are only necessary when you need to transfer a +pack from one pack expansion to another. 
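+
+For example, the following sketch shows a violation of this rule and the
+suggested fix. (This assumes a block statement can serve as an expansion body;
+`Print` is a hypothetical function used for illustration.)
+
+```carbon
+fn PrintDoubled(... each x: i64) {
+  // Error: `each d` is declared and used in the same pack expansion.
+  ... { var each d: i64 = each x * 2; Print(each d); }
+
+  // OK: `d` never crosses from one expansion to another, so it can be an
+  // ordinary name.
+  ... { var d: i64 = each x * 2; Print(d); }
+}
+```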
+ +#### Pack expansion expressions and statements + +A pack expansion expression or statement can be thought of as a kind of loop +that executes at compile time (specifically, monomorphization time), where the +expansion body is implicitly parameterized by an integer value called the _pack +index_, which ranges from 0 to one less than the arity of the expansion. The +pack index is implicitly used as an index into the packs referred to by +each-names. This is easiest to see with statement pack expansions. For example, +if `a`, `x`, and `y` are packs with arity 3, then +`... each a += each x * each y;` is roughly equivalent to + +```carbon +a[:0:] += x[:0:] * y[:0:]; +a[:1:] += x[:1:] * y[:1:]; +a[:2:] += x[:2:] * y[:2:]; +``` + +Here we are using `[:N:]` as a hypothetical pack indexing operator for purposes +of illustration; packs cannot actually be indexed in Carbon code. + +> **Future work:** We're open to eventually adding indexing of variadics, but +> that remains future work and will need its own proposal. + +`...and` and `...or` behave like chains of the corresponding boolean operator, +so `...and F(each x, each y)` behaves like +`true and F(x[:0:], y[:0:]) and F(x[:1:], y[:1:]) and F(x[:2:], y[:2:])`. They +can also be interpreted as looping constructs, although the rewrite is less +straightforward because Carbon doesn't have a way to write a loop in an +expression context. An expression like `...and F(each x, each y)` can be thought +of as evaluating to the value of `result` after executing the following code +fragment: + +``` +var result: bool = true; +for (let i:! i32 in (0, 1, 2)) { + result = result && F(x[:i:], y[:i:]); + if (result == false) { break; } +} +``` + +`...` in a tuple literal behaves like a series of comma-separated tuple +elements, so `(... F(each x, each y))` is equivalent to +`(F(x[:0:], y[:0:]), F(x[:1:], y[:1:]), F(x[:2:], y[:2:]))`. This can't be +expressed as a loop in Carbon code, but it is still fundamentally iterative. 
+ +#### Pack expansion patterns + +A pack expansion pattern "`...` _subpattern_" appears as part of a tuple pattern +(or an implicit parameter list), and matches a sequence of tuple elements if +each element matches _subpattern_. For example, in the signature of `Zip` shown +earlier, the parameter list consists of a single pack expansion pattern +`... each vector: Vector(each ElementType)`, and so the entire argument list +will be matched against the binding pattern +`each vector: Vector(each ElementType)`. + +Since _subpattern_ will be matched against multiple scrutinees (or none) in a +single pattern-matching operation, a binding pattern within a pack expansion +pattern must declare an each-name (such as `each vector` in the `Zip` example), +and the Nth iteration of the pack expansion will initialize the Nth element of +the named pack from the Nth scrutinee. The binding pattern's type expression may +contain an each-name (such as `each ElementType` in the `Zip` example), but if +so, it must be a deduced parameter of the enclosing pattern. + +> **Future work:** That restriction can probably be relaxed, but we currently +> don't have motivating use cases to constrain the design. + +### Additional examples + +```carbon +// Computes the sum of its arguments, which are i64s +fn SumInts(... each param: i64) -> i64 { + var sum: i64 = 0; + ... sum += each param; + return sum; +} +``` + +```carbon +// Concatenates its arguments, which are all convertible to String +fn StrCat[... each T:! ConvertibleToString](... each param: each T) -> String { + var len: i64 = 0; + ... len += each param.Length(); + var result: String = ""; + result.Reserve(len); + ... result.Append(each param.ToString()); + return result; +} +``` + +```carbon +// Returns the minimum of its arguments, which must all have the same type T. +fn Min[T:! Comparable & Value](first: T, ... each next: T) -> T { + var result: T = first; + ... 
if (each next < result) {
+    result = each next;
+  }
+  return result;
+}
+```
+
+```carbon
+// Invokes f with the elements of the tuple `args` as its arguments.
+fn Apply[... each T:! type, F:! CallableWith(... each T)]
+    (f: F, args: (... each T)) -> auto {
+  return f(...expand args);
+}
+```
+
+```carbon
+// Toy example of mixing variadic and non-variadic parameters.
+// Takes an i64, any number of f64s, and then another i64.
+fn MiddleVariadic(first: i64, ... each middle: f64, last: i64);
+```
+
+```carbon
+// Toy example of using the result of variadic type deduction.
+fn TupleConcat[... each T1:! type, ... each T2:! type](
+    t1: (... each T1), t2: (... each T2)) -> (... each T1, ... each T2) {
+  return (...expand t1, ...expand t2);
+}
+```
+
+## Execution Semantics
+
+### Expressions and statements
+
+In all of the following, N is the arity of the pack expansion being discussed,
+and `$I` is a notional variable representing the pack index. These semantics are
+implemented at monomorphization time, so the value of N is a known integer
+constant. Although the value of `$I` can vary during execution, it is
+nevertheless treated as a constant.
+
+A statement of the form "`...` _statement_" is evaluated by executing
+_statement_ N times, with `$I` ranging from 0 to N - 1.
+
+An expression of the form "`...and` _expression_" is evaluated as follows: a
+notional `bool` variable `$R` is initialized to `true`, and then "`$R = $R and`
+_expression_" is executed up to N times, with `$I` ranging from 0 to N - 1. If
+at any point `$R` becomes false, the iteration is terminated early. The final
+value of `$R` is the value of the expression.
+
+An expression of the form "`...or` _expression_" is evaluated the same way, but
+with `or` in place of `and`, and `true` and `false` transposed.
+
+A tuple expression element of the form "`...` _expression_" evaluates to a
+sequence of N values, where the k'th value is the value of _expression_ with
+`$I` equal to k - 1.
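+
+For illustration, if `each x` refers to a pack with arity 3, the expression
+`...or F(each x)` evaluates to the final value of `$R` in the following
+notional fragment (again using the hypothetical `[:N:]` pack indexing
+operator):
+
+```
+var $R: bool = false;
+$R = $R or F(x[:0:]);  // Iteration stops early if $R becomes true.
+$R = $R or F(x[:1:]);
+$R = $R or F(x[:2:]);
+```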
+
+An each-name evaluates to the `$I`th value of the pack it refers to (indexed
+from zero).
+
+### Pattern matching
+
+The semantics of pack expansion patterns are chosen to follow the general
+principle that pattern matching is the inverse of expression evaluation, so for
+example if the pattern `(... each x: auto)` matches some scrutinee value `s`,
+the expression `(... each x)` should be equal to `s`. These semantics are
+implemented at monomorphization time, so all types are known constants, and
+all arities are known.
+
+A tuple pattern can contain no more than one subpattern of the form "`...`
+_operand_". When such a subpattern is present, the N elements of the pattern
+before the `...` expansion are matched with the first N elements of the
+scrutinee, and the M elements of the pattern after the `...` expansion are
+matched with the last M elements of the scrutinee. If the scrutinee does not
+have at least N + M elements, the pattern does not match.
+
+The remaining elements of the scrutinee are iteratively matched against
+_operand_, in order. In each iteration, `$I` is equal to the index of the
+scrutinee element being matched, minus N.
+
+In the k'th iteration, a binding pattern initializes the k'th element of the
+named pack from the scrutinee element being matched.
+
+## Typechecking
+
+### Tuples, packs, segments, and shapes
+
+In order to discuss the underlying type system for variadics, we will need to
+introduce some pseudo-syntax to represent values and expressions that occur in
+the type system, but cannot be expressed directly in user code. We will use
+non-ASCII glyphs such as `«»‖⟬⟭` for that pseudo-syntax, to distinguish it from
+valid Carbon syntax.
+
+In the context of variadics, we will say that a tuple literal consists of a
+comma-separated sequence of _segments_, and reserve the term "elements" for the
+components of a tuple literal after pack expansion. For example, the expression
+`(... 
each foo)` may evaluate to a tuple value with any number of elements, but
+the expression itself has exactly one segment.
+
+Each segment has a type, which expresses (potentially symbolically) both the
+types of the elements of the segment and the arity of the segment. The type of a
+tuple literal is a tuple literal of the types of its segments. For example,
+suppose we are trying to find the type of `z` in this code:
+
+```carbon
+fn F[... each T:! type]((... each x: Optional(each T)), (... each y: i32)) {
+  let z: auto = (0 as f32, ... each x, ... each y);
+}
+```
+
+We proceed by finding the type of each segment. The type of `0 as f32` is `f32`,
+by the usual non-variadic typing rules. The type of `... each x` is
+`... Optional(each T)`, because `Optional(each T)` is the declared type of
+`each x`, and the type of a pack expansion is a pack expansion of the type of
+its body.
+
+The type of `... each y` is more complicated. Conceptually, it consists of some
+number of repetitions of `i32`. We don't know exactly how many repetitions,
+because it's implicitly specified by the caller: it's the arity of the second
+argument tuple. Effectively, that arity acts as a hidden deduced parameter of
+`F`.
+
+So to represent this type, we need two new pseudo-syntaxes:
+
+- `‖each X‖` refers to the deduced arity of the pack expansion that contains
+  the declaration of `each X`.
+- `«E; N»` evaluates to `N` repetitions of `E`. This is called an _arity
+  coercion_, because it coerces the expression `E` to have arity `N`. `E` must
+  not contain any pack expansions, each-names, or pack literals (see below).
+
+Combining the two, the type of `... each y` is `... «i32; ‖each y‖»`. Thus, the
+type of `z` is `(f32, ... Optional(each T), ... «i32; ‖each y‖»)`.
+
+Now, consider a modified version of that example:
+
+```carbon
+fn F[... each T:! type]((... each x: Optional(each T)), (... each y: i32)) {
+  let (... each z: auto) = (0 as f32, ... each x, ... 
each y);
+}
+```
+
+`each z` is a pack, but it has the same elements as the tuple `z` in our earlier
+example, so we represent its type in the same way, as a sequence of segments:
+`⟬f32, Optional(each T), «i32; ‖each y‖»⟭`. The `⟬⟭` delimiters make this a
+_pack literal_ rather than a tuple literal. Notice one subtle difference: the
+segments of a pack literal do not contain `...`. In effect, every segment of a
+pack literal acts as a separate loop body. As with the tuple literal syntax, the
+pack literal pseudo-syntax can also be used in patterns.
+
+The _shape_ of a pack literal is a tuple of the arities of its segments, so the
+shape of `⟬f32, Optional(each T), «i32; ‖each y‖»⟭` is
+`(1, ‖each T‖, ‖each y‖)`. Other expressions and patterns also have shapes. In
+particular, the shape of an arity coercion `«E; A»` is `(A,)`, the shape of
+`each X` is `(‖each X‖,)`, and the shape of an expression that does not contain
+pack literals, arity coercions, or each-names is `(1,)`. The arity of an
+expression is the sum of the elements of its shape. See the
+[appendix](#typing-and-shaping-rules) for the full rules for determining the
+shape of an expression.
+
+If a pack literal is part of some enclosing expression that doesn't contain
+`...`, it can be _expanded_, which moves the outer expression inside the pack
+literal. For example, `... Optional(⟬each X, Y⟭)` is equivalent to
+`... ⟬Optional(each X), Optional(Y)⟭`. Similarly, an arity coercion can be
+expanded so long as the parent node is not `...`, a pattern, or a pack literal.
+See the [appendix](#reduction-rules) for the full rules governing this
+operation. _Fully expanding_ an expression or pattern that does not contain a
+pack expansion means repeatedly expanding any pack literals and arity coercions
+within it, until they cannot be expanded any further.
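+
+As a worked example (using `Vector` as an arbitrary enclosing expression),
+fully expanding `Vector(⟬f32, each T, «i32; ‖each y‖»⟭)` takes two steps:
+
+```
+Vector(⟬f32, each T, «i32; ‖each y‖»⟭)
+// Expand the pack literal, moving `Vector` inside it:
+⟬Vector(f32), Vector(each T), Vector(«i32; ‖each y‖»)⟭
+// Expand the arity coercion in the last segment:
+⟬Vector(f32), Vector(each T), «Vector(i32); ‖each y‖»⟭
+```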
+ +The _scalar components_ of a fully-expanded expression `E` are a set, defined as +follows: + +- If `E` is a pack literal, its scalar components are the union of the scalar + components of the segments. +- If `E` is an arity coercion `«F; S»`, the only scalar component of `E` is + `F`. +- Otherwise, the only scalar component of `E` is `E`. + +The scalar components of any other expression that does not contain `...` are +the scalar components of its fully expanded form. + +By construction, a segment of a pack literal never has more than one scalar +component. Also by construction, a scalar component cannot contain a pack +literal, pack expansion, or arity coercion, but it can contain each-names, so we +can operate on it using the ordinary rules of non-variadic expressions so long +as we treat the names as opaque. + +### Iterative typechecking of pack expansions + +Since the execution semantics of an expansion are defined in terms of a notional +rewritten form where we simultaneously iterate over each-names, in principle we +can typecheck the expansion by typechecking the rewritten form. However, the +rewritten form usually would not typecheck as ordinary Carbon code, because the +each-names can have different types on different iterations. Furthermore, the +difference in types can propagate through expressions: if `each x` and `each y` +can have different types on different iterations, then so can `each x * each y`. +In effect, we have to typecheck the loop body separately for each iteration. + +However, at typechecking time we usually don't even know how many iterations +there will be, much less what type an each-name will have on any particular +iteration, because the types of the each-names are packs, which are sequences of +segments, not sequences of elements. To solve that problem, we require that the +types of all each-names in a pack expansion must have the same shape. 
This +enables us to typecheck the pack expansion by simultaneously iterating over +segments instead of input elements. + +As a result, the type of an expression or pattern within a pack expansion is a +sequence of segments, or in other words a _pack_, representing the types it +takes on over the course of the iteration. Note, however, that even though such +an expression has a pack type, it does not evaluate to a pack value. Rather, it +evaluates to a sequence of non-pack values over the course of the pack expansion +loop, and its pack type summarizes the types of that sequence. + +Within a given iteration, typechecking follows the usual rules of non-variadic +typechecking, except that when we need the type of an each-name, we use the +scalar component of the current segment of its type. As noted above, we can +operate on a scalar component using the ordinary rules of non-variadic +typechecking. + +Once the body of a pack expansion has been typechecked, typechecking the +expansion itself is relatively straightforward: + +- A statement pack expansion requires no further typechecking, because + statements don't have types. +- An `...and` or `...or` expression has type `bool`, and every segment of the + operand's type pack must have a type that's convertible to `bool`. +- For a `...` tuple element expression or pattern, the segments of the + operand's type pack become segments of the type of the enclosing tuple. + +> **TODO:** Discuss typechecking `...expand`. + +### Typechecking patterns + +A _full pattern_ consists of an optional deduced parameter list, a pattern, and +an optional return type expression. + +A pack expansion pattern has _fixed arity_ if it contains at least one usage of +an each-name that is not a parameter of the enclosing full pattern. Otherwise it +has _deduced arity_. A tuple pattern can have at most one segment with deduced +arity. For example: + +```carbon +class C(... each T:! type) { + fn F[... each U:! type](... each t: each T, ... 
each u: each U); +} +``` + +In the signature of `F`, `... each t: each T` has fixed arity, since the arity +is determined by the arguments passed to `C`, before the call to `F`. On the +other hand, `... each u: each U` has deduced arity, because the arity of +`each U` is determined by the arguments passed to `F`. + +After typechecking a full pattern, we attempt to merge as many tuple segments as +possible, in order to simplify the subsequent pattern matching. For example, +consider the following function declaration: + +```carbon +fn Min[T:! type](first: T, ... each next: T) -> T; +``` + +During typechecking, we rewrite that function signature so that it only has one +parameter: + +```carbon +fn Min[T:! type](... each args: «T; ‖each next‖+1») -> T; +``` + +(We represent the arity as `‖each next‖+1` to capture the fact that `each args` +must match at least one element.) + +When the pattern is heterogeneous, the merging process may be more complex. For +example: + +```carbon +fn ZipAtLeastOne[First:! type, ... each Next:! type] + (first: Vector(First), ... each next: Vector(each Next)) + -> Vector((First, ... each Next)); +``` + +During typechecking, we transform that function signature to the following form: + +```carbon +fn ZipAtLeastOne[... ⟬First, each Next⟭:! «type; ‖each next‖+1»] + (... each __args: Vector(⟬First, each Next⟭)) + -> Vector((... ⟬First, each Next⟭)); +``` + +We can then rewrite that by replacing the pack of names `⟬First, each Next⟭` +with an invented name `each __Args`, so that the function has only one +parameter: + +```carbon +fn ZipAtLeastOne[... each __Args:! «type; ‖each next‖+1»] + (... each __args: Vector(each __Args)) + -> Vector((... each __Args)); +``` + +We can replace a name pack with an invented each-name only if all of the +following conditions hold: + +- The name pack doesn't use any name more than once. For example, we can't + apply this rewrite to `⟬X, each Y, X⟭`. +- The name pack contains exactly one each-name. 
For example, we can't apply + this rewrite to `⟬X, Y⟭`. +- The replacement removes all usages of the constituent names, including their + declarations. For example, we can't apply this rewrite to `⟬X, each Y⟭` in + this code, because the resulting signature would have return type `X` but no + declaration of `X`: + ```carbon + fn F[... ⟬X, each Y⟭:! «type; ‖each next‖+1»] + (... each __args: each ⟬X, each Y⟭) -> X; + ``` +- The pack expansions being rewritten do not contain any pack literals other + than the name pack being replaced. For example, we can't apply this rewrite + to `⟬X, each Y⟭` in this code, because the pack expansion in the deduced + parameter list also contains the pack literal `⟬I, each type⟭`: + ```carbon + fn F[... ⟬X, each Y⟭:! ⟬I, each type⟭](... each __args: each ⟬X, each Y⟭); + ``` + Notice that as a corollary of this rule, all the names in the name pack must + have the same type. + +See the [appendix](#pattern-match-typechecking-algorithm) for a more formal +discussion of the rewriting process. + +### Typechecking pattern matches + +To typecheck a pattern match between a tuple pattern and a tuple scrutinee, we +try to split and merge the segments of the scrutinee type so that it has the +same number of segments as the pattern type, and corresponding segments have the +same arity. For example, consider this call to `ZipAtLeastOne` (as defined in +the previous section): + +```carbon +fn F[... each T:! type](... each t: Vector(each T), u: Vector(i32)) { + ZipAtLeastOne(... each t, u); +} +``` + +The pattern type is `(... Vector(⟬First, each Next⟭))`, so we need to rewrite +the scrutinee type `(... Vector(each T), Vector(i32))` to have a single tuple +segment with an arity that matches `‖each Next‖+1`. We can do that by merging +the scrutinee segments to obtain `(... ⟬Vector(each T), Vector(i32)⟭)`. 
This has
+a single segment with arity `‖each T‖+1`, which can match `‖each Next‖+1`
+because the deduced arity `‖each Next‖` behaves as a deduced parameter of the
+pattern, so they match by deducing `‖each Next‖ == ‖each T‖`.
+
+When merging segments of the scrutinee, we don't attempt to form name packs and
+replace them with invented names, but we also don't need to: we don't require
+merged scrutinee segments to have a single scalar component.
+
+The search for this rewrite processes each pattern segment to the left of the
+segment with deduced arity, in order from left to right. For each pattern
+segment, it greedily merges unmatched scrutinee segments from left to right
+until their cumulative shape is greater than or equal to the shape of the
+pattern segment, and then splits off a scrutinee segment on the right if
+necessary to make the shapes exactly match. Pattern segments to the right of the
+segment with deduced arity are processed the same way, but with left and right
+reversed, so that segments are always processed from the outside in.
+
+See the [appendix](#appendix-type-system-formalism) for the rewrite rules that
+govern merging and splitting.
+
+Once we have the pattern and scrutinee segments in one-to-one correspondence, we
+check each scalar component of the scrutinee type against the scalar component
+of the corresponding pattern type segment (by construction, the pattern type
+segment has only one scalar component). Since we are checking scalar components
+against scalar components, this proceeds according to the usual rules of
+non-variadic typechecking.
+
+> **TODO:** Extend this approach to fall back to a complementary approach, where
+> the pattern and scrutinee trade roles: we maximally merge the scrutinee tuple,
+> while requiring each segment to have a single scalar component, and then
+> merge/split the pattern tuple to match it, without requiring pattern tuple
+> segments to have a single scalar component. 
This isn't quite symmetric with +> the current approach, because when processing the scrutinee we can't merge +> deduced parameters (scrutinees don't have any), but we can invent new `let` +> bindings. + +## Appendix: Type system formalism + +A _pack literal_ is a comma-separated sequence of segments, enclosed in `⟬⟭` +delimiters. A pack literal can appear in an expression, pattern, or name +context, and every segment must be valid in the context where the pack literal +appears (for example, the segments of a pack literal in a name context must all +be names). Pack literals cannot be nested, and cannot appear outside a pack +expansion. + +### Explicit deduced arities + +In this formalism, deduced arities are explicit rather than implicit, so Carbon +code must be desugared into this formalism as follows: + +For each pack expansion pattern, we introduce a binding pattern `__N:! Arity` as +a deduced parameter of the enclosing full pattern, where `__N` is a name chosen +to avoid collisions. Then, for each binding pattern of the form `each X: T` +within that expansion, if `T` does not contain an each-name, the binding pattern +is rewritten as `each X: «T; __N»`. If this does not introduce any usages of +`__N`, we remove its declaration. + +`Arity` is a compiler-internal type which represents non-negative integers. The +only operation it supports is `+`, with non-negative integer literals and other +`Arity`s. `Arity` is used only during type checking, so `+` has no run-time +semantics, and its only symbolic semantics are that it is commutative and +associative. + +### Typing and shaping rules + +The shape of an AST node within a pack expansion is determined as follows: + +- The shape of an arity coercion is the value of the expression after the `;`. +- The shape of a pack literal is the concatenation of the arities of its + segments. +- The shape of an each-name expression is the shape of the binding pattern + that declared the name. 
+- If a binding pattern's name and type components have the same number of + segments, and each name segment is an each-name if and only if the + corresponding type segment's shape is not 1, then the shape of the binding + pattern is the shape of the type expression. Otherwise, the binding pattern + is ill-shaped. +- For any other AST node: + - If all the node's children have shape 1, its shape is 1. + - If there is some shape `S` such that all of the node's children have + shape either 1 or `S`, its shape is `S`. + - Otherwise, the node is ill-shaped. + +> **TODO:** The "well-shaped" rules as stated are slightly too restrictive. For +> example, `⟬each X, Y⟭: «Z; N+1»` is well-shaped, and `(⟬each X, Y⟭, «Z; N+1»)` +> is well-shaped if the shape of `each X` is `N`. + +The type of an expression or pattern can be computed as follows: + +- The type of `each x: auto` is `each __X`, a newly-invented deduced parameter + of the enclosing full pattern, which behaves as if it was declared as + `... each __X:! type`. +- The type of an each-name expression is the type expression of the binding + pattern that declared it. +- The type of an arity coercion `«E; S»` is `«T; S»`, where `T` is the type of + `E`. +- The type of a pack literal is a pack literal consisting of the concatenated + types of its segments. This concatenation flattens any nested pack literals + (for example `⟬A, ⟬B, C⟭⟭` becomes `⟬A, B, C⟭`) +- The type of a pack expansion expression or pattern is `...B`, where `B` is + the type of its body. +- The type of a tuple literal is a tuple literal consisting of the types of + its segments. +- If an expression or pattern `E` contains a pack literal or arity coercion + that is not inside a pack expansion, the type of `E` is the type of the + fully expanded form of `E`. + +> **TODO:** address `...expand`, `...and` and `...or`. + +### Reduction rules + +Unless otherwise specified, all expressions in these rules must be free of side +effects. 
Note that every reduction rule is also an equivalence: the utterance
before the reduction is equivalent to the utterance after, so these rules can
sometimes be run in reverse (particularly during deduction).

Utterances that are reduced by these rules must be well-shaped (and the reduced
form will likewise be well-shaped), but need not be well-typed. This enables us
to apply these reductions while determining whether an utterance is well-typed,
as in the case of typing an expression or pattern that contains a pack literal
or arity coercion, above.

_Singular pack removal:_ If `E` is a pack segment, `⟬E⟭` reduces to `E`.

_Singular expansion removal:_ `...E` reduces to `E`, if the shape of `E` is
`(1,)`.

_Pack expansion splitting:_ If `E` is a segment and `S` is a sequence of
segments, `...⟬E, S⟭` reduces to `...E, ...⟬S⟭`.

_Pack expanding:_ If `F` is a function, `X` is an utterance that does not
contain pack literals, each-names, or arity coercions, and `⟬P1, P2⟭` and
`⟬Q1, Q2⟭` both have the shape `(S1, S2)`, then
`F(⟬P1, P2⟭, X, ⟬Q1, Q2⟭, «Y; S1+S2»)` reduces to
`⟬F(P1, X, Q1, «Y; S1»), F(P2, X, Q2, «Y; S2»)⟭`. This rule generalizes in
several dimensions:

- `F` can have any number of arity coercion and other non-pack-literal
  arguments, and any positive number of pack literal arguments, and they can
  be in any order.
- The pack literal arguments can have any number of segments (but the
  well-shapedness requirement means they must have the same number of
  segments).
- `F()` can be any expression syntax other than `...`, not just a function
  call. For example, this rule implies that `⟬X1, X2⟭ * ⟬Y1, Y2⟭` reduces to
  `⟬X1 * Y1, X2 * Y2⟭`, where the `*` operator plays the role of `F`.
- `F()` can also be a pattern syntax. For example, this rule implies that
  `(⟬x1: X1, x2: X2⟭, ⟬y1: Y1, y2: Y2⟭)` reduces to
  `⟬(x1: X1, y1: Y1), (x2: X2, y2: Y2)⟭`, where the tuple pattern syntax
  `( , )` plays the role of `F`.
+- When binding pattern syntax takes the role of `F`, the name part of the + binding pattern must be a name pack. For example, `⟬x1, x2⟭: ⟬X1, X2⟭` + reduces to `⟬x1: X1, x2: X2⟭`, but `each x: ⟬X1, X2⟭` cannot be reduced by + this rule. + +_Coercion expanding:_ If `F` is a function, `S` is a shape, and `Y` is an +expression that does not contain pack literals or arity coercions, +`F(«X; S», Y, «Z; S»)` reduces to `«F(X, Y, Z); S»`. As with pack expanding, +this rule generalizes: + +- `F` can have any number of non-arity-coercion arguments, and any positive + number of arity coercion arguments, and they can be in any order. +- `F()` can be any expression syntax other than `...` or pack literal + formation, not just a function call. Unlike pack expanding, coercion + expanding does not apply if `F` is a pattern syntax. + +_Coercion removal:_ `«E; 1»` reduces to `E`. + +_Tuple indexing:_ Let `I` be an integer template constant, let `X` be a tuple +segment, and let `Ys` be a sequence of tuple segments. + +- If the arity `A` of `X` is less than `I+1`, then `(X, Ys).(I)` reduces to + `(Ys).(I-A)`. +- Otherwise: + - If `X` is not a pack expansion, then `(X, Ys).(I)` reduces to `X`. + - If `X` is of the form `...⟬«E; S»⟭`, then `(X, Ys).(I)` reduces to `E`. + +### Equivalence, equality, and convertibility + +_Pack renaming:_ Let `Ns` be a sequence of names, let `⟬Ns⟭: «T; N»` be a name +binding pattern (which may be a symbolic or template binding as well as a +runtime binding), and let `__A` be an identifier that does not collide with any +name that's visible where `⟬Ns⟭` is visible. We can rewrite all occurrences of +`⟬Ns⟭` to `each __A` in the scope of the binding pattern (including the pattern +itself) if all of the following conditions hold: + +- `Ns` contains at least one each-name. +- No name in `Ns` is used in the scope outside of `Ns`. +- No name occurs more than once in `Ns`. +- No other pack literals occur in the same pack expansion as an occurrence of + `⟬Ns⟭`. 
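The _tuple indexing_ rule from the reduction rules above lends itself to a
small executable model. As a rough illustration (the segment representation
below is invented, and Python stands in for the type checker), the rule is just
a walk over segments that subtracts each skipped segment's arity from the
index:

```python
from dataclasses import dataclass

# A tuple value is modeled as a list of segments. A singular segment holds
# one element; an expanded segment models ...«E; N», whose N elements all
# equal E. (Invented representation, purely illustrative.)

@dataclass
class Singular:
    value: str
    arity: int = 1

@dataclass
class Expanded:
    value: str   # the repeated expression E
    arity: int   # the coercion arity N

def index_tuple(segments, i):
    """Reduce (X, Ys).(I): skip whole segments while I >= arity(X)."""
    for seg in segments:
        if i >= seg.arity:
            i -= seg.arity       # (X, Ys).(I) reduces to (Ys).(I - A)
        else:
            return seg.value     # the index lands inside this segment
    raise IndexError("tuple index out of range")

# (A, ...«B; 3», C) has elements A, B, B, B, C.
t = [Singular("A"), Expanded("B", arity=3), Singular("C")]
```

For example, index 2 lands inside the expanded segment and reduces to `B`,
while index 4 skips both earlier segments and reduces to `C`.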
+ +_Expansion convertibility:_ `...T` is convertible to `...U` if the arity of `U` +equals the arity of `T`, and the scalar components of `T` are each convertible +to all scalar components of `U`. + +_Shape equality:_ Let `(S1s)`, `(S2s)`, `(S3s)`, and `(S4s)` be shapes. +`(S1s, S2s)` equals `(S3s, S4s)` if `(S1s)` equals `(S3s)` and `(S2s)` equals +`(S4s)`. + +### Pattern match typechecking algorithm + +A full pattern is in _normal form_ if it contains no pack literals, and every +arity coercion is fully expanded. For example, +`[__N:! Arity](... each x: Vector(«i32; __N»))` is not in normal form, but +`[__N:! Arity](... each x: «Vector(i32); __N»)` is. Note that all user-written +full patterns are in normal form. Note also that by construction, this means +that the type of the body of every pack expansion has a single scalar component. +The _canonical form_ of a full pattern is the unique normal form (if any) that +is "maximally merged", meaning that every tuple pattern and tuple literal has +the smallest number of segments. For example, the canonical form of +`[__N:! Arity](... each x: «i32; __N», y: i32)` is +`[__N:! Arity](... each __args: «i32; __N+1»)`. + +> **TODO:** Specify algorithm for converting a full pattern to canonical form, +> or establishing that there is no such form. See next section for a start. + +If a function with type `F` is called with argument type `A`, we typecheck the +call by converting `F` to canonical form, and then checking whether `A` is +convertible to the parameter type by applying the deduction rules in the +previous sections. If that succeeds, we apply the resulting binding map to the +function return type to obtain the type of the call expression. + +> **TODO:** Specify the algorithm more precisely. In particular, discuss how to +> rewrite `A` as needed to make the shapes line up, but don't rewrite `F` after +> canonicalization. 
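The merging that canonicalization performs (for example, rewriting
`... each x: «i32; __N», y: i32` as `... each __args: «i32; __N+1»`) relies on
the limited algebra of `Arity`: `+` over non-negative literals and symbolic
deduced arities, with commutativity and associativity as the only identities.
As a hedged sketch (the representation is invented for illustration, not an
implementation detail of Carbon), an arity can be canonicalized as a constant
part plus a multiset of symbols, which makes `+` commutative and associative by
construction:

```python
from collections import Counter

class Arity:
    """Invented model of the compiler-internal Arity type: a sum of a
    non-negative integer literal part and symbolic deduced arities
    such as __N."""

    def __init__(self, const=0, symbols=()):
        self.const = const
        self.symbols = Counter(symbols)

    def __add__(self, other):
        # `+` is the only supported operation; representing the symbolic
        # part as a multiset makes the sum order-insensitive.
        if isinstance(other, int):
            other = Arity(const=other)
        return Arity(self.const + other.const,
                     (self.symbols + other.symbols).elements())

    __radd__ = __add__

    def __eq__(self, other):
        # Two arities are equal iff their canonical forms agree.
        return self.const == other.const and self.symbols == other.symbols

N = Arity(symbols=["__N"])   # stands in for the deduced arity __N
```

Under this model, `__N + 1 + 1` and `2 + __N` canonicalize identically, which
is exactly what lets the shape `«type; __N», type` merge into
`«type; __N+1»` during canonicalization.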
+ +Typechecking for pattern match operations other than function calls is defined +in terms of typechecking a function call: We check a scrutinee type `S` against +a pattern `P` by checking `__F(S,)` against a hypothetical function signature +`fn __F(P,)->();`. + +> **Future work:** Extend this approach to support merging the argument list as +> well as the parameter list. + +#### Canonicalization algorithm + +The canonical form can be found by starting with a normal form, and +incrementally merging an adjacent singular parameter type into the variadic +parameter type. + +For example, consider the following function: + +```carbon +fn F[First:! type, Second:! type, ... each Next:! type] + (first: Vector(First), second: Vector(Second), + ... each next: Vector(each Next)) -> (First, Second, ... each Next); +``` + +First, we desugar the implicit arity: + +```carbon +fn F[__N:! Arity, First:! type, Second:! type, ... each Next:! «type; __N»] + (first: Vector(First), second: Vector(Second), + ... each next: Vector(each Next)) -> (First, Second, ... each Next); +``` + +Then we attempt to merge `Second` with `each Next` as follows (note that for +brevity, some of the steps presented here actually contain multiple independent +reductions): + +```carbon +// Singular pack removal (in reverse) +fn F[__N:! Arity, First:! type, Second:! type, ... ⟬each Next:! «type; __N»⟭] + (first: Vector(First), second: Vector(Second), + ... each next: Vector(⟬each Next⟭)) -> (First, Second, ... ⟬each Next⟭); +// Pack expanding +fn F[__N:! Arity, First:! type, Second:! type, ... ⟬each Next:! «type; __N»⟭] + (first: Vector(First), second: Vector(Second), + ... each next: ⟬Vector(each Next)⟭) -> (First, Second, ... ⟬each Next⟭); +// Pack expanding +fn F[__N:! Arity, First:! type, Second:! type, ... ⟬each Next:! «type; __N»⟭] + (first: Vector(First), second: Vector(Second), + ... ⟬each next: Vector(each Next)⟭) -> (First, Second, ... 
⟬each Next⟭);
// Pack expansion splitting (in reverse)
fn F[__N:! Arity, First:! type, ... ⟬Second:! type, each Next:! «type; __N»⟭]
    (first: Vector(First), ... ⟬second: Vector(Second),
     each next: Vector(each Next)⟭)
    -> (First, ... ⟬Second, each Next⟭);
// Pack expanding (in reverse)
fn F[__N:! Arity, First:! type, ... ⟬Second, each Next⟭:! «type; __N+1»]
    (first: Vector(First),
     ... ⟬second, each next⟭: ⟬Vector(Second), Vector(each Next)⟭)
    -> (First, ... ⟬Second, each Next⟭);
// Pack expanding (in reverse)
fn F[__N:! Arity, First:! type, ... ⟬Second, each Next⟭:! «type; __N+1»]
    (first: Vector(First), ... ⟬second, each next⟭: Vector(⟬Second, each Next⟭))
    -> (First, ... ⟬Second, each Next⟭);
// Pack renaming
fn F[__N:! Arity, First:! type, ... each __A:! «type; __N+1»]
    (first: Vector(First), ... each __a: Vector(each __A))
    -> (First, ... each __A);
```

This brings us back to a normal form, while reducing the number of tuple
segments. We can now repeat that process to merge the remaining parameter type:

```carbon
fn F[__N:! Arity, First:! type, ... ⟬each __A:! «type; __N+1»⟭]
    (first: Vector(First), ... each __a: Vector(⟬each __A⟭))
    -> (First, ... ⟬each __A⟭);
// Pack expanding
fn F[__N:! Arity, First:! type, ... ⟬each __A:! «type; __N+1»⟭]
    (first: Vector(First), ... each __a: ⟬Vector(each __A)⟭)
    -> (First, ... ⟬each __A⟭);
// Pack expanding
fn F[__N:! Arity, First:! type, ... ⟬each __A:! «type; __N+1»⟭]
    (first: Vector(First), ... ⟬each __a: Vector(each __A)⟭)
    -> (First, ... ⟬each __A⟭);
// Pack expansion splitting (in reverse)
fn F[__N:! Arity, ... ⟬First:! type, each __A:! «type; __N+1»⟭]
    (... ⟬first: Vector(First), each __a: Vector(each __A)⟭)
    -> (... ⟬First, each __A⟭);
// Pack expanding (in reverse)
fn F[__N:! Arity, ... ⟬First, each __A⟭:! «type; __N+2»]
    (... ⟬first, each __a⟭: ⟬Vector(First), Vector(each __A)⟭)
    -> (... ⟬First, each __A⟭);
// Pack expanding (in reverse)
fn F[__N:! Arity, ... 
⟬First, each __A⟭:! «type; __N+2»]
    (... ⟬first, each __a⟭: Vector(⟬First, each __A⟭))
    -> (... ⟬First, each __A⟭);
// Pack renaming
fn F[__N:! Arity, ... each __B:! «type; __N+2»]
    (... each __b: Vector(each __B))
    -> (... each __B);
```

Here again, this is a normal form, and there is demonstrably no way to perform
any further merging, so this must be the canonical form.

> **TODO:** define the algorithm in more general terms, and discuss ways that
> merging can fail.

## Alternatives considered

- [Member packs](/proposals/p2240.md#member-packs)
- [Single semantic model for pack expansions](/proposals/p2240.md#single-semantic-model-for-pack-expansions)
- [Generalize `expand`](/proposals/p2240.md#generalize-expand)
- [Omit `expand`](/proposals/p2240.md#omit-expand)
- [Support expanding arrays](/proposals/p2240.md#support-expanding-arrays)
- [Omit each-names](/proposals/p2240.md#omit-each-names)
  - [Disallow pack-type bindings](/proposals/p2240.md#disallow-pack-type-bindings)
- [Fold expressions](/proposals/p2240.md#fold-expressions)
- [Allow multiple pack expansions in a tuple pattern](/proposals/p2240.md#allow-multiple-pack-expansions-in-a-tuple-pattern)
- [Allow nested pack expansions](/proposals/p2240.md#allow-nested-pack-expansions)
- [Use postfix instead of prefix `...`](/proposals/p2240.md#use-postfix-instead-of-prefix-)
- [Avoid context-sensitivity in pack expansions](/proposals/p2240.md#avoid-context-sensitity-in-pack-expansions)
  - [Fold-like syntax](/proposals/p2240.md#fold-like-syntax)
  - [Variadic blocks](/proposals/p2240.md#variadic-blocks)
  - [Keyword syntax](/proposals/p2240.md#keyword-syntax)
- [Require parentheses around `each`](/proposals/p2240.md#require-parentheses-around-each)
- [Fused expansion tokens](/proposals/p2240.md#fused-expansion-tokens)
- [No parameter merging](/proposals/p2240.md#no-parameter-merging)
- [Exhaustive function call typechecking](/proposals/p2240.md#exhaustive-function-call-typechecking)

## References
- Proposal
  [#2240: Variadics](https://github.com/carbon-language/carbon-lang/pull/2240)

diff --git a/docs/project/principles/library_apis_only.md b/docs/project/principles/library_apis_only.md
index a32b9b95e13da..daa87bc5c1e03 100644
--- a/docs/project/principles/library_apis_only.md
+++ b/docs/project/principles/library_apis_only.md
@@ -88,16 +88,12 @@

is more restricted, and this principle will not apply to them. Most importantly,
function types might not be first-class types, in which case they need not be
library types.

Some types (such as tuples, structs, and certain integer types) will have
built-in literal syntaxes for creating values of those types. Furthermore, in
some cases (such as tuples and structs) the type's literal syntax will also be
usable as a pattern syntax. The logic for performing those operations is
arguably part of those types' public API, but will not be part of those types'
class definitions.
## Alternatives considered

diff --git a/proposals/p2240.md b/proposals/p2240.md
new file mode 100644
index 0000000000000..dfecfcd771fef
--- /dev/null
+++ b/proposals/p2240.md
@@ -0,0 +1,1136 @@
# Variadics

[Pull request](https://github.com/carbon-language/carbon-lang/pull/2240)

## Table of contents

- [Abstract](#abstract)
- [Problem](#problem)
- [Background](#background)
- [Proposal](#proposal)
  - [Examples](#examples)
  - [Comparisons](#comparisons)
- [Rationale](#rationale)
- [Alternatives considered](#alternatives-considered)
  - [Member packs](#member-packs)
  - [Single semantic model for pack expansions](#single-semantic-model-for-pack-expansions)
  - [Generalize `expand`](#generalize-expand)
  - [Omit `expand`](#omit-expand)
  - [Support expanding arrays](#support-expanding-arrays)
  - [Omit each-names](#omit-each-names)
    - [Disallow pack-type bindings](#disallow-pack-type-bindings)
  - [Fold expressions](#fold-expressions)
  - [Allow multiple pack expansions in a tuple pattern](#allow-multiple-pack-expansions-in-a-tuple-pattern)
  - [Allow nested pack expansions](#allow-nested-pack-expansions)
  - [Use postfix instead of prefix `...`](#use-postfix-instead-of-prefix-)
  - [Avoid context-sensitivity in pack expansions](#avoid-context-sensitity-in-pack-expansions)
    - [Fold-like syntax](#fold-like-syntax)
    - [Variadic blocks](#variadic-blocks)
    - [Keyword syntax](#keyword-syntax)
  - [Require parentheses around `each`](#require-parentheses-around-each)
  - [Fused expansion tokens](#fused-expansion-tokens)
  - [No parameter merging](#no-parameter-merging)
  - [Exhaustive function call typechecking](#exhaustive-function-call-typechecking)

## Abstract

Proposes a set of core features for declaring and implementing generic variadic
functions.

A "pack expansion" is a syntactic unit beginning with `...`, which is a kind of
compile-time loop over sequences called "packs". 
Packs are initialized and
referred to using "each-names", which are marked with the `each` keyword at the
point of declaration and the point of use.

The syntax and behavior of a pack expansion depends on its context, and in some
cases on a keyword following the `...`:

- In a tuple literal expression (such as a function call argument list), `...`
  iteratively evaluates its operand expression, and treats the values as
  successive elements of the tuple.
- `...and` and `...or` iteratively evaluate a boolean expression, combining
  the values using `and` and `or`, and ending the loop early if the underlying
  operator short-circuits.
- In a statement context, `...` iteratively executes a statement.
- In a tuple literal pattern (such as a function parameter list), `...`
  iteratively matches the elements of the scrutinee tuple. In conjunction with
  pack bindings, this enables functions to take an arbitrary number of
  arguments.

## Problem

Carbon needs a way to define functions and parameterized types that are
_variadic_, meaning they can take a variable number of arguments.

## Background

C has long supported variadic functions through the "varargs" mechanism, but
that's heavily disfavored in C++ because it isn't type-safe. Instead, C++
provides a separate feature for defining variadic _templates_, which can be
functions, classes, or even variables. However, variadic templates currently
suffer from several shortcomings. Most notably:

- They must be templates, which means they cannot be definition-checked, and
  suffer from a variety of other costs such as needing to be defined in header
  files, and code bloat due to template instantiation.
- It is inordinately difficult to define a variadic function whose parameters
  have a fixed type, and the signature of such a function does not clearly
  communicate that fixed type to readers.
- The design encourages using recursion rather than iteration to process the
  elements of a variadic parameter list. This results in more template
  instantiations, and typically has at least quadratic overhead in the size of
  the pack (at compile time, and sometimes at run time). In recent versions of
  C++ it is also possible to iterate over packs procedurally, using a
  [fold expression](https://en.cppreference.com/w/cpp/language/fold) over the
  comma operator, but that technique is awkward to use and not widely known.

There have been a number of C++ standard proposals to address some of these
issues, and improve variadic templates in other ways, such as
[P1219R2: Homogeneous variadic function parameters](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1219r2.html),
[P1306R1: Expansion Statements](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1306r1.pdf),
[P1858R2: Generalized Pack Declaration and Usage](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p1858r2.html),
and
[P2277R0: Packs Outside of Templates](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p2277r0.html).
However, C++ has chosen not to pursue definition-checking even for non-variadic
functions, so definition-checked variadics seem out of reach. The most recent
proposal to support fixed-type parameter packs was
[rejected](https://github.com/cplusplus/papers/issues/297). A proposal to
support iterating over parameter packs was inactive for several years, but has
very recently been [revived](https://github.com/cplusplus/papers/issues/156).
+ +Swift supports +[variadic parameters](https://docs.swift.org/swift-book/documentation/the-swift-programming-language/functions/#Variadic-Parameters) +so long as all elements have the same type, and has recently approved +[SE-0393: Value and Type Parameter Packs](https://github.com/apple/swift-evolution/blob/main/proposals/0393-parameter-packs.md), +which adds support for definition-checked, heterogeneous variadic parameters +(with a disjoint syntax). +[SE-0404: Pack Iteration](https://github.com/simanerush/swift-evolution/blob/se-0404-pack-iteration/proposals/0404-pack-iteration.md), +which extends that to support iterating through a variadic parameter list, has +been positively received, but hasn't yet been approved. + +There have been several attempts to add such a feature to Rust, but that work is +[currently inactive](https://github.com/rust-lang/rfcs/issues/376#issuecomment-830034029). + +## Proposal + +See `/docs/design/variadics.md` in this pull request. + +### Examples + +```carbon +// Computes the sum of its arguments, which are i64s +fn SumInts(... each param: i64) -> i64 { + var sum: i64 = 0; + ... sum += each param; + return sum; +} +``` + +```carbon +// Concatenates its arguments, which are all convertible to String +fn StrCat[... each T:! ConvertibleToString](... each param: each T) -> String { + var len: i64 = 0; + ... len += each param.Length(); + var result: String = ""; + result.Reserve(len); + ... result.Append(each param.ToString()); + return result; +} +``` + +```carbon +// Returns the minimum of its arguments, which must all have the same type T. +fn Min[T:! Comparable & Value](var result: T, ... each next: T) -> T { + ... if (each next < result) { + result = each next; + } + return result; +} +``` + +```carbon +// Invokes f, with the tuple `args` as its arguments. +fn Apply[... each T:! type, F:! CallableWith(... each T)] + (f: F, args: (... 
each T)) -> auto {
  return f(...expand args);
}
```

```carbon
// Takes an arbitrary number of vectors with arbitrary element types, and
// returns a vector of tuples where the i'th element of the vector is
// a tuple of the i'th elements of the input vectors.
fn Zip[... each ElementType:! type]
    (... each vector: Vector(each ElementType))
    -> Vector((... each ElementType)) {
  ... var each iter: auto = each vector.Begin();
  var result: Vector((... each ElementType));
  while (...and each iter != each vector.End()) {
    result.push_back((... each iter));
    ... each iter++;
  }
  return result;
}
```

```carbon
// Toy example of mixing variadic and non-variadic parameters.
// Takes an i64, any number of f64s, and then another i64.
fn MiddleVariadic(first: i64, ... each middle: f64, last: i64);
```

```carbon
// Toy example of using the result of variadic type deduction.
fn TupleConcat[... each T1:! type, ... each T2:! type](
    t1: (... each T1), t2: (... each T2)) -> (... each T1, ... each T2) {
  return (...expand t1, ...expand t2);
}
```

### Comparisons

The following table compares selected examples of Carbon variadics against
equivalent code written in C++20 (with and without the extensions discussed
[earlier](#background)) and Swift.
Within each comparison below, the samples appear in this order: Carbon, C++20,
C++20 with extensions, Swift with extensions.
```carbon
// Computes the sum of its arguments, which are i64s
fn SumInts(... each param: i64) -> i64 {
  var sum: i64 = 0;
  ... sum += each param;
  return sum;
}
```

```cpp
template <typename... Params>
  requires (std::convertible_to<Params, int64_t> && ...)
int64_t SumInts(const Params&... params) {
  return (static_cast<int64_t>(params) + ... + 0);
}
```

With P1219R2:

```cpp
int64_t SumInts(int64_t... params) {
  return (static_cast<int64_t>(params) + ... + 0);
}
```

(No extensions)

```swift
func SumInts(_ params: Int64...) -> Int64 {
  var sum: Int64 = 0
  for param in params {
    sum += param
  }
  return sum
}
```
```carbon
fn Min[T:! Comparable & Value](first: T, ... each next: T) -> T {
  var result: T = first;
  ... if (each next < result) {
    result = each next;
  }
  return result;
}
```

```cpp
template <typename T, typename... Params>
  requires std::totally_ordered<T> && std::copyable<T> &&
      (std::same_as<T, Params> && ...)
T Min(T first, Params... rest) {
  if constexpr (sizeof...(rest) == 0) {
    // Base case.
    return first;
  } else {
    T min_rest = Min(rest...);
    if (min_rest < first) {
      return min_rest;
    } else {
      return first;
    }
  }
}
```

With P1219R2 and P1306R2

```cpp
template <typename T>
  requires std::totally_ordered<T> && std::copyable<T>
T Min(const T& first, const T&... rest) {
  T result = first;
  template for (const T& t: rest) {
    if (t < result) {
      result = t;
    }
  }
  return result;
}
```

(No extensions)

```swift
func Min<T: Comparable>(_ first: T, _ rest: T...) -> T {
  var result: T = first;
  for t in rest {
    if (t < result) {
      result = t
    }
  }
  return result
}
```
```carbon
fn StrCat[... each T:! ConvertibleToString](... each param: each T) -> String {
  var len: i64 = 0;
  ... len += each param.Length();
  var result: String = "";
  result.Reserve(len);
  ... result.Append(each param.ToString());
  return result;
}
```

```cpp
template <typename... Ts>
std::string StrCat(const Ts&... params) {
  std::string result;
  result.reserve((params.Length() + ... + 0));
  (result.append(params.ToString()), ...);
  return result;
}
```

With P1306R2

```cpp
template <typename... Ts>
std::string StrCat(const Ts&... params) {
  std::string result;
  result.reserve((params.Length() + ... + 0));
  template for (auto param: params) {
    result.append(param.ToString());
  }
  return result;
}
```

With SE-0393 and SE-0404

```swift
func StrCat<each T: ConvertibleToString>(_ param: repeat each T) -> String {
  var len: Int64 = 0
  for param in repeat each param {
    len += param.Length()
  }
  var result: String = ""
  result.reserveCapacity(len)
  for param in repeat each param {
    result.append(param.ToString())
  }
  return result
}
```
## Rationale

Carbon needs variadics to effectively support
[interoperation with and migration from C++](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code),
where variadic templates are fairly common. Variadics also make code
[easier to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write),
because some APIs (such as `printf`) can't be naturally expressed in terms of a
fixed number of parameters.

Furthermore, Carbon needs to support _generic_ variadics for the same reasons it
needs to support generic non-variadic functions: for example,
definition-checking makes APIs
[easier to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write),
and [easier to evolve](/docs/project/goals.md#software-and-language-evolution).
More broadly, the language as a whole is easier to understand and write code in
if separate features like variadics and generics compose in natural ways, rather
than being mutually exclusive.

Variadics are also important for supporting
[performance-critical software](/docs/project/goals.md#performance-critical-software),
because variadic APIs can be more efficient than their non-variadic
counterparts. For example, `StrCat` is fundamentally more efficient than
something like a chain of `operator+` calls on `std::string`, because it does
not need to materialize a series of partial results, and it can pre-allocate a
buffer large enough for the final result.

Variadics are also needed to support the principle that
[all APIs are library APIs](/docs/project/principles/library_apis_only.md),
because the library representations of types such as tuples and callables will
need to be variadic.
This proposal may appear to deviate from that principle in +some ways, but that appearance is misleading: + +- The design of pack expansion expressions treats the tuple literal syntax as + built-in, but this isn't a problem because literal syntaxes are explicitly + excluded from the principle. +- The design of pack expansion patterns treats tuple types as built-in. This + is arguably consistent with the principle, if we regard a tuple pattern as a + kind of tuple literal (note that they have identical syntax). This proposal + also revises the text of the principle to make that more explicit. +- Pack types themselves are built-in types, with no library API. However, the + principle only applies to first-class types, and pack types are decidedly + not first-class: they cannot be function return types, they cannot even be + named, and an expression cannot evaluate to a value with a pack type unless + it's within a pack expansion _and_ it has compile-time expression phase (and + even that narrow exception only exists to make the formalism more + convenient). + +## Alternatives considered + +### Member packs + +We could potentially support declaring each-names as class members. However, +this raises some novel design issues. In particular, pack bindings currently +rely exclusively on type deduction for information like the arity of the pack, +but for class members, there usually isn't an initializer available to drive +type deduction. + +In addition, it's usually if not always possible to work around the lack of +member packs by using members with tuple or array types instead. Consequently, +this feature is deferred to future work. 
+ +### Single semantic model for pack expansions + +There's a subtle discrepancy in how this proposal models expression pack +expansions: at run time, all pack expansions are modeled as procedural loops +that successively evaluate the expansion body for each element of the input +pack, and within each iteration, expressions have scalar values. However, in the +type system, expressions within a pack expansion are notionally evaluated once, +producing a pack value. In effect, this treats pack expansions like SIMD code, +with expressions operating on "vectors" of data in parallel, rather than +iteratively executing the code on a series of scalar values. + +This discrepancy leads to an impedance mismatch where the two models meet. In +particular, it leads to the result that expressions within a pack expansion have +pack types, but do not evaluate to pack values. This contravenes one of the +basic expectations of a type system, that the type of an expression equals (or +is at least a supertype of) the type of its value. + +It's tempting to resolve the inconsistency by applying the parallel model at run +time as well as in the type system. However, that isn't feasible, because the +parallel model has the same limitation for variadics as it does for SIMD: it +can't model branching control flow. For example, consider +`(... if (expand cond) then F(expand param) else G(expand param))`: if +`expand param` truly evaluated to a pack value, then evaluating this expression +would require N calls to _both_ `F` and `G`, rather than N calls to _either_ `F` +or `G`. Even for expressions that don't contain control flow, the same problem +applies when they occur within a statement pack expansion that does. We can't +even statically detect these problems, because a branch could be hidden inside a +function call. And this isn't just a performance problem -- if `F` or `G` have +side effects, it can also be a correctness problem. 
+ +An earlier version of this proposal tried to address this problem in a more +limited way by saying that expressions within a pack expansion don't have types +at all, but instead have "type packs". This shift in terminology nominally +avoids the problem of having expressions that don't evaluate to a value of the +expression's type, but it doesn't seem to be very clarifying in practice, and it +doesn't address the substance of the problem. + +### Generalize `expand` + +The syntax "`...expand` _expression_" behaves like syntactic sugar for +`... each x`, where `x` is an invented pack binding in the same scope, defined +as if by "`let (... each x: auto) =` _expression_". We could generalize that by +saying that `expand` is a prefix operator with the same precedence as `*` that +can be used anywhere in a pack expansion, where "`expand` _expression_" is +syntactic sugar for `each x` (with `x` defined as before, in the scope +containing the pack expansion). This would make `expand` more useful, and also +resolve the anomaly where `...expand` is the only syntax that begins with `...` +but is not a pack expansion. It is also a precondition for several of the +alternatives discussed below. + +However, those semantics could be very surprising in practice. For example: + +```carbon +...if (Condition()) { + var x: auto = expand F(y); +} +``` + +In this code, `F(y)` is evaluated before the pack expansion is entered, which +means that it is evaluated unconditionally, and it cannot refer to names +declared inside the `if` block. + +We can avoid the name-resolution issue by disallowing `expand` in statement pack +expansions, but the sequencing of evaluation could still be surprising, +particularly with `if` expressions. + +### Omit `expand` + +As noted above, `...expand` is fundamentally syntactic sugar, so we could omit +it altogether. This would somewhat simplify the design, and avoid the anomaly of +having one syntax that starts with `...` but isn't a pack expansion. 
However, +that would make it substantially less ergonomic to do things like expand a tuple +into an argument list, which we expect to be relatively common. + +### Support expanding arrays + +Statically-sized arrays are very close to being a special case of tuple types: +the only difference between an array type `[i32; 2]` (using Rust syntax) and a +tuple type `(i32, i32)` is that the array type can be indexed with a run-time +subscript. Consequently, it would be fairly natural to allow `expand` to operate +on arrays as well as tuples, and even to allow arrays of types to be treated as +tuple types (in the same way that tuples of types can be treated as tuple +types). + +This functionality is omitted from the current proposal because we have no +motivating use cases, but it could be added as an extension. Note that there are +important motivating use cases under some of the alternatives considered below. + +### Omit each-names + +Rather than having packs be distinguished by their names, we could instead +distinguish them by their types. For example, under the current proposal, the +signature of `Zip` is: + +```carbon +fn Zip[... each ElementType:! type] + (... each vector: Vector(each ElementType)) + -> Vector((... each ElementType)); +``` + +With this alternative, it could instead be written: + +```carbon +fn Zip[ElementTypes:! [type;]] + (... vectors: Vector(expand ElementTypes)) + -> Vector((... expand ElementTypes)); +``` + +This employs several features not in the primary proposal: + +- In cases where the declared type of the each-name does not vary across + iterations (like `ElementType`), we can re-express it as an array binding if + [`expand` supports arrays](#support-expanding-arrays), and if + [`expand` is a stand-alone operator](#generalize-expand). Note that we only + need this in type position of a binding pattern, where we could more easily + restrict `expand` to avoid the problems discussed earlier. 
+- In cases where the declared type of the binding does vary, that fact alone + implies that the binding refers to a pack, so we can effectively infer the + presence of `each` from the type, rather than make the user spell it out + explicitly. + +This slight change in syntax belies a much larger shift in the underlying +semantics: since these are ordinary bindings, a given call to `Zip` must bind +each of them to a single value that represents the whole sequence of arguments +(which is why their names are now plural). In the case of `ElementTypes`, that +follows straightforwardly from its type: it represents the argument types as an +array of `type`s. The situation with `vectors` is more subtle: we have to +interpret `Vector(expand ElementTypes)` as the type of the whole sequence of +argument values, rather than as a generic description of the type of a single +argument. In other words, we have to interpret it as a pack type, and that means +`vectors` notionally binds to a run-time pack value. + +Consequently, when `vectors` is used in the function body, it doesn't need an +`each` prefix: we've chosen to express variadicity in terms of types, and it +already has a pack type, so it can be directly used as an expansion site. + +This approach has a few advantages: + +- We don't have to introduce the potentially-confusing concept of a binding + that binds to multiple values simultaneously. +- It avoids the anomaly where we have pack types in the type system, but no + actual values of those types. +- Removing the `each` keyword makes it more natural to spell `expand` as a + symbolic token (earlier versions of this proposal used `[:]`), which is more + concise and doesn't need surrounding whitespace. +- For fully homogeneous variadics (such as `SumInts` and `Min`) it's actually + possible to write the function body as an ordinary loop with no variadics, + by expressing the signature in terms of a non-pack binding with an array + type. 
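
As a sketch of the last point, a fully homogeneous function could be written
with an ordinary loop body. The spelling below is hypothetical, and assumes
`expand` is generalized and supports arrays as discussed in the previous
sections:

```carbon
// Hypothetical: `values` is a single ordinary binding with a
// statically-sized array type, so the body needs no variadics.
fn SumInts(... expand values: [i32;]) -> i32 {
  var sum: i32 = 0;
  for (n: i32 in values) {
    sum += n;
  }
  return sum;
}
```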
+ +However, it also has some major disadvantages: + +- The implicit expansion of pack-type bindings hurts readability. For example, + it's easy to overlook the fact that the loop condition + `while (...and expand iters != vectors.End())` in `Zip` has two expansion + sites, not just one. This problem is especially acute in cases where a + non-local name has a pack type. +- We have to forbid template-dependent names from having pack types (see + [leads issue #1162](https://github.com/carbon-language/carbon-lang/issues/1162)), + because the possibility that an expression might be an expansion site in + some instantiations but not others would cause serious readability and + implementability issues. +- A given _use_ of such a binding really represents a single value at a time, + in the same way that the iteration variable of a for-each loop does, so + giving the binding a plural name and a pack type creates confusion in that + context rather than alleviating it. + +It's also worth noting that we may eventually want to introduce operations that +treat the sequence of bound values as a unit, such as to determine the length of +the sequence (like `sizeof...` in C++), or even to index into it. This approach +might seem more amenable to that, because it conceptually treats the sequence of +values as a value in itself, which could have its own operations. However, this +approach leaves no "room" in the syntax to spell those operations, because any +mention of a pack-type binding implicitly refers to one of its elements. + +Conversely, the status quo proposal seems to leave a clear syntactic opening for +those operations: you can refer to the sequence as a whole by omitting `each`, +so `each vector.Size()` refers to the size of the current iteration's `vector`, +whereas `vector.Size()` could refer to the size of the sequence of bound values. 
+However, this could easily turn out to be a "wrong default": omitting `each` +seems easy to do by accident, and easy to misread during code review. + +There are other solutions to this problem that work equally well with the status +quo or this alternative. In particular, it's already possible to express these +operations outside of a pack expansion by converting to a tuple, as in +`(... each vector).Size()` (status quo) or `(... vectors).Size()` (this +alternative). That may be sufficient to address those use cases, especially if +we relax the restrictions on nesting pack expansions. Failing that, +variadic-only spellings for these operations (like `sizeof...` in C++) would +also work with both approaches. So this issue does not seem like an important +differentiator between the two approaches. + +#### Disallow pack-type bindings + +As a variant of the above approach, it's possible to omit both each-names and +pack-type bindings, and instead rely on variadic tuple-type bindings. For +example, the signature of `Zip` could instead be: + +```carbon +fn Zip[ElementTypes:! [type;]] + (... expand vectors: (... Vector(expand ElementTypes))) + -> Vector((... expand ElementTypes)); +``` + +This signature doesn't change the callsite semantics, but within the function +body `vectors` will be a tuple rather than a pack. This avoids or mitigates all +of the major disadvantages of pack-type bindings, but it comes at a substantial +cost: the function signature is substantially more complex and opaque. That +seems likely to be a bad tradeoff -- the disadvantages of pack-type bindings +mostly concern the function body, but readability of variadic function +signatures seems much more important than readability of variadic function +bodies, because the signatures will be read far more often, and by programmers +who have less familiarity with variadics. + +This approach requires us to relax the ban on nested pack expansions. 
This does +create some risk of confusion about which pack expansion a given `expand` +belongs to, but probably much less than if we allowed unrestricted nesting. + +The leads chose not to pursue this approach in +[leads issue #1162](https://github.com/carbon-language/carbon-lang/issues/1162). + +### Fold expressions + +We could generalize the `...and` and `...or` syntax to support a wider variety +of binary operators, and to permit specifying an initial value for the chain of +binary operators, as with C++'s +[fold expressions](https://en.cppreference.com/w/cpp/language/fold). This would +be more consistent with C++, and would give users more control over +associativity and over the behavior of the arity-zero case. + +However, fold expressions are arguably too general in some respects: folding +over a non-commutative operator like `-` is more likely to be confusing than to +be useful. Similarly, there are few if any plausible use cases for customizing +the arity-zero behavior of `and` or `or`. Conversely, fold expressions are +arguably not general enough in other respects, because they only support folding +over a fixed set of operators, not over functions or compound expressions. + +Furthermore, in order to support folds over operator tokens that can be either +binary or prefix-unary (such as `*`), we would need to choose a different syntax +for tuple element lists. Otherwise, `...*each foo` would be ambiguous between +`*foo[:0:], *foo[:1:],` etc. and `foo[:0:] * foo[:1:] *` etc. + +Note that even if Carbon supported more general C++-like fold expressions, we +would still probably have to give `and` and `or` special-case treatment, because +they are short-circuiting. + +As a point of comparison, C++ fold expressions give special-case treatment to +the same two operators, along with `,`: they are the only ones where the initial +value can be omitted (such as `... && args` rather than `true && ... && args`) +even if the pack may be empty. 
Furthermore, folding over `&&` appears to have +been the original motivation for adding fold expressions to C++; it's not clear +if there are important motivating use cases for the other operators. + +Given that we are only supporting a minimal set of operators, allowing `...` to +occur in ordinary binary syntax has few advantages and several drawbacks: + +- It might conflict with a future general fold facility. +- It would invite users to try other operators, and would probably give less + clear errors if they do. +- It would substantially complicate parsing and the AST. +- It would force users to make a meaningless choice between `x or ...` and + `... or x`, and likewise for `and`. + +See also the discussion [below](#fold-like-syntax) of using `...,` and `...;` in +place of the tuple and statement forms of `...`. This is inspired by fold +expressions, but distinct from them, because `,` and `;` are not truly binary +operators, and it's targeting a different problem. + +### Allow multiple pack expansions in a tuple pattern + +As currently proposed, we allow multiple `...` expressions within a tuple +literal expression, but only allow one `...` pattern within a tuple pattern. It +is superficially tempting to relax this restriction, but fundamentally +infeasible. + +Allowing multiple `...` patterns would create a potential for ambiguity about +where their scrutinees begin and end. For example, given a signature like +`fn F(... each xs: i32, ... each ys: i32)`, there is no way to tell where `xs` +ends and `ys` begins in the argument list; every choice is equally valid. That +ambiguity can be avoided if the types are different, but that would make type +_non_-equality a load-bearing part of the pattern. 
That's a very unusual thing
to need to reason about in the type system, so it's liable to be a source of
surprise and confusion for programmers, and in particular it looks difficult if
not impossible to usefully express with generic types, which would greatly limit
the usefulness of such a feature.

Function authors can straightforwardly work around this restriction by adding
delimiters. For example, the current design disallows
`fn F(... each xs: i32, ... each ys: i32)`, but it allows
`fn F((... each xs: i32), (... each ys: i32))`, which is not only easier to
support, but makes the callsite safer and more readable, since the boundary
between the `xs` and `ys` arguments is explicitly marked. By contrast, if we
disallowed multiple `...` expressions in a function argument list, function
callers who ran into that restriction would often find it difficult or
impossible to work around. Note, however, that this workaround presupposes that
function signatures can have bindings below top-level, which is
[currently undecided](https://github.com/carbon-language/carbon-lang/issues/1229).

To take a more abstract view of this situation: when we reuse expression syntax
as pattern syntax, we are effectively inverting expression evaluation, by asking
the language to find the operands that would cause an expression to evaluate to
a given value. That's only possible if the operations involved are invertible,
meaning that they do not lose information. When a tuple literal contains
multiple `...` expressions, evaluating it effectively discards structural
information, such as where `xs` ends and `ys` begins. The operation of
forming a tuple from multiple packs is not invertible, and consequently we
cannot use it as a pattern operation. Our rule effectively says that if the
function needs that structural information, it must ask the caller to provide
it, rather than asking the compiler to infer it.
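
To illustrate, here is a hypothetical callsite for the delimited form of `F`
described above:

```carbon
fn F((... each xs: i32), (... each ys: i32));

fn G() {
  // The caller marks where `xs` ends and `ys` begins by passing two
  // explicit tuples, so the compiler doesn't need to infer the boundary.
  F((1, 2, 3), (4, 5));
}
```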
+ +### Allow nested pack expansions + +Earlier versions of this design allowed pack expansions to contain other pack +expansions. This is in some ways a natural generalization, but it added +nontrivial complexity to the design. In particular, when an each-name is +lexically within two or more pack expansions, we need a rule for determining +which pack expansion iterates over it, in a way that is unsurprising and +supports the intended use cases. However, we have few if any motivating use +cases for it, which made it difficult to evaluate that aspect of the design. +Consequently, this proposal does not support nested pack expansions, although it +tries to avoid ruling them out as a future extension. + +### Use postfix instead of prefix `...` + +`...` is a postfix operator in C++, which aligns with the natural-language use +of "…", so it would be more consistent with both if `...`, `...and`, and `...or` +were postfix operators spelled `...`, `and...`, and `or...`, and likewise if +statement pack expansions were marked by a `...` at the end rather than the +beginning. + +However, prefix syntaxes are usually easier to parse (particularly for humans), +because they ensure that by the time you start parsing an utterance, you already +know the context in which it is used. This is clearest in the case of +statements: the reader might have to read an arbitrary amount of code in the +block before realizing that the code they've been reading will be executed +variadically, so that seems out of the question. The cases of `and`, `or`, and +`,` are less clear-cut, but we have chosen to make them all prefix operators for +consistency with statements. 
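
For instance, the statement form under the two conventions might compare as
follows (the postfix spelling is hypothetical):

```carbon
// Prefix (this proposal): the reader knows up front that the statement
// executes once per pack element.
... each iter++;

// Hypothetical postfix: the reader only learns that at the end, which
// scales poorly to multi-line statements.
each iter++ ...;
```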

### Avoid context-sensitivity in pack expansions

This proposal "overloads" the `...` token with multiple different meanings
(including different precedences), and the meaning depends in part on the
surrounding context, despite Carbon's principle of
[avoiding context-sensitivity](/docs/project/principles/low_context_sensitivity.md).
We could instead represent the different meanings using separate syntaxes.

There are several variants of this approach, but they all have substantial
drawbacks (see the following subsections). Furthermore, the problems associated
with context-sensitivity appear to be fairly mild in this case: the difference
between a tuple literal context and a statement context is usually quite local,
and is usually so fundamental that confusion seems unlikely.

#### Fold-like syntax

We could use a modifier after `...` to select the expansion's meaning (as we
already do with `and` and `or`). In particular, we could write `...,` to
iteratively form elements of a tuple, and write `...;` to iteratively execute a
statement. This avoids context-sensitivity (apart from `...,` having a dual role
in expressions and patterns, like many other syntaxes), and has an underlying
unity: `...,`, `...;`, `...and`, and `...or` represent "folds" over the `,`,
`;`, `and`, and `or` tokens, respectively. As a side benefit, this would
preserve the property that a tuple literal always contains a `,` character
(unlike the current proposal).

However, this approach has major readability problems. Using `...;` as a prefix
operator is completely at odds with the fact that `;` marks the end of a
statement, not the beginning. Furthermore, it would probably be surprising to
use `...;` in contexts where `;` is not needed, because the end of the statement
is marked with `}`.

The problems with `...,` are less severe, but still substantial.
In this syntax
`,` does not behave like a separator, but our eyes are trained to read it as
one, and that habit is difficult to unlearn. For example, most readers have
found that they can't help automatically reading `(..., each x)` as having two
sub-expressions, `...` and `each x`. This effect is particularly disruptive when
skimming a larger body of code, such as:

```carbon
fn TupleConcat[..., each T1:! type, ..., each T2:! type](
    t1: (..., each T1), t2: (..., each T2)) -> (..., each T1, ..., each T2) {
  return (..., expand t1, ..., expand t2);
}
```

#### Variadic blocks

We could replace the statement form of `...` with a variadic block syntax such
as `...{ }`. However, this doesn't give us an alternative for the tuple form of
`...`, and yet heightens the problems with it: `...{` could read as applying
the `...` operator to a struct literal.

Furthermore, it gives us no way to variadically declare a variable that's
visible outside the expansion (such as `each iter` in the `Zip` example). This
can be worked around by declaring those variables as tuples, but this adds
unnecessary complexity to the code.

#### Keyword syntax

We could drop `...` altogether, and use a separate keyword for each kind of pack
expansion. For example, we could use `repeat` for variadic lists of tuple
elements, `do_repeat` for variadic statements, and `all_of` and `any_of` in
place of `...and` and `...or`. This leads to code like:

```carbon
// Takes an arbitrary number of vectors with arbitrary element types, and
// returns a vector of tuples where the i'th element of the vector is
// a tuple of the i'th elements of the input vectors.
fn Zip[repeat each ElementType:! type]
    (repeat each vector: Vector(each ElementType))
    -> Vector((repeat each ElementType)) {
  do_repeat var each iter: auto = each vector.Begin();
  var result: Vector((repeat each ElementType));
  while (all_of each iter != each vector.End()) {
    result.push_back((repeat each iter));
    repeat each iter++;
  }
  return result;
}
```

This approach is heavily influenced by
[Swift variadics](https://github.com/swiftlang/swift-evolution/blob/main/proposals/0393-parameter-packs.md),
but not quite the same. It has some major advantages: the keywords are more
consistent with `each` (and `expand` to some extent), substantially less
visually noisy than `...`, and they may also be more self-explanatory. However,
it does have some substantial drawbacks.

Most notably, there is no longer any syntactic commonality between the different
tokens that mark the root of an expansion. That makes it harder to visually
identify expansions, and could also make variadics harder to learn, because the
spelling does not act as a mnemonic cue. And while it's already not ideal that
under the primary proposal a tuple literal is identified by the presence of
either `,` or `...`, it seems even worse if one of those two tokens is instead a
keyword.

Relatedly, the keywords have less clear precedence relationships, because
`all_of` and `any_of` can't as easily "borrow" their precedence from their
non-variadic counterparts.
For example, consider this line from `Zip`:

```carbon
while (...and each iter != each vector.End()) {
```

Under this alternative, that becomes:

```carbon
while (all_of each iter != each vector.End()) {
```

I find the precedence relationships in the initial `all_of each iter !=` more
opaque than in `...and each iter !=`, to the extent that we might need to
require additional parentheses:

```carbon
  while (all_of (each iter != each vector.End())) {
```

That avoids outright ambiguity, but obliging readers to maintain a mental stack
of parentheses in order to parse the expression creates its own readability
problems.

It's appealing that the `repeat` keyword combines with `each` to produce code
that's almost readable as English, but it creates a temptation to read `expand`
the same way, which will usually be misleading. For example, `repeat expand foo`
sounds like it is repeatedly expanding `foo`, but in fact it expands it only
once. It's possible that a different spelling of `expand` could avoid that
problem, but I haven't been able to find one that does so while also avoiding
the potential for confusion with `each`. This is somewhat mitigated by the fact
that `expand` expressions are likely to be rare.

It's somewhat awkward, and potentially even confusing, to use an imperative word
like `repeat` in a pattern context. By design, the pattern language is
descriptive rather than imperative: it describes the values that match rather
than giving instructions for how to match them. As a result, in a pattern like
`(repeat each param: i64)`, it's not clear what action is being repeated.

Finally, it bears mentioning that the keywords occupy lexical space that could
otherwise be used for identifiers. Notably, `all_of`, `any_of`, and `repeat` are
all names of functions in the C++ standard library.
This is not a fundamental +problem, because we expect Carbon to have some way of "quoting" a keyword for +use as an identifier (such as Rust's +[raw identifiers](https://doc.rust-lang.org/rust-by-example/compatibility/raw_identifiers.html)), +but it is likely to be a source of friction. + +### Require parentheses around `each` + +We could give `each` a lower precedence, so that expressions such as +`each vector.End()` would need to be written as `(each vector).End()`. This +could make the code clearer for readers, especially if they are new to Carbon +variadics. However, this would make the code visually busier, and might give the +misleading impression that `each` can be applied to anything other than an +identifier. I propose that we wait and see whether the unparenthesized syntax +has readability problems in practice, before attempting to solve those problems. + +We have discussed a more general solution to this kind of problem, where a +prefix operator could be embedded in a `->` token, in order to apply the prefix +operator to the left-hand operand without needing parentheses. However, this +approach is much more appealing when the prefix operator is a symbolic token: +`x-?>y` may be a plausible alternative to `(?x).y`, but `x-each>y` seems much +harder to visually parse. Furthermore, this approach is hard to reconcile with +treating `each` as fundamentally part of the name, rather than an operator +applied to the name. + +### Fused expansion tokens + +Instead of treating `...and` and `...or` as two tokens with whitespace +discouraged between them, we could treat them as single tokens. This might more +accurately reflect the fact that they are semantically different operations than +`...`, and reduce the potential for readability problems in code that doesn't +follow our recommended whitespace conventions. However, that could lead to a +worse user experience if users accidentally insert a space after the `...`. 
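
For instance, with the two-token treatment both spellings below denote the same
`and` fold over a hypothetical pack `each flag`, although the second is
discouraged style. If `...and` were a single fused token, the second spelling
would presumably become a parse error instead:

```carbon
var a: bool = (...and each flag);
var b: bool = (... and each flag);
```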
+ +### No parameter merging + +Under the current proposal, the compiler attempts to merge function parameters +in order to support use cases like this one, where merging the parameters of +`Min` enables us to pair each argument with a single logical parameter that will +match it: + +```carbon +fn Min[T:! type](first: T, ... each next: T) -> T; + +fn F(... each arg: i32) { + Min(... each arg, 0 as i32); +} +``` + +However, this approach makes typechecking hard to understand (and predict), +because the complex conditions governing merging mean that subtle differences in +the code can cause dramatic differences in the semantics. For example: + +```carbon +fn F[A:! I, ... each B:! I](a: A, ... each b: each B); +fn G[A:! I, ... each B:! I](a: A, ... each b: each B) -> A; +``` + +These two function signatures are identical other than their return types, but +they actually have different requirements on their arguments: `G` requires the +first argument to be singular, whereas `F` only requires _some_ argument to be +singular. It seems likely to be hard to teach programmers that the function's +return type sometimes affects whether a given argument list is valid. Relatedly, +it's hard to see how a diagnostic could concisely explain why a given call to +`G` is invalid, in a way that doesn't seem to also apply to `F`. + +We could solve that problem by omitting parameter merging, and interpreting all +of the above signatures as requiring that the first argument must be singular, +because the first parameter is singular. Thus, there would be a clear and +predictable connection between the parameter list and the requirements on the +argument list. + +In order to support use cases like `Min` where the author doesn't intend to +impose such a requirement, we would need to provide some syntax for declaring +`Min` so that it has a single parameter, but can't be called with no arguments. 

More generally, this syntax would probably need to support setting an arbitrary
minimum number of arguments, not just 1. For example, an earlier version of this
proposal used `each(>=N)` to require that a parameter match at least N
arguments, so `Min` could be written like this:

```carbon
fn Min[T:! type](... each(>=1) param: T) -> T;
```

However, this alternative has several drawbacks:

- We haven't been able to find a satisfactory arity-constraint syntax. In
  addition to its aesthetic problems, `each(>=1) param` disrupts the mental
  model where `each` is part of the name, and it's conceptually awkward
  because the constraint actually applies to the pack expansion as a whole,
  not to the each-name in particular. However, it's even harder to find an
  arity-constraint syntax that could attach to `...` without creating
  ambiguity. Furthermore, any arity-constraint syntax would be an additional
  syntax that users need to learn, and an additional choice they need to make
  when writing a function signature.
- Ideally, generic code should typecheck if every possible monomorphization of
  it would typecheck. This alternative does not live up to that principle --
  see, for example, the above example of `Min`. The current design does not
  fully achieve that aspiration either, but it's far more difficult to find
  plausible examples where it fails.
- The first/rest style will probably be more natural to programmers coming
  from C++, and if they define APIs in that style, there isn't any plausible
  way for them to find out that they're imposing an unwanted constraint on
  callers, until someone actually tries to make a call with the wrong shape.

### Exhaustive function call typechecking

The current proposal uses merging and splitting to try to align the argument and
parameter lists so that each argument has exactly one parameter that can match
it.
We also plan to extend this design to try the opposite approach,
aligning them so that each parameter has exactly one argument that it can match.
However, it isn't always possible to align arguments and parameters in that way.
For example:

```carbon
fn F[... each T:! type](x: i32, ... each y: each T);

fn G(... each z: i32) {
  F(... each z, 0 as i16);
}
```

Every possible monomorphization of this code would typecheck, but we can't merge
the parameters because they have different types, and we can't merge the
arguments for the same reason. We also can't split the variadic parameter or the
variadic argument, because either of them could be empty.

The fundamental problem is that, although every possible monomorphization
typechecks, some monomorphizations are structurally different from others. For
example, if `each z` is empty, the monomorphized code converts `0 as i16` to
`i32`, but otherwise `0 as i16` is passed into `F` unmodified.

We could support such use cases by determining which parameters can potentially
match which arguments, and then typechecking each pair. For example, we could
typecheck the above code by cases:

- If `each z` is empty, `x: i32` matches `0 as i16` (which typechecks because
  `i16` is convertible to `i32`), and `each y: each T` matches nothing.
- If `each z` is not empty, `x: i32` matches its first element (which
  typechecks because `i32` is convertible to `i32`), and `each y: each T`
  matches the remaining elements of `each z`, followed by `0 as i16` (which
  typechecks by binding `each T` to `⟬«i32; ‖each z‖-1», i16⟭`).

More generally, this approach works by identifying all of the structurally
different ways that arguments could match parameters, typechecking them all in
parallel, and then combining the results with logical "and".

However, the number of such cases (and hence the cost of typechecking) grows
quadratically, because the number of cases grows with the number of parameters,
and the case analysis has to be repeated for each variadic argument.
[Fast development cycles](/docs/project/goals.md#fast-and-scalable-development)
are a priority for Carbon, so if at all possible we want to avoid situations
where compilation costs grow faster than linearly with the amount of code.

Furthermore, typechecking a function call doesn't merely need to output a
boolean decision about whether the code typechecks. In order to typecheck the
code that uses the call, and support subsequent phases of compilation, it needs
to also output the type of the call expression, and that can depend on the
values of deduced parameters of the function.

These more complex outputs make it much harder to combine the results of
typechecking the separate cases. To do this in a general way, we would need to
incorporate some form of case branching directly into the type system. For
example:

```carbon
fn P[T:! I, ... each U:! J](t: T, ... each u: each U) -> T;

fn Q[X:! I&J, ... each Y:! I&J](x: X, ... each y: each Y) -> auto {
  return P(... each y, x);
}

fn R[A:! I&J, ... each B:! I&J](a: A, ... each b: each B) {
  Q(... each b, a);
}
```

The typechecker would need to represent the type of `P(... each y, x)` as
something like `(... each Y, X).0`. That subscript `.0` acts as a disguised form
of case branching, because now any subsequent code that depends on
`P(... each y, x)` needs to be typechecked separately for the cases where
`... each Y` is and is not empty. In this case, that even leaks back into the
caller `R` through `Q`'s return type, which compounds the complexity: the type
of `Q(... each b, a)` would need to be something like
`((... each B, A).(1..‖each B‖), (... each B, A).0).0` (where `.(M..N)` is a
hypothetical tuple slice notation).
+ +All of this may be feasible, but the cost in type system complexity and +performance would be daunting, and the benefits are at best unclear, because we +have not yet found plausible motivating use cases that benefit from this kind of +typechecking.