From f6e65586aefeb390aaeb1e9c0c9b9448820da141 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Thu, 15 Sep 2022 14:05:23 -0700 Subject: [PATCH 01/27] Proposal: pattern matching syntax and semantics. --- proposals/p2188.md | 427 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 427 insertions(+) create mode 100644 proposals/p2188.md diff --git a/proposals/p2188.md b/proposals/p2188.md new file mode 100644 index 0000000000000..1143795803b1c --- /dev/null +++ b/proposals/p2188.md @@ -0,0 +1,427 @@ +# Pattern matching syntax and semantics + + + +[Pull request](https://github.com/carbon-language/carbon-lang/pull/2188) + + + +## Table of contents + +- [TODO: Initial proposal setup](#todo-initial-proposal-setup) +- [Abstract](#abstract) +- [Problem](#problem) +- [Background](#background) +- [Proposal](#proposal) +- [Details](#details) +- [Rationale](#rationale) +- [Alternatives considered](#alternatives-considered) + + + +## Abstract + +This paper proposes concrete syntax and semantic choices for Carbon patterns. + +## Problem + +Carbon uses patterns wherever a value should be given a name, decomposed, or +matched against: in function parameters, variable declarations, `match` +statement `case`s, `for` statement loop variables, and so on. Simplified forms +of patterns, required to be just a simple name binding, appear in additional +contexts, such as fields in classes and implicit parameter lists. While we have +syntax specified for some of these constructs, we do not have an approved +proposal describing the syntax or semantics of patterns. + +## Background + +See [pattern matching](https://en.wikipedia.org/wiki/Pattern_matching) on +wikipedia for a broad overview of the subject. + +We refer to the value being matched by a pattern as the _scrutinee_. + +## Proposal + +Patterns in Carbon are a generalization of the expression grammar. Compared to +expressions, patterns add: + +- Bindings, of the form `name: type`, which give a name for the scrutinee. 
+- `var` _pattern_, which requests mutable storage be provided for the + scrutinee. +- Additional syntax to make matching against structs more convenient. + +## Details + +### Expressions versus proper patterns + +A pattern that contains within it any pattern-specific syntax, such as a +binding, is a _proper pattern_. Any other pattern must necessarily be an +expression. Many expression forms, such as arbitrary function calls, are not +permitted as proper patterns, so cannot contain bindings. + +- _pattern_ ::= _proper-pattern_ + +``` +fn F(n: i32) -> i32 { return n; } + +match (F(42)) { + // ❌ Error: binding can't appear in a function call. + case (F(n: i32)) => +``` + +### Expression patterns + +An expression is a pattern. + +- _pattern_ ::= _expression_ + +The pattern is compared with the expression using the `==` operator: _pattern_ `==` _scrutinee_. + +``` +fn F(n: i32) { + match (n) { + // ✅ Results in an `n == 5` comparison. + // OK despite `n` and `5` having different types. + case 5 => +``` + +### Bindings + +A name binding is a pattern. + +- _binding-pattern_ ::= `unused`? _identifier_ `:` _expression_ +- _binding-pattern_ ::= `unused`? _identifier_ `:` _proper-pattern_ +- _proper-pattern_ ::= _binding-pattern_ + +In the first form, the type of the _identifier_ is specified by the +_expression_. In the second form, the _proper-pattern_ is matched against the +type of the scrutinee, and the result is the type of the _identifier_. In +either case, the scrutinee is implicitly converted to the type of the +_identifier_. + +``` +fn F(p: i32*) { + match (p) { + // ✅ Matches `(T:! Type)` against `i32*`. + // `T` is bound to `i32`, `m` is bound to `p`, and the type of `m` is `i32*`. + case m: (T:! Type)* => +``` + +The `unused` keyword indicates that the binding is intended to not be used. + +### Wildcard + +A syntax like a binding but with `_` in place of an identifier can be used to +ignore part of a value. 
+ +- _binding-pattern_ ::= `_` `:` _expression_ +- _binding-pattern_ ::= `_` `:` _proper-pattern_ + +The behavior is equivalent to that of an `unused` binding with a unique name. + +``` +fn F(n: i32) { + match (n) { + // ✅ Matches and discards the value of `n`. + case _: i32 => {} + // ❌ Error: unreachable. + default => {} + } +} +``` + +### Generic bindings + +A `:!` can be used in place of `:` for a binding that is usable at compile time. + +- _generic-pattern_ ::= `unused`? `template`? _identifier_ `:!` _expression_ +- _generic-pattern_ ::= `template`? `_` `:!` _expression_ +- _proper-pattern_ ::= _generic-pattern_ + +Note that, unlike in a _binding-pattern_, the type of a generic binding cannot +be a pattern. + +``` +// ✅ `F` takes a generic type parameter `T` and a parameter `x` of type `T`. +fn F(T:! Type, x: T) { + var v: T = x; +} +``` + +The `template` keyword indicates the binding is introducing a template +parameter, so name lookups into the parameter should be deferred until its +value is known. + +### `auto` + +The pattern `auto` is shorthand for `_:! Type`. + +- _proper-pattern_ ::= `auto` + +``` +fn F(n: i32) { + var v: auto = SomeComplicatedExpression(n); +} +``` + +### `var` + +A `var` prefix indicates that a pattern provides mutable storage for the +scrutinee. + +- _proper-pattern_ ::= `var` _proper-pattern_ + +A `var` pattern matches when its nested pattern matches. The type of the +storage is the resolved type of the nested _pattern_. Any bindings within the +nested pattern refer to portions of the corresponding storage rather than to +the scrutinee, somewhat as if the mutable storage is first initialized from the +scrutinee and then the nested pattern is matched against the mutable storage, +but the variable is not actually initialized unless the complete pattern +matches. + +``` +fn F(p: i32*); +fn G() { + match ((1, 2)) { + // `n` is a mutable `i32`. 
+ case (var n: i32, 1) => { F(&n); } + // `n` and `m` are the elements of a mutable `(i32, i32)`. + case var (n: i32, m: i32) => { F(if n then &n else &m); } + } +} +``` + +A `var` pattern cannot be nested within another `var` pattern. The declaration +syntax `var` _pattern_ `=` _expression_ `;` is equivalent to `let` `var` +_pattern_ `=` _expression_ `;`. + +### Tuple patterns + +A tuple of patterns can be used as a pattern. + +- _tuple-pattern_ ::= `(` [_expression_ `,`]\* _proper-pattern_ [`,` _pattern_]\* `)` +- _proper-pattern_ ::= _tuple-pattern_ + +A tuple pattern is matched left-to-right. The scrutinee is required to be of +tuple type. + +Note that a tuple pattern must contain at least one _proper-pattern_. +Otherwise, it is a tuple-valued expression. However, a tuple pattern and a +corresponding tuple-valued expression are matched in the same way because `==` +for a tuple compares fields left-to-right. + +### Struct patterns + +A struct can be matched with a struct pattern. + +- _proper-pattern_ ::= `{` [_field-init_ `,`]\* _proper-field-pattern_ [`,` + _field-pattern_]\* [, `...`]? `}` +- _field-init_ ::= _designator_ `=` _expression_ +- _proper-field-pattern_ ::= _designator_ `=` _proper-pattern_ +- _proper-field-pattern_ ::= _binding-pattern_ +- _field-pattern_ ::= _field-init_ +- _field-pattern_ ::= _proper-field-pattern_ + +A struct pattern resembles a struct literal, with at least one field +initialized with a proper pattern: + +``` +match ({.a = 1, .b = 2}) { + case {.b = n: i32, .a = m: i32} => +``` + +The scrutinee is required to be of struct type, and to have the same set of +field names as the pattern. The pattern is matched left-to-right, meaning that +matching is performed in the field order specified in the pattern, not in the +field order of the scrutinee. This is consistent with the behavior of matching +against a tuple-valued expression, where the left operand determines the order +in which `==` comparisons are performed. 
+ +In the case where a field will be bound to an identifier with the same name, a +shorthand syntax is available: `a: T` is synonymous with `.a = a: T`. + +``` +match ({.a = 1, .b = 2}) { + case {a: i32, b: i32} => { return a + b; } +``` + +If some fields should be ignored when matching, a trailing `, ...` can be added +to specify this: + +``` +match ({.a = 1, .b = 2}) { + case {.a = 1, ...} => { return 1; } + case {b: i32, ...} => { return b; } +``` + +This is valid even if all fields are actually named in the pattern. + +### Choice patterns + +A choice pattern is used to match one alternative of a choice type. + +_proper-pattern_ ::= _callee-expression_ _tuple-pattern_ +_proper-pattern_ ::= _designator_ _tuple-pattern_? + +Here, _callee-expression_ is syntactically an expression that is valid as the +callee in a function call expression, and a choice pattern is syntactically a +function call expression whose argument list contains at least one +_proper-pattern_. + +If a _callee-expression_ is provided, it is required to name a choice type +alternative that has a parameter list, and the scrutinee is implicitly +converted to that choice type. Otherwise, the scrutinee is required to be of +some choice type, and the designator is looked up in that type and is required +to name an alternative with a parameter list if and only if a _tuple-pattern_ +is specified. + +The pattern matches if the active alternative in the scrutinee is the specified +alternative, and the arguments of the alternative match the given tuple +pattern (if any). + +``` +choice Optional(T:! Type) { + None, + Some(T) +} + +match (Optional(i32).None) { + // ✅ `.None` resolved to `Optional(i32).None`. + case .None => {} + // ✅ `.Some` resolved to `Optional(i32).Some`. + case .Some(n: i32) => { Print("{0}", n); } + // ❌ Error, no such alternative exists. + case .Other => {} +} + +class X { + external impl as ImplicitAs(Optional(i32)); +} + +match ({} as X) { + // ✅ OK, but expression pattern. 
+ case Optional(i32).None => {} + // ✅ OK, implicitly converts to `Optional(i32)`. + case Optional(i32).Some(n: i32) => { Print("{0}", n); } +} +``` + +Note that a pattern of the form `Optional(T).None` is an expression pattern and +is compared using `==`. + +### Templates + +If the type of the scrutinee of a pattern is either a template parameter or is +an associated type that depends on a template parameter, any checking of the +type of the scrutinee against the type of the pattern is deferred until the +template parameter's value is known. During instantiation, patterns that are +not meaningful due to a type error are instead treated as not matching. This +includes cases where an `==` fails because of a missing `EqWith` +implementation. + +``` +fn TypeName[template T:! Type](x: T) -> String { + match (x) { + // ✅ OK, the type of `x` is a template parameter. + case _: i32 => { return "int"; } + case _: bool => { return "bool"; } + case _: auto* => { return "pointer"; } + default => { return "unknown"; } + } +} +``` + +``` +fn NeverWorks[template T:! Type](triple: (T, T, T)) { + match (triple) { + // ❌ Error: type mismatch matching struct against (T, T, T) + case {.a: i32, .b: i32} => +} +``` + +## Rationale + +- [Software and language evolution](/docs/project/goals.md#software-and-language-evolution) + - The `, ...` syntax for struct patterns enables a style where adding a + struct member is not a breaking change. +- [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write) + - Pattern syntax makes it easier to match complex values. + - Modeling pattern syntax after expressions eases the burden of learning + a new sub-language for pattern-matching: patterns are an extension of + expressions, and expressions are a special case of patterns. 
+- [Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code) + - The rules for matching a templated value can be used to replace `if + constexpr` in many cases. + +## Alternatives considered + +### Struct pattern syntax + +We could omit the `, ...` syntax. This would simplify struct patterns, but at +the cost of removing a feature that can be useful for reducing verbosity and +making library evolution easier. + +We could always allow a struct pattern to match a struct with more fields, +without requiring a `, ...` suffix. This would aid evolution by reducing the +cases where adding a field to a struct can be a breaking change, but such cases +would still exist. Further, this would make matches against a struct-valued +expression inconsistent with matches against a struct pattern. + +We could remove the `{field: type}` shorthand and require `{.field = field: +type}`. This may avoid encouraging reusing the field name even when it is not +appropriate, and would remove some syntactic sugar that's not formally +necessary. However, we expect this to be a common case whose ergonomics are +important. + +We could use a different syntax for `{field: type}` that is less divergent from +other struct syntaxes: + +- `{.field: type}` seems like a contender, but doesn't work because that is + already recognized as a tuple type literal. Also, the use of normal binding + syntax means that every locally-introduced name is always introduced by + `name: type` where `name` is not preceded by `.`. +- `{.=field: type}` might be a reasonable mnemonic shorthand for `{.field = + field: type}`, but looks a little surprising. This is probably the best + choice if concerns are found with `{field: type}` syntax. 
+ +## Future work + +### Or patterns + +We could provide "or patterns", allowing matching of one pattern or another +with the same handler: + +``` +match (x) { + case (m: i32, 0) | (0, m: i32) => { return m; } +} +``` + +### Guards + +We could provide guards for patterns, allowing a pattern to match only if some +predicate involving the bindings holds: + +``` +match (x) { + case (m: i32, n: i32) if m + n < 5 => { return m - n; } +} +``` + +### User-defined pattern matching + +We could provide some mechanism for allowing a user-defined type to specify how +it can be matched by patterns. + +### Matching classes with struct patterns + +We could allow a class to be matched by a struct pattern that matches its +fields. This would make sense especially for data classes, and would be +consistent with the behavior of `==` in the case where a struct is implicitly +convertible to the class type. However, without a design for user-defined +pattern matching, there is a significant risk that this would conflict with the +rules there, so it is deferred for now. From 3040dcf7235896cad85e72318120d6e6eb8900c2 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Thu, 15 Sep 2022 14:31:38 -0700 Subject: [PATCH 02/27] Pre-commit. 
--- proposals/p2188.md | 105 ++++++++++++++++++++++++++------------------- 1 file changed, 60 insertions(+), 45 deletions(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 1143795803b1c..2009eededde77 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -12,14 +12,30 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception ## Table of contents -- [TODO: Initial proposal setup](#todo-initial-proposal-setup) - [Abstract](#abstract) - [Problem](#problem) - [Background](#background) - [Proposal](#proposal) - [Details](#details) + - [Expressions versus proper patterns](#expressions-versus-proper-patterns) + - [Expression patterns](#expression-patterns) + - [Bindings](#bindings) + - [Wildcard](#wildcard) + - [Generic bindings](#generic-bindings) + - [`auto`](#auto) + - [`var`](#var) + - [Tuple patterns](#tuple-patterns) + - [Struct patterns](#struct-patterns) + - [Choice patterns](#choice-patterns) + - [Templates](#templates) - [Rationale](#rationale) - [Alternatives considered](#alternatives-considered) + - [Struct pattern syntax](#struct-pattern-syntax) +- [Future work](#future-work) + - [Or patterns](#or-patterns) + - [Guards](#guards) + - [User-defined pattern matching](#user-defined-pattern-matching) + - [Matching classes with struct patterns](#matching-classes-with-struct-patterns) @@ -79,7 +95,8 @@ An expression is a pattern. - _pattern_ ::= _expression_ -The pattern is compared with the expression using the `==` operator: _pattern_ `==` _scrutinee_. +The pattern is compared with the expression using the `==` operator: _pattern_ +`==` _scrutinee_. ``` fn F(n: i32) { @@ -99,9 +116,8 @@ A name binding is a pattern. In the first form, the type of the _identifier_ is specified by the _expression_. In the second form, the _proper-pattern_ is matched against the -type of the scrutinee, and the result is the type of the _identifier_. In -either case, the scrutinee is implicitly converted to the type of the -_identifier_. 
+type of the scrutinee, and the result is the type of the _identifier_. In either +case, the scrutinee is implicitly converted to the type of the _identifier_. ``` fn F(p: i32*) { @@ -153,8 +169,8 @@ fn F(T:! Type, x: T) { ``` The `template` keyword indicates the binding is introducing a template -parameter, so name lookups into the parameter should be deferred until its -value is known. +parameter, so name lookups into the parameter should be deferred until its value +is known. ### `auto` @@ -175,10 +191,10 @@ scrutinee. - _proper-pattern_ ::= `var` _proper-pattern_ -A `var` pattern matches when its nested pattern matches. The type of the -storage is the resolved type of the nested _pattern_. Any bindings within the -nested pattern refer to portions of the corresponding storage rather than to -the scrutinee, somewhat as if the mutable storage is first initialized from the +A `var` pattern matches when its nested pattern matches. The type of the storage +is the resolved type of the nested _pattern_. Any bindings within the nested +pattern refer to portions of the corresponding storage rather than to the +scrutinee, somewhat as if the mutable storage is first initialized from the scrutinee and then the nested pattern is matched against the mutable storage, but the variable is not actually initialized unless the complete pattern matches. @@ -203,16 +219,17 @@ _pattern_ `=` _expression_ `;`. A tuple of patterns can be used as a pattern. -- _tuple-pattern_ ::= `(` [_expression_ `,`]\* _proper-pattern_ [`,` _pattern_]\* `)` +- _tuple-pattern_ ::= `(` [_expression_ `,`]\* _proper-pattern_ [`,` + _pattern_]\* `)` - _proper-pattern_ ::= _tuple-pattern_ A tuple pattern is matched left-to-right. The scrutinee is required to be of tuple type. -Note that a tuple pattern must contain at least one _proper-pattern_. -Otherwise, it is a tuple-valued expression. 
However, a tuple pattern and a -corresponding tuple-valued expression are matched in the same way because `==` -for a tuple compares fields left-to-right. +Note that a tuple pattern must contain at least one _proper-pattern_. Otherwise, +it is a tuple-valued expression. However, a tuple pattern and a corresponding +tuple-valued expression are matched in the same way because `==` for a tuple +compares fields left-to-right. ### Struct patterns @@ -226,8 +243,8 @@ A struct can be matched with a struct pattern. - _field-pattern_ ::= _field-init_ - _field-pattern_ ::= _proper-field-pattern_ -A struct pattern resembles a struct literal, with at least one field -initialized with a proper pattern: +A struct pattern resembles a struct literal, with at least one field initialized +with a proper pattern: ``` match ({.a = 1, .b = 2}) { @@ -264,8 +281,8 @@ This is valid even if all fields are actually named in the pattern. A choice pattern is used to match one alternative of a choice type. -_proper-pattern_ ::= _callee-expression_ _tuple-pattern_ -_proper-pattern_ ::= _designator_ _tuple-pattern_? +_proper-pattern_ ::= _callee-expression_ _tuple-pattern_ _proper-pattern_ ::= +_designator_ _tuple-pattern_? Here, _callee-expression_ is syntactically an expression that is valid as the callee in a function call expression, and a choice pattern is syntactically a @@ -273,15 +290,14 @@ function call expression whose argument list contains at least one _proper-pattern_. If a _callee-expression_ is provided, it is required to name a choice type -alternative that has a parameter list, and the scrutinee is implicitly -converted to that choice type. Otherwise, the scrutinee is required to be of -some choice type, and the designator is looked up in that type and is required -to name an alternative with a parameter list if and only if a _tuple-pattern_ -is specified. +alternative that has a parameter list, and the scrutinee is implicitly converted +to that choice type. 
Otherwise, the scrutinee is required to be of some choice +type, and the designator is looked up in that type and is required to name an +alternative with a parameter list if and only if a _tuple-pattern_ is specified. The pattern matches if the active alternative in the scrutinee is the specified -alternative, and the arguments of the alternative match the given tuple -pattern (if any). +alternative, and the arguments of the alternative match the given tuple pattern +(if any). ``` choice Optional(T:! Type) { @@ -318,10 +334,9 @@ is compared using `==`. If the type of the scrutinee of a pattern is either a template parameter or is an associated type that depends on a template parameter, any checking of the type of the scrutinee against the type of the pattern is deferred until the -template parameter's value is known. During instantiation, patterns that are -not meaningful due to a type error are instead treated as not matching. This -includes cases where an `==` fails because of a missing `EqWith` -implementation. +template parameter's value is known. During instantiation, patterns that are not +meaningful due to a type error are instead treated as not matching. This +includes cases where an `==` fails because of a missing `EqWith` implementation. ``` fn TypeName[template T:! Type](x: T) -> String { @@ -350,12 +365,12 @@ fn NeverWorks[template T:! Type](triple: (T, T, T)) { struct member is not a breaking change. - [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write) - Pattern syntax makes it easier to match complex values. - - Modeling pattern syntax after expressions eases the burden of learning - a new sub-language for pattern-matching: patterns are an extension of + - Modeling pattern syntax after expressions eases the burden of learning a + new sub-language for pattern-matching: patterns are an extension of expressions, and expressions are a special case of patterns. 
- [Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code) - - The rules for matching a templated value can be used to replace `if - constexpr` in many cases. + - The rules for matching a templated value can be used to replace + `if constexpr` in many cases. ## Alternatives considered @@ -371,11 +386,11 @@ cases where adding a field to a struct can be a breaking change, but such cases would still exist. Further, this would make matches against a struct-valued expression inconsistent with matches against a struct pattern. -We could remove the `{field: type}` shorthand and require `{.field = field: -type}`. This may avoid encouraging reusing the field name even when it is not -appropriate, and would remove some syntactic sugar that's not formally -necessary. However, we expect this to be a common case whose ergonomics are -important. +We could remove the `{field: type}` shorthand and require +`{.field = field: type}`. This may avoid encouraging reusing the field name even +when it is not appropriate, and would remove some syntactic sugar that's not +formally necessary. However, we expect this to be a common case whose ergonomics +are important. We could use a different syntax for `{field: type}` that is less divergent from other struct syntaxes: @@ -384,16 +399,16 @@ other struct syntaxes: already recognized as a tuple type literal. Also, the use of normal binding syntax means that every locally-introduced name is always introduced by `name: type` where `name` is not preceded by `.`. -- `{.=field: type}` might be a reasonable mnemonic shorthand for `{.field = - field: type}`, but looks a little surprising. This is probably the best - choice if concerns are found with `{field: type}` syntax. +- `{.=field: type}` might be a reasonable mnemonic shorthand for + `{.field = field: type}`, but looks a little surprising. 
This is probably + the best choice if concerns are found with `{field: type}` syntax. ## Future work ### Or patterns -We could provide "or patterns", allowing matching of one pattern or another -with the same handler: +We could provide "or patterns", allowing matching of one pattern or another with +the same handler: ``` match (x) { From d2e03cb0ebb56b9a22286d98ec2f5cd55a81405a Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Thu, 15 Sep 2022 14:35:25 -0700 Subject: [PATCH 03/27] Allow `{.a: T, ...}` as a pattern even though it contains no bindings. --- proposals/p2188.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 2009eededde77..f85d5065acf65 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -236,7 +236,8 @@ compares fields left-to-right. A struct can be matched with a struct pattern. - _proper-pattern_ ::= `{` [_field-init_ `,`]\* _proper-field-pattern_ [`,` - _field-pattern_]\* [, `...`]? `}` + _field-pattern_]\* `}` +- _proper-pattern_ ::= `{` [_field-pattern_ `,`]+ `...` `}` - _field-init_ ::= _designator_ `=` _expression_ - _proper-field-pattern_ ::= _designator_ `=` _proper-pattern_ - _proper-field-pattern_ ::= _binding-pattern_ From 931068b09e3832fbe76407c5bb5c2730a1b89c02 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Thu, 15 Sep 2022 14:38:29 -0700 Subject: [PATCH 04/27] Avoid mismatched braces in examples. --- proposals/p2188.md | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index f85d5065acf65..8f7234459ad71 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -86,7 +86,8 @@ fn F(n: i32) -> i32 { return n; } match (F(42)) { // ❌ Error: binding can't appear in a function call. - case (F(n: i32)) => + case (F(n: i32)) => {} +} ``` ### Expression patterns @@ -103,7 +104,9 @@ fn F(n: i32) { match (n) { // ✅ Results in an `n == 5` comparison. 
// OK despite `n` and `5` having different types. - case 5 => + case 5 => {} + } +} ``` ### Bindings @@ -124,7 +127,9 @@ fn F(p: i32*) { match (p) { // ✅ Matches `(T:! Type)` against `i32*`. // `T` is bound to `i32`, `m` is bound to `p`, and the type of `m` is `i32*`. - case m: (T:! Type)* => + case m: (T:! Type)* => {} + } +} ``` The `unused` keyword indicates that the binding is intended to not be used. @@ -249,7 +254,8 @@ with a proper pattern: ``` match ({.a = 1, .b = 2}) { - case {.b = n: i32, .a = m: i32} => + case {.b = n: i32, .a = m: i32} => {} +} ``` The scrutinee is required to be of struct type, and to have the same set of @@ -265,6 +271,7 @@ shorthand syntax is available: `a: T` is synonymous with `.a = a: T`. ``` match ({.a = 1, .b = 2}) { case {a: i32, b: i32} => { return a + b; } +} ``` If some fields should be ignored when matching, a trailing `, ...` can be added @@ -274,6 +281,7 @@ to specify this: match ({.a = 1, .b = 2}) { case {.a = 1, ...} => { return 1; } case {b: i32, ...} => { return b; } +} ``` This is valid even if all fields are actually named in the pattern. @@ -355,7 +363,8 @@ fn TypeName[template T:! Type](x: T) -> String { fn NeverWorks[template T:! Type](triple: (T, T, T)) { match (triple) { // ❌ Error: type mismatch matching struct against (T, T, T) - case {.a: i32, .b: i32} => + case {.a: i32, .b: i32} => {} + } } ``` From 9551c27ded375395f862df7075b99c48733fba7f Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Thu, 15 Sep 2022 14:57:20 -0700 Subject: [PATCH 05/27] Handle grouping parentheses. --- proposals/p2188.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/proposals/p2188.md b/proposals/p2188.md index 8f7234459ad71..e54acab9a9fb2 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -228,6 +228,10 @@ A tuple of patterns can be used as a pattern. 
_pattern_]\* `)` - _proper-pattern_ ::= _tuple-pattern_ +A _tuple-pattern_ containing no commas is treated as grouping parens: the +contained _proper-pattern_ is matched directly against the scrutinee. Otherwise, +the behavior is as follows. + A tuple pattern is matched left-to-right. The scrutinee is required to be of tuple type. From 0eb58342732fd55415d50cb357d91393c0bab684 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Fri, 16 Sep 2022 13:48:09 -0700 Subject: [PATCH 06/27] Add bullets to prevent grammar productions from wrapping together Co-authored-by: josh11b --- proposals/p2188.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index e54acab9a9fb2..52175208a9eee 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -294,8 +294,8 @@ This is valid even if all fields are actually named in the pattern. A choice pattern is used to match one alternative of a choice type. -_proper-pattern_ ::= _callee-expression_ _tuple-pattern_ _proper-pattern_ ::= -_designator_ _tuple-pattern_? +- _proper-pattern_ ::= _callee-expression_ _tuple-pattern_ +- _proper-pattern_ ::= _designator_ _tuple-pattern_? Here, _callee-expression_ is syntactically an expression that is valid as the callee in a function call expression, and a choice pattern is syntactically a From e90862395bd90e7474656012488bb7f72a2f3e67 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Fri, 16 Sep 2022 13:48:29 -0700 Subject: [PATCH 07/27] Fix tuple / struct typo. 
Co-authored-by: josh11b --- proposals/p2188.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 52175208a9eee..0d3006e49d4ee 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -410,7 +410,7 @@ We could use a different syntax for `{field: type}` that is less divergent from other struct syntaxes: - `{.field: type}` seems like a contender, but doesn't work because that is - already recognized as a tuple type literal. Also, the use of normal binding + already recognized as a struct type literal. Also, the use of normal binding syntax means that every locally-introduced name is always introduced by `name: type` where `name` is not preceded by `.`. - `{.=field: type}` might be a reasonable mnemonic shorthand for From 7fee7d236848146fb530e2b633c1d5dfee8d9587 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Fri, 16 Sep 2022 14:12:34 -0700 Subject: [PATCH 08/27] Allow trailing comma for one-tuple patterns. --- proposals/p2188.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 0d3006e49d4ee..172c86e947e0f 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -225,7 +225,7 @@ _pattern_ `=` _expression_ `;`. A tuple of patterns can be used as a pattern. - _tuple-pattern_ ::= `(` [_expression_ `,`]\* _proper-pattern_ [`,` - _pattern_]\* `)` + _pattern_]\* `,`? `)` - _proper-pattern_ ::= _tuple-pattern_ A _tuple-pattern_ containing no commas is treated as grouping parens: the From 1f3bb357f895678f9760391f700d69d2cd8d1ef7 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Fri, 16 Sep 2022 15:07:04 -0700 Subject: [PATCH 09/27] Allow reuse of comparison results. Improve wording. Add future work to support dynamic type matching. 
--- proposals/p2188.md | 83 ++++++++++++++++++++++++++++++++++++++++------ 1 file changed, 72 insertions(+), 11 deletions(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 172c86e947e0f..231fa7c65ae2f 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -34,6 +34,7 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception - [Future work](#future-work) - [Or patterns](#or-patterns) - [Guards](#guards) + - [Matching classes by dynamic type](#matching-classes-by-dynamic-type) - [User-defined pattern matching](#user-defined-pattern-matching) - [Matching classes with struct patterns](#matching-classes-with-struct-patterns) @@ -74,10 +75,10 @@ expressions, patterns add: ### Expressions versus proper patterns -A pattern that contains within it any pattern-specific syntax, such as a -binding, is a _proper pattern_. Any other pattern must necessarily be an -expression. Many expression forms, such as arbitrary function calls, are not -permitted as proper patterns, so cannot contain bindings. +Expressions are patterns, as described below. A pattern that is not an +expression, because it contains pattern-specific syntax such as a binding, is a +_proper pattern_. Many expression forms, such as arbitrary function calls, are +not permitted as proper patterns, so cannot contain bindings. - _pattern_ ::= _proper-pattern_ @@ -109,6 +110,31 @@ fn F(n: i32) { } ``` +Any `==` operations performed by a pattern match occur in lexical order, but for +repeated matches against the same `_pattern_`, later comparisons may be skipped +by reusing the result from an earlier comparison: + +``` +class ChattyIntMatcher { + external impl as EqWith(i32) { + fn Eq[me: ChattyIntMatcher](other: i32) -> bool { + Print("Matching {0}", other); + return other == 1; + } + } +} + +fn F() { + // Prints `Matching 1` then `Matching 2`, + // may or may not then print `Matching 1` again. 
+ match ((1, 2)) { + case ({} as ChattyIntMatcher, 0) => {} + case (1, {} as ChattyIntMatcher) => {} + case ({} as ChattyIntMatcher, 2) => {} + } +} +``` + ### Bindings A name binding is a pattern. @@ -132,6 +158,16 @@ fn F(p: i32*) { } ``` +``` +fn G() -> i32 { + match (5) ( + // ✅ `5` is implicitly converted to `i32`. + // Returns `5 as i32`. + case n: i32 => { return n; } + } +} +``` + The `unused` keyword indicates that the binding is intended to not be used. ### Wildcard @@ -179,13 +215,15 @@ is known. ### `auto` -The pattern `auto` is shorthand for `_:! Type`. +The pattern `auto` is shorthand for `template _:! Type`. - _proper-pattern_ ::= `auto` ``` fn F(n: i32) { var v: auto = SomeComplicatedExpression(n); + // Equivalent to: + var w: (template _:! Type) = SomeComplicatedExpression(n); } ``` @@ -199,10 +237,7 @@ scrutinee. A `var` pattern matches when its nested pattern matches. The type of the storage is the resolved type of the nested _pattern_. Any bindings within the nested pattern refer to portions of the corresponding storage rather than to the -scrutinee, somewhat as if the mutable storage is first initialized from the -scrutinee and then the nested pattern is matched against the mutable storage, -but the variable is not actually initialized unless the complete pattern -matches. +scrutinee. ``` fn F(p: i32*); @@ -216,6 +251,23 @@ fn G() { } ``` +Pattern matching precedes the initialization of the storage for any `var` +patterns. An introduced variable is only initialized if the complete pattern +matches. + +``` +class X { + destructor { Print("Destroyed!"); } +} +fn F(x: X) { + match ((x, 1 as i32)) { + case (var y: X, 0) => {} + case (var z: X, 1) => {} + // Prints "Destroyed!" only once, when `z` is destroyed. + } +} +``` + A `var` pattern cannot be nested within another `var` pattern. The declaration syntax `var` _pattern_ `=` _expresson_ `;` is equivalent to `let` `var` _pattern_ `=` _expression_ `;`. 
@@ -258,6 +310,9 @@ with a proper pattern: ``` match ({.a = 1, .b = 2}) { + // Struct literal as an expression pattern. + case {.b = 2, .a = 1} => {} + // Struct pattern. case {.b = n: i32, .a = m: i32} => {} } ``` @@ -441,6 +496,12 @@ match (x) { } ``` +### Matching classes by dynamic type + +We could provide a way to match polymorphic class objects based on their dynamic +type. This might be the default when matching a polymorphic class, or might +require opt-in. + ### User-defined pattern matching We could provide some mechanism for allowing a user-defined type to specify how @@ -452,5 +513,5 @@ We could allow a class to be matched by a struct patttern that matches its fields. This would make sense especially for data classes, and would be consistent with the behavior of `==` in the case where a struct is implicitly convertible to the class type. However, without a design for user-defined -pattern matching, there is a significant risk that this would conflict with the -rules there, so it is deferred for now. +pattern matching and matching on dynamic type, there is a significant risk that +this would conflict with the rules there, so it is deferred for now. From 68de53b1e229c198e452c4bb3efaca2cdf6784b4 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Tue, 20 Sep 2022 16:23:50 -0700 Subject: [PATCH 10/27] Address review comments and feedback. Remove `identifier: pattern`. 
--- proposals/p2188.md | 235 +++++++++++++++++++++++++++++++++++---------- 1 file changed, 182 insertions(+), 53 deletions(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 231fa7c65ae2f..bf63120a6dd2c 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -20,23 +20,27 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception - [Expressions versus proper patterns](#expressions-versus-proper-patterns) - [Expression patterns](#expression-patterns) - [Bindings](#bindings) - - [Wildcard](#wildcard) - - [Generic bindings](#generic-bindings) - - [`auto`](#auto) + - [Name bindings](#name-bindings) + - [Wildcard](#wildcard) + - [Generic bindings](#generic-bindings) + - [`auto` and type deduction](#auto-and-type-deduction) - [`var`](#var) - [Tuple patterns](#tuple-patterns) - [Struct patterns](#struct-patterns) - - [Choice patterns](#choice-patterns) + - [Alternative patterns](#alternative-patterns) - [Templates](#templates) - [Rationale](#rationale) - [Alternatives considered](#alternatives-considered) - [Struct pattern syntax](#struct-pattern-syntax) + - [Type pattern matching](#type-pattern-matching) + - [Introducer syntax for expression patterns](#introducer-syntax-for-expression-patterns) - [Future work](#future-work) - [Or patterns](#or-patterns) - [Guards](#guards) - [Matching classes by dynamic type](#matching-classes-by-dynamic-type) - [User-defined pattern matching](#user-defined-pattern-matching) - [Matching classes with struct patterns](#matching-classes-with-struct-patterns) + - [Type deduction](#type-deduction) @@ -111,8 +115,8 @@ fn F(n: i32) { ``` Any `==` operations performed by a pattern match occur in lexical order, but for -repeated matches against the same `_pattern_`, later comparisons may be skipped -by reusing the result from an earlier comparison: +repeated matches against the same _pattern_, later comparisons may be skipped by +reusing the result from an earlier comparison: ``` class ChattyIntMatcher { @@ -137,29 +141,18 @@ fn 
F() { ### Bindings +#### Name bindings + A name binding is a pattern. - _binding-pattern_ ::= `unused`? _identifier_ `:` _expression_ -- _binding-pattern_ ::= `unused`? _identifier_ `:` _proper-pattern_ - _proper-pattern_ ::= _binding-pattern_ -In the first form, the type of the _identifier_ is specified by the -_expression_. In the second form, the _proper-pattern_ is matched against the -type of the scrutinee, and the result is the type of the _identifier_. In either -case, the scrutinee is implicitly converted to the type of the _identifier_. +The type of the _identifier_ is specified by the _expression_. The scrutinee is +implicitly converted to that type. ``` -fn F(p: i32*) { - match (p) { - // ✅ Matches `(T:! Type)` against `i32*`. - // `T` is bound to `i32`, `m` is bound to `p`, and the type of `m` is `i32*`. - case m: (T:! Type)* => {} - } -} -``` - -``` -fn G() -> i32 { +fn F() -> i32 { match (5) ( // ✅ `5` is implicitly converted to `i32`. // Returns `5 as i32`. @@ -168,15 +161,16 @@ fn G() -> i32 { } ``` -The `unused` keyword indicates that the binding is intended to not be used. +As specified in +[#2022](https://github.com/carbon-language/carbon-lang/pull/2022), the `unused` +keyword indicates that the binding is intended to not be used. -### Wildcard +#### Wildcard A syntax like a binding but with `_` in place of an identifier can be used to ignore part of a value. - _binding-pattern_ ::= `_` `:` _expression_ -- _binding-pattern_ ::= `_` `:` _proper-pattern_ The behavior is equivalent to that of an `unused` binding with a unique name. @@ -191,7 +185,7 @@ fn F(n: i32) { } ``` -### Generic bindings +#### Generic bindings A `:!` can be used in place of `:` for a binding that is usable at compile time. @@ -199,9 +193,6 @@ A `:!` can be used in place of `:` for a binding that is usable at compile time. - _generic-pattern_ ::= `template`? 
`_` `:!` _expression_ - _proper-pattern_ ::= _generic-pattern_ -Note that, unlike in a _binding-pattern_, the type of a generic binding cannot -be a pattern. - ``` // ✅ `F` takes a generic type parameter `T` and a parameter `x` of type `T`. fn F(T:! Type, x: T) { @@ -213,20 +204,47 @@ The `template` keyword indicates the binding is introducing a template parameter, so name lookups into the parameter should be deferred until its value is known. -### `auto` +#### `auto` and type deduction -The pattern `auto` is shorthand for `template _:! Type`. +The `auto` keyword is a placeholder for a unique deduced type. -- _proper-pattern_ ::= `auto` +- _expression_ ::= `auto` ``` fn F(n: i32) { var v: auto = SomeComplicatedExpression(n); // Equivalent to: - var w: (template _:! Type) = SomeComplicatedExpression(n); + var w: T = SomeComplicatedExpression(n); + // ... where `T` is the type of the initializer. } ``` +The `auto` keyword is only permitted in specific contexts. Currently these are: + +- As the return type of a function. +- As the type of a binding. + +It is anticipated that `auto` may be permitted in more contexts in the future, +for example as a generic argument in a parameterized type that appears in a +context where `auto` is allowed, such as `Vector(auto)` or `auto*`. + +When the type of a binding requires type deduction, the type is deduced against +the type of the scrutinee and deduced values are substituted back into the type +before pattern matching is performed. + +``` +fn G[T:! Type](p: T*); +class X { external impl as ImplicitAs(i32*); } +// ✅ Deduces `T = i32` then implicitly and +// trivially converts `p` to `i32*`. +fn H1(p: i32*) { G(p); } +// ❌ Error, can't deduce `T*` from `X`. +fn H2(p: X) { G(p); } +``` + +The above is only an illustration; the behavior of type deduction is not +specified in this proposal. 
+ ### `var` A `var` prefix indicates that a pattern provides mutable storage for the @@ -298,7 +316,7 @@ A struct can be matched with a struct pattern. - _proper-pattern_ ::= `{` [_field-init_ `,`]\* _proper-field-pattern_ [`,` _field-pattern_]\* `}` -- _proper-pattern_ ::= `{` [_field-pattern_ `,`]+ `...` `}` +- _proper-pattern_ ::= `{` [_field-pattern_ `,`]+ `_` `}` - _field-init_ ::= _designator_ `=` _expression_ - _proper-field-pattern_ ::= _designator_ `=` _proper-pattern_ - _proper-field-pattern_ ::= _binding-pattern_ @@ -321,8 +339,9 @@ The scrutinee is required to be of struct type, and to have the same set of field names as the pattern. The pattern is matched left-to-right, meaning that matching is performed in the field order specified in the pattern, not in the field order of the scrutinee. This is consistent with the behavior of matching -against a tuple-valued expression, where the left operand determines the order -in which `==` comparisons are performed. +against a struct-valued expression, where the expression pattern becomes the +left operand of the `==` and so determines the order in which `==` comparisons +for fields are performed. In the case where a field will be bound to an identifier with the same name, a shorthand syntax is available: `a: T` is synonymous with `.a = a: T`. @@ -333,29 +352,29 @@ match ({.a = 1, .b = 2}) { } ``` -If some fields should be ignored when matching, a trailing `, ...` can be added -to specify this: +If some fields should be ignored when matching, a trailing `, _` can be added to +specify this: ``` match ({.a = 1, .b = 2}) { - case {.a = 1, ...} => { return 1; } - case {b: i32, ...} => { return b; } + case {.a = 1, _} => { return 1; } + case {b: i32, _} => { return b; } } ``` This is valid even if all fields are actually named in the pattern. -### Choice patterns +### Alternative patterns -A choice pattern is used to match one alternative of a choice type. 
+An alternative pattern is used to match one alternative of a choice type. - _proper-pattern_ ::= _callee-expression_ _tuple-pattern_ - _proper-pattern_ ::= _designator_ _tuple-pattern_? Here, _callee-expression_ is syntactically an expression that is valid as the -callee in a function call expression, and a choice pattern is syntactically a -function call expression whose argument list contains at least one -_proper-pattern_. +callee in a function call expression, and an alternative pattern is +syntactically a function call expression whose argument list contains at least +one _proper-pattern_. If a _callee-expression_ is provided, it is required to name a choice type alternative that has a parameter list, and the scrutinee is implicitly converted @@ -430,7 +449,7 @@ fn NeverWorks[template T:! Type](triple: (T, T, T)) { ## Rationale - [Software and language evolution](/docs/project/goals.md#software-and-language-evolution) - - The `, ...` syntax for struct patterns enables a style where adding a + - The `, _` syntax for struct patterns enables a style where adding a struct member is not a breaking change. - [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write) - Pattern syntax makes it easier to match complex values. @@ -445,15 +464,30 @@ fn NeverWorks[template T:! Type](triple: (T, T, T)) { ### Struct pattern syntax -We could omit the `, ...` syntax. This would simplify struct patterns, but at -the cost of removing a feature that can be useful for reducing verbosity and -making library evolution easier. +We could omit the `, _` syntax. This would simplify struct patterns, but at the +cost of removing a feature that can be useful for reducing verbosity and making +library evolution easier. We could always allow a struct pattern to match a struct with more fields, -without requiring a `, ...` suffix. 
This would aid evolution by reducing the -cases where adding a field to a struct can be a breaking change, but such cases -would still exist. Further, this would make matches against a struct-valued -expression inconsistent with matches against a struct pattern. +without requiring a `, _` suffix. This would aid evolution by reducing the cases +where adding a field to a struct can be a breaking change, but such cases would +still exist. Further, this would make matches against a struct-valued expression +inconsistent with matches against a struct pattern. + +We could use a different syntax instead of `, _`. Other options that were +explicitly considered: + +- `, ...` seems visually evocative of "and more stuff", but risks conflicting + with variadic syntax or at least being confusing when used in a variadic + context, given that variadics are expected to claim `...` for pack + expansion. +- `, ._` suggests matching a field without specifying a name, but might create + an impression of matching just one field, and three low punctuation + characters in a row seems to be pushing the limits of readability. + +On balance, `, _` harmonizes well with the use of `_` to introduce a wildcard, +without being too visually confusing given that the other use of `_` has a +following `:`. We could remove the `{field: type}` shorthand and require `{.field = field: type}`. This may avoid encouraging reusing the field name even @@ -472,6 +506,93 @@ other struct syntaxes: `{.field = field: type}`, but looks a little surprising. This is probably the best choice if concerns are found with `{field: type}` syntax. +### Type pattern matching + +We could treat type deduction as a form of pattern matching. For example, we +could allow + +``` +fn F(a: Vector(T:! Type)) -> T { return a[0]; } +fn G() -> i32 { + let v: Vector(i32) = (1, 2, 3); + // Deduces `T = i32`. + return F(v); +} +``` + +where the value of `T` is determined by pattern-matching `Vector(T:! 
Type)` +against the supplied type `Vector(i32)` of `v`. And symmetrically: + +``` +fn H[m: i32, n: i32](k: i32, (m, n)) { return m + n; } +fn I() { + // Deduces `m = 2`, `n = 3`. + H(1, (2, 3)); +} +``` + +This would ensure consistency between pattern matching and deduction, +potentially reducing the number of rules that Carbon developers need to learn. + +We find that attempting to unify pattern-matching and type deduction in this way +harms readability. Having distinct syntax for type-level matching and +value-level matching helps guide the reader to the correct interpretation, even +though the underlying matching process is expected to be similar or identical. +As a result, we keep type deduction syntax and pattern matching syntax separate +for now: + +- In pattern matching, bindings and wildcards are introduced by nested `:` / + `:!` patterns. +- In type deduction, deduced values are specified separately and an expression + written in terms of those bindings describes the type. + +### Introducer syntax for expression patterns + +We could have some separate introducer syntax to distinguish expression patterns +from other kinds of patterns: + +``` +match ((a, b)) { + case is (1, 2) => {} + case (is 3, n: i32) => {} + case (m: i32, is 4) => {} +} +``` + +This would reduce the chance of confusion in cases where an expression and a +similar-looking pattern are treated differently: + +``` +class TupleLike { + external impl as (i32, i32); +} +fn MatchTupleLike(t: TupleLike) { + match (t) { + // ✅ OK, expression pattern; + // `t` implicitly converted to `(i32, i32)` by + // built-in `impl EqWith` for tuples. + case (1, 2) => {} + // ❌ Error, `t` is not a tuple. + case (n: i32, 3) => {} + } +} +``` + +However, this would also introduce additional ceremony for the common case where +part of a pattern is a specific value. 
This could be mitigated by permitting +certain kinds of value as patterns without an introducer, such as numeric +literals and `true` and `false`, at the cost of introducing more complexity and +more confusion over which cases require `is` and which do not. + +We have also not identified a good choice for the introducer syntax, should we +pursue this direction. `is` is not an ideal choice, because elsewhere in Carbon +syntax, it is a relation between a value and its type, so `is T` may be misread +as matching values whose type is `T`. `==` _expression_ has been suggested, but +that would imply the opposite operand order of that in this proposal -- +_scrutinee_ `==` _expression_ rather than _expression_ `==` _scrutinee_ -- which +would compare struct fields in a surprising order that diverges from the order +of comparison for a struct pattern. + ## Future work ### Or patterns @@ -515,3 +636,11 @@ consistent with the behavior of `==` in the case where a struct is implicitly convertible to the class type. However, without a design for user-defined pattern matching and matching on dynamic type, there is a significant risk that this would conflict with the rules there, so it is deferred for now. + +### Type deduction + +This proposal does not cover type deduction, instead considering it to be a +separate topic from pattern matching syntax, even though the semantic behavior +of the two may be quite similar or identical. + +We will need a proposal to explore type deduction and describe its functioning. From bbfccab28a9ccddc15976dc07e030644a04ba50d Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Tue, 20 Sep 2022 16:26:26 -0700 Subject: [PATCH 11/27] Add ref to #157. Co-authored-by: Geoff Romer --- proposals/p2188.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index bf63120a6dd2c..903e27aae1b71 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -625,8 +625,8 @@ require opt-in. 
### User-defined pattern matching -We could provide some mechanism for allowing a user-defined type to specify how -it can be matched by patterns. +We plan to provide a mechanism for allowing a user-defined type to specify how +it can be matched by patterns. See [p0157.md](p0157) for details. ### Matching classes with struct patterns From db3de32444e2088f3d54d703dd2d10f3cf509479 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Tue, 20 Sep 2022 16:28:58 -0700 Subject: [PATCH 12/27] Fix xref. --- proposals/p2188.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 903e27aae1b71..ccb7d5314991b 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -626,7 +626,7 @@ require opt-in. ### User-defined pattern matching We plan to provide a mechanism for allowing a user-defined type to specify how -it can be matched by patterns. See [p0157.md](p0157) for details. +it can be matched by patterns. See [proposal #157](p0157.md) for details. ### Matching classes with struct patterns From b12a666f6f79d62073778fdf8768839875718b9a Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Tue, 20 Sep 2022 16:37:05 -0700 Subject: [PATCH 13/27] Add more xrefs to #2022 and to #1084. --- proposals/p2188.md | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index ccb7d5314991b..cd25e6fe1b4c3 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -172,7 +172,10 @@ ignore part of a value. - _binding-pattern_ ::= `_` `:` _expression_ -The behavior is equivalent to that of an `unused` binding with a unique name. +See [#2022](https://github.com/carbon-language/carbon-lang/pull/2022) for +details. + +The behavior is similar to that of an `unused` binding with a unique name. 
``` fn F(n: i32) { @@ -185,6 +188,22 @@ fn F(n: i32) { } ``` +As specified in [#1084](p1084.md), function redeclarations may replace named +bindings with wildcards but may not use different names. + +``` +fn G(n: i32); +fn H(n: i32); +fn J(n: i32); + +// ✅ Does not use `n`. +fn G(_: i32) {} +// ❌ Error: name of parameter does not match declaration. +fn H(m: i32) {} +// ✅ Does not use `n`. +fn J(unused n: i32); +``` + #### Generic bindings A `:!` can be used in place of `:` for a binding that is usable at compile time. From d7a568d1d1d9a3fcb36db0db87eb71c3a2bf2526 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Wed, 21 Sep 2022 10:12:33 -0700 Subject: [PATCH 14/27] Update proposals/p2188.md Co-authored-by: Geoff Romer --- proposals/p2188.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index cd25e6fe1b4c3..4b25071453ece 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -561,7 +561,8 @@ As a result, we keep type deduction syntax and pattern matching syntax separate for now: - In pattern matching, bindings and wildcards are introduced by nested `:` / - `:!` patterns. + `:!` patterns, and the right-hand side of a binding pattern is never a proper + pattern. - In type deduction, deduced values are specified separately and an expression written in terms of those bindings describes the type. From 591b77e73f0a1b5c3e4e4cdbb79a75662b7a660a Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Wed, 21 Sep 2022 12:29:50 -0700 Subject: [PATCH 15/27] Update links now #2022 has landed. 
--- proposals/p2188.md | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 4b25071453ece..e63e60ce0899f 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -161,9 +161,8 @@ fn F() -> i32 { } ``` -As specified in -[#2022](https://github.com/carbon-language/carbon-lang/pull/2022), the `unused` -keyword indicates that the binding is intended to not be used. +As specified in [#2022](p2022.md#the-behavior-of-unused-name-bindings), the +`unused` keyword indicates that the binding is intended to not be used. #### Wildcard @@ -172,8 +171,7 @@ ignore part of a value. - _binding-pattern_ ::= `_` `:` _expression_ -See [#2022](https://github.com/carbon-language/carbon-lang/pull/2022) for -details. +See [#2022](p2022.md) for details. The behavior is similar to that of an `unused` binding with a unique name. From b4d810bafae960e62f0f137dec5409e02479717d Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Wed, 21 Sep 2022 12:31:13 -0700 Subject: [PATCH 16/27] Reformat --- proposals/p2188.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index e63e60ce0899f..3ec49d57198f5 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -559,8 +559,8 @@ As a result, we keep type deduction syntax and pattern matching syntax separate for now: - In pattern matching, bindings and wildcards are introduced by nested `:` / - `:!` patterns, and the right-hand side of a binding pattern is never a proper - pattern. + `:!` patterns, and the right-hand side of a binding pattern is never a + proper pattern. - In type deduction, deduced values are specified separately and an expression written in terms of those bindings describes the type. From eabdce5096b42c71b7f31a574bb02299a616171f Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Thu, 22 Sep 2022 16:33:13 -0700 Subject: [PATCH 17/27] Don't try to reason symbolically about types in templates. 
--- proposals/p2188.md | 21 ++++++--------------- 1 file changed, 6 insertions(+), 15 deletions(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 3ec49d57198f5..0ecb2c802bb62 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -435,12 +435,12 @@ is compared using `==`. ### Templates -If the type of the scrutinee of a pattern is either a template parameter or is -an associated type that depends on a template parameter, any checking of the -type of the scrutinee against the type of the pattern is deferred until the -template parameter's value is known. During instantiation, patterns that are not -meaningful due to a type error are instead treated as not matching. This -includes cases where an `==` fails because of a missing `EqWith` implementation. +If the type of the scrutinee of a pattern involves a template parameter, any +checking of the type of the scrutinee against the type of the pattern is +deferred until the template parameter's value is known. During instantiation, +patterns that are not meaningful due to a type error are instead treated as not +matching. This includes cases where an `==` fails because of a missing `EqWith` +implementation. ``` fn TypeName[template T:! Type](x: T) -> String { @@ -454,15 +454,6 @@ fn TypeName[template T:! Type](x: T) -> String { } ``` -``` -fn NeverWorks[template T:! Type](triple: (T, T, T)) { - match (triple) { - // ❌ Error: type mismatch matching struct against (T, T, T) - case {.a: i32, .b: i32} => {} - } -} -``` - ## Rationale - [Software and language evolution](/docs/project/goals.md#software-and-language-evolution) From d968e0e4510286fc922cc4fd03e90849c6af7b28 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Tue, 1 Nov 2022 16:11:38 -0700 Subject: [PATCH 18/27] Fix typo. 
Co-authored-by: David Sankel --- proposals/p2188.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 0ecb2c802bb62..df686162b1cf2 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -639,7 +639,7 @@ it can be matched by patterns. See [proposal #157](p0157.md) for details. ### Matching classes with struct patterns -We could allow a class to be matched by a struct patttern that matches its +We could allow a class to be matched by a struct pattern that matches its fields. This would make sense especially for data classes, and would be consistent with the behavior of `==` in the case where a struct is implicitly convertible to the class type. However, without a design for user-defined From 09fd308d06ab1ddb9d29ad7ee98ee78608096485 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Tue, 1 Nov 2022 16:42:11 -0700 Subject: [PATCH 19/27] Respond to some review comments. --- proposals/p2188.md | 387 ++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 361 insertions(+), 26 deletions(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index df686162b1cf2..8513b9f04cb18 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -29,14 +29,18 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception - [Struct patterns](#struct-patterns) - [Alternative patterns](#alternative-patterns) - [Templates](#templates) + - [Guards](#guards) + - [Refutability, overlap, usefulness, and exhaustiveness](#refutability-overlap-usefulness-and-exhaustiveness) - [Rationale](#rationale) - [Alternatives considered](#alternatives-considered) + - [Shorthand for `auto`](#shorthand-for-auto) - [Struct pattern syntax](#struct-pattern-syntax) - [Type pattern matching](#type-pattern-matching) - [Introducer syntax for expression patterns](#introducer-syntax-for-expression-patterns) + - [Allow guards on arbitrary patterns](#allow-guards-on-arbitrary-patterns) + - [Treat expression patterns as exhaustive if they cover all 
possible values](#treat-expression-patterns-as-exhaustive-if-they-cover-all-possible-values) - [Future work](#future-work) - [Or patterns](#or-patterns) - - [Guards](#guards) - [Matching classes by dynamic type](#matching-classes-by-dynamic-type) - [User-defined pattern matching](#user-defined-pattern-matching) - [Matching classes with struct patterns](#matching-classes-with-struct-patterns) @@ -71,8 +75,9 @@ Patterns in Carbon are a generalization of the expression grammar. Compared to expressions, patterns add: - Bindings, of the form `name: type`, which give a name for the scrutinee. -- `var` _pattern_, which requests mutable storage be provided for the - scrutinee. +- `var` _pattern_, which creates a separate object to hold the value of the + scrutinee, and causes any nested bindings to be mutable lvalues instead of + immutable rvalues. - Additional syntax to make matching against structs more convenient. ## Details @@ -149,7 +154,7 @@ A name binding is a pattern. - _proper-pattern_ ::= _binding-pattern_ The type of the _identifier_ is specified by the _expression_. The scrutinee is -implicitly converted to that type. +implicitly converted to that type if necessary. ``` fn F() -> i32 { @@ -161,8 +166,38 @@ fn F() -> i32 { } ``` -As specified in [#2022](p2022.md#the-behavior-of-unused-name-bindings), the -`unused` keyword indicates that the binding is intended to not be used. +When a new object needs to be created for the binding, the lifetime of the bound +value matches the scope of the binding. + +``` +class NoisyDestructor { + fn Make() -> Self { return {}; } + external impl i32 as ImplicitAs(NoisyDestructor) { + fn Convert[me: i32]() -> Self { return Make(); } + } + destructor { + Print("Destroyed!"); + } +} + +fn G() { + // Does not print "Destroyed!". + let n: NoisyDestructor = NoisyDestructor.Make(); + Print("Body of G"); + // Prints "Destroyed!" here. +} + +fn H(n: i32) { + // Does not print "Destroyed!". 
+ let (v: NoisyDestructor, w: i32) = (n, n); + Print("Body of H"); + // Prints "Destroyed!" here. +} +``` + +As specified in +[#2022](/proposals/p2022.md#the-behavior-of-unused-name-bindings), the `unused` +keyword indicates that the binding is intended to not be used. #### Wildcard @@ -171,7 +206,7 @@ ignore part of a value. - _binding-pattern_ ::= `_` `:` _expression_ -See [#2022](p2022.md) for details. +See [#2022](/proposals/p2022.md) for details. The behavior is similar to that of an `unused` binding with a unique name. @@ -186,8 +221,8 @@ fn F(n: i32) { } ``` -As specified in [#1084](p1084.md), function redeclarations may replace named -bindings with wildcards but may not use different names. +As specified in [#1084](/proposals/p1084.md), function redeclarations may +replace named bindings with wildcards but may not use different names. ``` fn G(n: i32); @@ -454,6 +489,173 @@ fn TypeName[template T:! Type](x: T) -> String { } ``` +### Guards + +We allow `case`s within a `match` statement to have _guards_. These are not part +of pattern syntax, but instead are specific to `case` syntax: + +- _case_ ::= `case` _pattern_ [`if` _expression_]? `=>` _block_ + +A guard indicates that a `case` only matches if some predicate holds. The +bindings in the pattern are in scope in the guard: + +``` +match (x) { + case (m: i32, n: i32) if m + n < 5 => { return m - n; } +} +``` + +For consistency, this facility is also available for `default` clauses, so that +`default` remains equivalent to `case _: auto`. + +### Refutability, overlap, usefulness, and exhaustiveness + +Some definitions: + +- A pattern _P_ is _useful_ in the context of a set of patterns _C_ if _P_ can + match any value that no pattern in _C_ matches. +- A set of patterns _C_ is _exhaustive_ if it matches all possible values. + Equivalently, _C_ is exhaustive if the pattern `_: auto` is not useful in + the context of _C_. 
+- A pattern _P_ is _refutable_ if there are values that it does not match, + that is, if the pattern `_` is useful in the context of {_P_}. +- A set of patterns _C_ is _overlapping_ if there exists any value that is + matched by more than one pattern in _C_. + +For the purpose of these terms, expression patterns that match a constant tuple, +struct, or choice value are treated as if they were tuple, struct, or +alternative patterns, respectively, and `bool` is treated like a choice type. +Any expression patterns that remain after applying this rule are considered to +match a single value from an infinite set of values so that a set of expression +patterns is never exhaustive: + +``` +fn IsEven(n: u8) -> bool { + // Not considered exhaustive. + match (n) { + case 0 => { return true; } + case 1 => { return false; } + ... + case 255 => { return false; } + } + // Code here is considered to be reachable. +} +``` + +``` +fn IsTrue(b: bool) -> bool { + match (b) { + case false => { return false; } + case true => { return true; } + } + // Code here is considered to be unreachable. +} +``` + +When determining whether a pattern is useful, no attempt is made to determine +the value of any guards, and instead a worst-case assumption is made: a guard on +that pattern is assumed to evaluate to true and a guard on any pattern in the +context set is assumed to evaluate to false. + +We will diagnose the following situations: + +- A pattern is not useful in the context of prior patterns. In a `match` + statement, this happens if a pattern or `default` cannot match because all + cases it could cover are handled by prior cases or a prior `default`. For + example: + + ``` + choice Optional(T:! Type) { + None, + Some(T) + } + fn F(a: Optional(i32), b: Optional(i32)) { + match ((a, b)) { + case (.Some(a: i32), _: auto) => {} + // ✅ OK, but only matches values of the form `(None, Some)`, + // because `(Some, Some)` is matched by the previous pattern. 
+ case (_: auto, .Some(b: i32)) => {} + // ✅ OK, matches all remaining values. + case (.None, .None) => {} + // ❌ Error, this pattern never matches. + case (_: auto, _: auto) => {} + } + } + ``` + +- A pattern match is not exhaustive and some other language rule requires + exhaustiveness. For example: + + - If control flow from the end of a `match` can reach the end of a + value-returning function, the `match` is required to be exhaustive. + + ``` + fn F(n: i32) -> i32 { + // ❌ Error, control flow can reach end of value-returning function + // because this `match` is not exhaustive. + match (n) { + case 0 => { return 2; } + case 1 => { return 3; } + case 2 => { return 5; } + case 3 => { return 7; } + case 4 => { return 11; } + } + } + ``` + + - If a variable could potentially be used while in an unformed state if no + case in a `match` statement matches. + + ``` + fn F(n: i32) -> i32 { + var m: i32; + // ❌ Error, `m` used in an unformed state if the `default` case of + // this `match` is chosen. + match (n) { + case 0 => { m = 2; } + case 1 => { m = 3; } + case 2 => { m = 5; } + case 3 => { m = 7; } + case 4 => { m = 11; } + } + return m; + } + ``` + +- A pattern is refutable and is used in a context that requires an irrefutable + pattern. This currently includes all pattern matching contexts other than + `match` statements, but the `var`/`let`-`else` feature in + [#1871](https://github.com/carbon-language/carbon-lang/pull/1871) would + introduce a second context permitting refutable matches, and overloaded + functions might introduce a third context. + + ``` + fn F(n: i32) { + // ❌ Error, refutable expression pattern `5` used in context + // requiring an irrefutable pattern. + var 5 = n; + } + // ❌ Error, refutable expression pattern `5` used in context + // requiring an irrefutable pattern. 
+ fn G(n: i32, 5); + ``` + +- When a set of patterns have no ordering or tie-breaker, it is an error for + them to overlap unless there is a unique best match for any value that + matches more than one pattern. However, this situation does not apply to any + current language rule: + + - For `match` statements, patterns are matched top-down, so overlap is + permitted. + - We do not yet have an approved design for overloaded functions, but it + is anticipated that declaration order will be used in that case too. + - For a set of `impl`s that match a given `impl` lookup, argument + deduction is used rather than pattern matching, but overlapping `impl`s + with the same type structure are an error unless either an `impl` is + provided to cover the overlap or a `match_first` declaration is used to + order the `impl`s. (This is a pre-existing rule and is unchanged by this + proposal.) + ## Rationale - [Software and language evolution](/docs/project/goals.md#software-and-language-evolution) @@ -470,6 +672,51 @@ fn TypeName[template T:! Type](x: T) -> String { ## Alternatives considered +### Shorthand for `auto` + +We could provide a shorter syntax for `name: auto`. +[Proposal #851](https://github.com/carbon-language/carbon-lang/blob/trunk/proposals/p0851.md#elide-the-type-instead-of-using-auto) +considered the following shorthands and decided against using them: + +``` +var n: _ = init; +var n = init; +``` + +A novel suggestion that avoids some of the disadvantages of those syntaxes would +be to use: + +``` +var n:= init; +``` + +Advantages: + +- Shorter syntax for variables with a deduced type. +- Potentially allows removal of the `auto` keyword. + +Disadvantages: + +- Appears to introduce a `:=` syntax, but that only arises in cases where an + initializer immediately follows the name. + - Cases such as `var (a:, b:) = my_pair;` would either be invalid or would + not use the `:=` syntax. + - If we accept such cases, there is a risk of grammar ambiguities. 
+ - If we reject such cases, we may still want to keep `auto` around for + them, creating inconsistency. +- Not a complete replacement for `auto` if we want to also allow things like + `v: Vector(auto)`. C++ doesn't allow the equivalent syntax currently, but it + was part of the Concepts TS and seems likely to return at some point. +- No syntactic difference between accidentally omitting a type entirely and + requesting type deduction. However, the mistake of omitting a type but + retaining the `:` seems unlikely, and the `:` followed by the absence of a + type is a signal that something is happening, so this seems to be less of a + concern than for the `var n = init;` syntax. + +See discussion topics +[1](https://github.com/carbon-language/carbon-lang/discussions/1495) and +[2](https://github.com/carbon-language/carbon-lang/discussions/1988). + ### Struct pattern syntax We could omit the `, _` syntax. This would simplify struct patterns, but at the @@ -542,12 +789,19 @@ fn I() { This would ensure consistency between pattern matching and deduction, potentially reducing the number of rules that Carbon developers need to learn. -We find that attempting to unify pattern-matching and type deduction in this way -harms readability. Having distinct syntax for type-level matching and -value-level matching helps guide the reader to the correct interpretation, even -though the underlying matching process is expected to be similar or identical. -As a result, we keep type deduction syntax and pattern matching syntax separate -for now: +We find that use of pattern-matching in type position can harm readability. For +example, the first of these two examples may be easier to read due to having +less nesting: + +``` +fn F[T:! Type](x: T); +fn F(x: (T:! Type)); +``` + +Having distinct syntax for type-level matching and value-level matching helps +guide the reader to the correct interpretation, even though the underlying +matching process is expected to be similar or identical. 
As a result, we keep +type deduction syntax and pattern matching syntax separate for now: - In pattern matching, bindings and wildcards are introduced by nested `:` / `:!` patterns, and the right-hand side of a binding pattern is never a @@ -602,6 +856,62 @@ _scrutinee_ `==` _expression_ rather than _expression_ `==` _scrutinee_ -- which would compare struct fields in a surprising order that diverges from the order of comparison for a struct pattern. +### Allow guards on arbitrary patterns + +We could treat guards as part of pattern syntax instead of as part of `case` +syntax. However, since guards make a pattern refutable, this wouldn't allow them +anywhere other than in `case`s in the current language design. It would allow +them to be nested within cases: + +``` +match (x) { + case (n: i32 if n > 5, "some string") => { ... } +} +``` + +Such nesting might allow an expensive later check to be avoided. For example, in +the above case we can avoid an `==` comparison on a string if a cheaper +comparison of `n > 5` fails. However, this would introduce complexity into the +grammar, and it's not clear that this feature would add sufficient value to +justify that complexity. + +An additional concern is that if we add `let`...`else` syntax, this would +presumably permit things like: + +``` +let n: i32 if n > 5 = 20 else { return 0; }; +``` + +... where it would be easy to misparse the `if ... else` as being a single +construct, where the intended parse would be: + +``` +let ((n: i32) if n > 5) = 20 else { return 0; }; +``` + +### Treat expression patterns as exhaustive if they cover all possible values + +We could do more work to treat a set of expression patterns as being exhaustive, +if each pattern has a constant value and between those constant values, all +possible values of the type are covered. The advantage of this would be that we +improve the precision of our language rules. 
+ +This change in rules has some disadvantages and problems: + +- It would add some complexity to the rules and to implementations in order to + track whether all possible values have been created. +- For even the simplest types where this would apply, such as `i8`, it seems + unlikely that a `match` covering all possible values would be written, due + to the large number of patterns required. +- In many cases, the value being matched will carry an invariant so that + matching a subset of the representable values would match all meaningful + values. We would still have imprecise rules in those cases. +- Expression patterns are matched with `==`, and for an arbitrary `==` it is + not computable in general to tell whether a given set of values is + exhaustive, so we would only be able to apply this in some subset of cases. +- This restriction is straightforward to work around by replacing the final + value match with a `default` or `_`. + ## Future work ### Or patterns @@ -615,27 +925,52 @@ match (x) { } ``` -### Guards +### Matching classes by dynamic type -We could provide guards for patterns, allowing a pattern to match only if some -predicate involving the bindings holds: +We could provide a way to match polymorphic class objects based on their dynamic +type. This might be the default when matching a polymorphic class, or might +require opt-in. The behavior in this proposal is that only the static type of +the operand is considered. 
+ +For example, we could default to matching the static type, and allow a `dyn` +_pattern_ syntax for matching a pointer to a polymorphic class type, meaning +that we match the dynamic type rather than the static type of the pointer: ``` -match (x) { - case (m: i32, n: i32) if m + n < 5 => { return m - n; } +abstract class Base { virtual fn F[me: Self](); } +class Derived1 extends Base {} +class Derived2 extends Base {} + +fn PrintType(b: Base*) { + match (b) { + // `case d1: Derived1*` would be invalid here, + // because it could never match. + case dyn d1: Derived1* => { Print("Derived1"); } + case dyn d2: Derived2* => { Print("Derived2"); } + default => { Print("Unknown derived class"); } + } } -``` -### Matching classes by dynamic type +fn PrintTemplateType[template T:! Type](p: T*) { + match (p) { + // OK, dispatch is based on the static type. + case b: Base* => { Print("Base"); } + case d1: Derived1* => { Print("Derived1"); } + case d2: Derived2* => { Print("Derived2"); } + default => { Print("Unknown class"); } + } +} +``` -We could provide a way to match polymorphic class objects based on their dynamic -type. This might be the default when matching a polymorphic class, or might -require opt-in. +However, at this time we do not have a design for a checked down-cast, so it's +not clear how this matching operation would fit into the design of classes, +either syntactically or semantically. ### User-defined pattern matching We plan to provide a mechanism for allowing a user-defined type to specify how -it can be matched by patterns. See [proposal #157](p0157.md) for details. +it can be matched by patterns. See [proposal #157](/proposals/p0157.md) for +details. ### Matching classes with struct patterns From c7ad276419ea8d6a91fb0a9743c4049a105ec594 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Wed, 2 Nov 2022 11:47:25 -0700 Subject: [PATCH 20/27] Add as-patterns as future work. 
--- proposals/p2188.md | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+) diff --git a/proposals/p2188.md b/proposals/p2188.md index 8513b9f04cb18..7f6770af32788 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -41,6 +41,7 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception - [Treat expression patterns as exhaustive if they cover all possible values](#treat-expression-patterns-as-exhaustive-if-they-cover-all-possible-values) - [Future work](#future-work) - [Or patterns](#or-patterns) + - [As patterns](#as-patterns) - [Matching classes by dynamic type](#matching-classes-by-dynamic-type) - [User-defined pattern matching](#user-defined-pattern-matching) - [Matching classes with struct patterns](#matching-classes-with-struct-patterns) @@ -925,6 +926,24 @@ match (x) { } ``` +### As patterns + +We could provide +["as-patterns"](https://en.wikibooks.org/wiki/Haskell/Pattern_matching#As-patterns) +as a convenient way to give a name to a value while still matching parts of that +value. Following Haskell, we could use: + +- _pattern_ ::= _identifier_ `@` _pattern_ + +For example: + +``` +match (x) { + // `s` names the first element of the tuple. + case (s@{.a = n: i32, .b = 12}, 4) if n > 1 => { return s; } +} +``` + ### Matching classes by dynamic type We could provide a way to match polymorphic class objects based on their dynamic From f0cc392f51c5a6bb7ee60c983d365f0431597c0a Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Tue, 15 Nov 2022 23:07:36 -0800 Subject: [PATCH 21/27] Require exhaustiveness as suggested in code review and discussed on discord. 
See https://discord.com/channels/655572317891461132/748959784815951963/1037445448908161104 --- proposals/p2188.md | 94 +++++++++++++++++++++++++++------------------- 1 file changed, 55 insertions(+), 39 deletions(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 7f6770af32788..64945dca679c4 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -39,6 +39,7 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception - [Introducer syntax for expression patterns](#introducer-syntax-for-expression-patterns) - [Allow guards on arbitrary patterns](#allow-guards-on-arbitrary-patterns) - [Treat expression patterns as exhaustive if they cover all possible values](#treat-expression-patterns-as-exhaustive-if-they-cover-all-possible-values) + - [Allow non-exhaustive `match` statements](#allow-non-exhaustive-match-statements) - [Future work](#future-work) - [Or patterns](#or-patterns) - [As patterns](#as-patterns) @@ -519,7 +520,9 @@ Some definitions: Equivalently, _C_ is exhaustive if the pattern `_: auto` is not useful in the context of _C_. - A pattern _P_ is _refutable_ if there are values that it does not match, - that is, if the pattern `_` is useful in the context of {_P_}. + that is, if the pattern `_` is useful in the context of {_P_}. Equivalently, + the pattern _P_ is _refutable_ if the set of patterns {_P_} is not + exhaustive. - A set of patterns _C_ is _overlapping_ if there exists any value that is matched by more than one pattern in _C_. @@ -584,16 +587,15 @@ We will diagnose the following situations: } ``` -- A pattern match is not exhaustive and some other language rule requires - exhaustiveness. For example: +- A pattern match is not exhaustive and the program doesn't explicitly say + what to do when no pattern matches. For example: - - If control flow from the end of a `match` can reach the end of a - value-returning function, the `match` is required to be exhaustive. 
+ - If the patterns in a `match` are not exhaustive and no `default` is + provided. ``` fn F(n: i32) -> i32 { - // ❌ Error, control flow can reach end of value-returning function - // because this `match` is not exhaustive. + // ❌ Error, this `match` is not exhaustive. match (n) { case 0 => { return 2; } case 1 => { return 3; } @@ -604,43 +606,25 @@ We will diagnose the following situations: } ``` - - If a variable could potentially be used while in an unformed state if no - case in a `match` statement matches. + - If a refutable pattern appears in a context where only one pattern can + be specified, such as a `let` or `var` declaration, and there is no + fallback behavior. This currently includes all pattern matching contexts + other than `match` statements, but the `var`/`let`-`else` feature in + [#1871](https://github.com/carbon-language/carbon-lang/pull/1871) would + introduce a second context permitting refutable matches, and overloaded + functions might introduce a third context. ``` - fn F(n: i32) -> i32 { - var m: i32; - // ❌ Error, `m` used in an unformed state if the `default` case of - // this `match` is chosen. - match (n) { - case 0 => { m = 2; } - case 1 => { m = 3; } - case 2 => { m = 5; } - case 3 => { m = 7; } - case 4 => { m = 11; } - } - return m; + fn F(n: i32) { + // ❌ Error, refutable expression pattern `5` used in context + // requiring an irrefutable pattern. + var 5 = n; } + // ❌ Error, refutable expression pattern `5` used in context + // requiring an irrefutable pattern. + fn G(n: i32, 5); ``` -- A pattern is refutable and is used in a context that requires an irrefutable - pattern. This currently includes all pattern matching contexts other than - `match` statements, but the `var`/`let`-`else` feature in - [#1871](https://github.com/carbon-language/carbon-lang/pull/1871) would - introduce a second context permitting refutable matches, and overloaded - functions might introduce a third context. 
- - ``` - fn F(n: i32) { - // ❌ Error, refutable expression pattern `5` used in context - // requiring an irrefutable pattern. - var 5 = n; - } - // ❌ Error, refutable expression pattern `5` used in context - // requiring an irrefutable pattern. - fn G(n: i32, 5); - ``` - - When a set of patterns have no ordering or tie-breaker, it is an error for them to overlap unless there is a unique best match for any value that matches more than one pattern. However, this situation does not apply to any @@ -667,6 +651,9 @@ We will diagnose the following situations: - Modeling pattern syntax after expressions eases the burden of learning a new sub-language for pattern-matching: patterns are an extension of expressions, and expressions are a special case of patterns. + - Requiring exhaustiveness for matches makes control flow easier to + understand as there is never a value for which the `match` skips all + cases. - [Interoperability with and migration from existing C++ code](/docs/project/goals.md#interoperability-with-and-migration-from-existing-c-code) - The rules for matching a templated value can be used to replace `if constexpr` in many cases. @@ -913,6 +900,35 @@ This change in rules has some disadvantages and problems: - This restriction is straightforward to work around by replacing the final value match with a `default` or `_`. +### Allow non-exhaustive `match` statements + +We could permit `match` statements that are not exhaustive, and execute none of +the case blocks if none of the patterns match. This is a very common choice in +the design of such language features. However, it is also a common source of +errors, for example when matching a sum type and an alternative is missed. By +making this a language rule, we ensure that developers can rely on such mistakes +being caught. + +There is an easy syntactic way to disable this check, by adding an explicit +`default => {}` case. 
If we made the opposite choice, we would not automatically +have an easy way to request that the check be performed. We could add syntax to +request it, but such syntax would likely be forgotten and the default behavior +would be that errors are silently permitted. This seems sufficient to outweigh +the potential ergonomic cost of requiring the `default => {}` case to be written +explicitly. + +Another motivation for requiring exhaustiveness is that it simplifies other +language rules. For example, when determining whether control flow can reach the +end of a function with a declared return type, a separate exhaustiveness +analysis is not necessary. + +One concern with exhaustiveness checking is that it will cause the addition of +an alternative to a choice type to be a breaking change by default. However, +this is also one of the main advantages, and the design of choice types is +intended to eventually provide a mechanism to specify that a choice type is +extensible, which if used would mean that a set of patterns for that choice type +would only be considered exhaustive if it includes a wildcard pattern. + ## Future work ### Or patterns From 213981b5f6270bf76bd9670b971561205abc1d93 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Wed, 30 Nov 2022 17:14:45 -0800 Subject: [PATCH 22/27] Add examples. 
--- proposals/p2188.md | 623 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 623 insertions(+) diff --git a/proposals/p2188.md b/proposals/p2188.md index 64945dca679c4..423371eb804f2 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -46,7 +46,19 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception - [Matching classes by dynamic type](#matching-classes-by-dynamic-type) - [User-defined pattern matching](#user-defined-pattern-matching) - [Matching classes with struct patterns](#matching-classes-with-struct-patterns) + - [Matching by reference](#matching-by-reference) - [Type deduction](#type-deduction) +- [Examples](#examples) + - [Examples from P0095R1](#examples-from-p0095r1) + - [Figure 1. Declaration of a command data structure](#figure-1-declaration-of-a-command-data-structure) + - [Figure 2. Implementation of an output operator](#figure-2-implementation-of-an-output-operator) + - [Figure 3. Switching an enum](#figure-3-switching-an-enum) + - [Figure 4. Expression datatype](#figure-4-expression-datatype) + - [Figure 5. `struct` inspection](#figure-5-struct-inspection) + - [Example from P1371R5: Red-black tree rebalancing](#example-from-p1371r5-red-black-tree-rebalancing) + - [With P1371R5 pattern matching](#with-p1371r5-pattern-matching) + - [With this proposal](#with-this-proposal) + - [With this proposal plus #2187](#with-this-proposal-plus-2187) @@ -942,6 +954,10 @@ match (x) { } ``` +See the +[red-black tree rebalancing example](#example-from-p1371r5-red-black-tree-rebalancing) +for a real-world example where this would result in a simplification. + ### As patterns We could provide @@ -1016,6 +1032,36 @@ convertible to the class type. However, without a design for user-defined pattern matching and matching on dynamic type, there is a significant risk that this would conflict with the rules there, so it is deferred for now. 
+### Matching by reference + +When the scrutinee is an lvalue, it is sometimes desirable to form a mutable +binding to it. For example, see the +[`struct` inspection example](#figure-5-struct-inspection) below. We currently +support this only for the `me` binding in a method, using the `addr` keyword, +but could allow this more generally. + +```carbon +fn takeDamage(p: player*) { + match (*p) { + case {.hitpoints = 0, .lives = 0, _} => { gameOver(); } + case {.hitpoints = addr hp: i32*, .lives = addr l: i32*, _} if *hp == 0 => { + *hp = 10; + --*l; + } + case {.hitpoints = addr hp: i32*, _} if *hp <= 3 => { + --*hp; + messageAlmostDead(); + } + case {.hitpoints = addr hp: i32*, _} => { + --*hp; + } + } +} +``` + +Work in this area will need to consider whether we can provide this feature +ergonomically without introducing reference-like behavior. + ### Type deduction This proposal does not cover type deduction, instead considering it to be a @@ -1023,3 +1069,580 @@ separate topic from pattern matching syntax, even though the semantic behavior of the two may be quite similar or identical. We will need a proposal to explore type deduction and describe its functioning. + +## Examples + +These examples are translations of examples in WG21 paper +[P0095R1](https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0095r1.html) +and +[P1371R3](https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2020/p1371r3.pdf), +with permission from the author of those examples, David Sankel. Thank you, +David! + +### Examples from P0095R1 + +#### Figure 1. Declaration of a command data structure + + + + +
C++P0095R1This proposal
+ +```c++ +struct set_score { + std::size_t value; +}; + +struct fire_missile {}; + +struct fire_laser { + unsigned intensity; +}; + +struct rotate { + double amount; +}; + +struct command { + std::variant< + set_score, + fire_missile, + fire_laser, + rotate > value; +}; +``` + + + +```c++ +lvariant command { + std::size_t set_score; + std::monostate fire_missile; + unsigned fire_laser; + double rotate; +}; +``` + + + +```carbon +choice command { + set_score(u64), + fire_missile, + fire_laser(u32), + rotate(f64) +} +``` + +
+ +#### Figure 2. Implementation of an output operator + + + + +
C++P0095R1This proposal
+ +```c++ +namespace { +struct Output { + std::ostream& operator()(std::ostream& stream, const set_score& ss) const { + return stream << "Set the score to " << ss.value << ".\n"; + } + std::ostream& operator()(std::ostream& stream, const fire_missile&) const { + return stream << "Fire a missile.\n"; + } + std::ostream& operator()(std::ostream& stream, const fire_laser& fl) const { + return stream << "Fire a laser with " << fl.intensity << " intensity.\n"; + } + std::ostream& operator()(std::ostream& stream, const rotate& r) const { + return stream << "Rotate by " << r.amount << " degrees.\n"; + } +}; +} + +std::ostream& operator<<(std::ostream& stream, const command& cmd) { + return std::visit(std::bind(Output(), std::ref(stream), std::placeholders::_1), + cmd.value); +} +``` + + + +```c++ +std::ostream& operator<<(std::ostream& stream, const command& cmd) { + return inspect(cmd) { + set_score value => + stream << "Set the score to " << value << ".\n" + fire_missile _ => + stream << "Fire a missile.\n" + fire_laser intensity => + stream << "Fire a laser with " << intensity << " intensity.\n" + rotate degrees => + stream << "Rotate by " << degrees << " degrees.\n" + } +} +``` + + + +```carbon +impl command as Printable { + fn Print[me: Self](stream: Cpp.std.ostream*) { + match (me) { + case .set_score(value: u64) => { + stream->Print("Set the score to {0}.\n", value); + } + case .fire_missile => { + stream->Print("Fire a missile.\n"); + } + case .fire_laser(intensity: u32) => { + stream->Print("Fire a laser with {0} intensity.\n", intensity); + } + case .rotate(degrees: f64) => { + stream->Print("Rotate by {0} degrees.\n", degrees); + } + } + } +} +``` + +
+ +#### Figure 3. Switching an enum + + + + + +
C++P0095R1This proposal
+ +```c++ +enum color { red, yellow, green, blue }; +``` + + + +```carbon +choice color { red, yellow, green, blue } +``` + +
+ +```c++ +const Vec3 opengl_color = [&c] { + switch(c) { + case red: + return Vec3(1.0, 0.0, 0.0); + break; + case yellow: + return Vec3(1.0, 1.0, 0.0); + break; + case green: + return Vec3(0.0, 1.0, 0.0); + break; + case blue: + return Vec3(0.0, 0.0, 1.0); + break; + default: + std::abort(); + }(); +``` + + + +```c++ +const Vec3 opengl_color = + inspect(c) { + red => Vec3(1.0, 0.0, 0.0) + yellow => Vec3(1.0, 1.0, 0.0) + green => Vec3(0.0, 1.0, 0.0) + blue => Vec3(0.0, 0.0, 1.0) + }; +``` + + + +```carbon +// Carbon has neither expression-match nor lambdas yet, +// so this can't easily be done in line. +fn GetOpenGLColor(c: color) -> Vec3 { + match (c) { + case .red => { return Vec3.Make(1.0, 0.0, 0.0); } + case .yellow => { return Vec3.Make(1.0, 1.0, 0.0); } + case .green => { return Vec3.Make(0.0, 1.0, 0.0); } + case .blue => { return Vec3.Make(0.0, 0.0, 1.0); } + } +} +let opengl_color: Vec3 = GetOpenGLColor(c); +``` + +
+ +#### Figure 4. Expression datatype + + + + +
C++P0095R1This proposal
+ +```c++ +struct expression; + +struct sum_expression { + std::unique_ptr left_hand_side; + std::unique_ptr right_hand_side; +}; + +struct expression { + std::variant value; +}; + +expression simplify(const expression & exp) { + if(sum_expression const * const sum = std::get_if(&exp)) { + if( int const * const lhsInt = std::get_if( sum->left_hand_side.get() ) + && *lhsInt == 0 ) { + return simplify(*sum->right_hand_side); + } + else if( int const * const rhsInt = std::get_if( sum->right_hand_side.get() ) + && *rhsInt == 0 ) { + return simplify(*sum->left_hand_side); + } else { + return {sum_expression{ + std::make_unique(simplify(*sum->left_hand_side)), + std::make_unique(simplify(*sum->right_hand_side))}} + } + } + return exp; +} + +void simplify2(expression & exp) { + if(sum_expression * const sum = std::get_if(&exp)) { + if( int * const lhsInt = std::get_if( sum->left_hand_side.get() ) + && *lhsInt == 0 ) { + expression tmp(std::move(*sum->right_hand_side)); + exp = std::move(tmp); + simplify(exp); + } + else if( int * const rhsInt = std::get_if( sum->right_hand_side.get() ) + && *rhsInt == 0 ) { + expression tmp(std::move(*sum->left_hand_side)); + exp = std::move(tmp); + simplify(exp); + } else { + simplify(*sum->left_hand_side); + simplify(*sum->right_hand_side); + } + } + return exp; +} +``` + + + +```c++ +lvariant expression; + +struct sum_expression { + std::unique_ptr left_hand_side; + std::unique_ptr right_hand_side; +}; + +lvariant expression { + sum_expression sum; + int literal; + std::string var; +}; + +expression simplify(const expression & exp) { + return inspect(exp) { + sum {*(literal 0), *rhs} => simplify(rhs) + sum {*lhs , *(literal 0)} => simplify(lhs) + sum {*lhs , *rhs} + => expression::sum{ + std::make_unique(simplify(lhs)), + std::make_unique(simplify(rhs))}; + _ => exp + }; +} + +void simplify2(expression & exp) { + inspect(exp) { + sum {*(literal 0), *rhs} => { + expression tmp(std::move(rhs)); + exp = std::move(tmp); + simplify2(exp); + 
} + sum {*lhs , *(literal 0)} => { + expression tmp(std::move(lhs)); + exp = std::move(tmp); + simplify2(exp); + } + sum {*lhs , *rhs} => { + simplify2(lhs); + simplify2(rhs); + } + _ => ; + }; +} +``` + + + +```carbon +choice expression { + sum(UniquePtr(expression), UniquePtr(expression)), + literal(i32), + var: String +} + +// This assumes that UniquePtr provides the matching +// functionality described in #2187. +fn simplify(exp: expression) -> expression { + match (exp) { + case .sum(.PtrTo(.literal(0)), .PtrTo(rhs: expression)) => { + return simplify(rhs); + } + case .sum(.PtrTo(lhs: expression), .PtrTo(.literal(0))) => { + return simplify(lhs); + } + case .sum(.PtrTo(lhs: expression), .PtrTo(rhs: expression)) => { + return expression.sum(MakeUnique(simplify(lhs)), + MakeUnique(simplify(rhs))); + } + default => { return exp; } + } +} +``` + +
+ +#### Figure 5. `struct` inspection + + + + + +
C++P0095R1This proposal
+ +```c++ +struct player { + std::string name; + int hitpoints; + int lives; +}; +``` + + + +```carbon +class player { + var name: String; + var hitpoints: i32; + var lives: i32; +} +``` + +
+ +```c++ +void takeDamage(player &p) { + if(p.hitpoints == 0 && p.lives == 0) + gameOver(); + else if(p.hitpoints == 0) { + p.hitpoints = 10; + p.lives--; + } + else if(p.hitpoints <= 3) { + p.hitpoints--; + messageAlmostDead(); + } + else { + p.hitpoints--; + } +} +``` + + + +```c++ +void takeDamage(player &p) { + inspect(p) { + {hitpoints: 0, lives:0} => gameOver(); + {hitpoints:hp@0, lives:l} => hp=10, l--; + {hitpoints:hp} if (hp <= 3) => { hp--; messageAlmostDead(); } + {hitpoints:hp} => hp--; + } +} +``` + + + +```carbon +fn takeDamage(p: player*) { + match (*p) { + case {.hitpoints = 0, .lives = 0, _} => { gameOver(); } + case {.hitpoints = 0, _} => { p->hitpoints = 10; --p->lives; } + case {.hitpoints = hp: i32, _} if hp <= 3 => { + --p->hitpoints; + messageAlmostDead(); + } + default => { --p->hitpoints; } + } +} +``` + +
+ +### Example from P1371R5: Red-black tree rebalancing + +#### With P1371R5 pattern matching + +```c++ +enum Color { Red, Black }; + +template +struct Node { + void balance(); + + Color color; + std::shared_ptr lhs; + T value; + std::shared_ptr rhs; +}; + +template +void Node::balance() { + *this = inspect (*this) { + [case Black, (*?) [case Red, (*?) [case Red, a, x, b], y, c], z, d] + => Node{Red, std::make_shared(Black, a, x, b), + y, + std::make_shared(Black, c, z, d)}; + [case Black, (*?) [case Red, a, x, (*?) [case Red, b, y, c]], z, d] // left-right case + => Node{Red, std::make_shared(Black, a, x, b), + y, + std::make_shared(Black, c, z, d)}; + [case Black, a, x, (*?) [case Red, (*?) [case Red, b, y, c], z, d]] // right-left case + => Node{Red, std::make_shared(Black, a, x, b), + y, + std::make_shared(Black, c, z, d)}; + [case Black, a, x, (*?) [case Red, b, y, (*?) [case Red, c, z, d]]] // right-right case + => Node{Red, std::make_shared(Black, a, x, b), + y, + std::make_shared(Black, c, z, d)}; + self => self; + }; +} +``` + +#### With this proposal + +```carbon +choice Color { Red, Black } + +class Node(T:! Type) { + fn balance[addr me: Self*](); + + var color: Color; + var lhs: SharedPtr(Node); + var value: T; + var rhs: SharedPtr(Node); +} + +fn MakeBalanced[T:! Type](a: SharedPtr(Node(T)), x: T, + b: SharedPtr(Node(T)), y: T, + c: SharedPtr(Node(T)), z: T, + d: SharedPtr(Node(T))) -> Node(T) { + return {.color = Color.Red, + .lhs = MakeShared({.color = Color.Black, .lhs = a, .value = x, .rhs = b}), + .value = y, + .rhs = MakeShared({.color = Color.Black, .lhs = c, .value = z, .rhs = d})}; +} + +fn Node(T:! 
Type).balance[addr me: Self*]() { + match (*me) { + {.color = .Black, .lhs = .PtrTo( + {.color = .Red, .lhs = .PtrTo( + {.color = .Red, .lhs = a: auto, .value = x: T, .rhs = b: auto}), + .value = y: T, .rhs = c: auto}), + .value = z: T, .rhs = d: auto} => { + *me = MakeBalanced(a, x, b, y, c, z, d); + } + {.color = .Black, .lhs = .PtrTo( + {.color = .Red, .lhs = a: auto, .value = x: T, .rhs = .PtrTo( + {.color = .Red, .lhs = b: auto, .value = y: T, .rhs = c: auto})}), + .value = z: T, .rhs = d: auto} => { + *me = MakeBalanced(a, x, b, y, c, z, d); + } + {.color = .Black, .lhs = a: auto, .value = x: T, .rhs = .PtrTo( + {.color = .Red, .lhs = .PtrTo( + {.color = .Red, .lhs = b: auto, .value = y: T, .rhs = c: auto}), + .value = z: T, .rhs = d: auto})} => { + *me = MakeBalanced(a, x, b, y, c, z, d); + } + {.color = .Black, .lhs = a: auto, .value = x: T, .rhs = .PtrTo( + {.color = .Red, .lhs = b: auto, .value = y: T, .rhs = .PtrTo( + {.color = .Red, .lhs = c: auto, .value = z: T, .rhs = d: auto})})} => { + *me = MakeBalanced(a, x, b, y, c, z, d); + } + default => {} + }; +} +``` + +#### With this proposal plus #2187 + +```carbon +choice Color { Red, Black } + +class Node(T:! Type) { + fn balance[addr me: Self*](); + + var color: Color; + var lhs: SharedPtr(Self); + var value: T; + var rhs: SharedPtr(Self); + + external impl as Match { + interface Continuation { + extends Match.BaseContinuation; + fn Red[addr me: Self*](lhs: SharedPtr(Self), value: T, rhs: SharedPtr(Self)) -> ReturnType; + fn Black[addr me: Self*](lhs: SharedPtr(Self), value: T, rhs: SharedPtr(Self)) -> ReturnType; + } + fn Op[me: Self, C:! Continuation](continuation: C*) -> C.ReturnType { + match (me.color) { + case .Red => { return continuation->Red(me.lhs, me.value, me.rhs); } + case .Black => { return continuation->Black(me.lhs, me.value, me.rhs); } + } + } + } +} + +fn MakeBalanced[T:! 
Type](a: SharedPtr(Node(T)), x: T, + b: SharedPtr(Node(T)), y: T, + c: SharedPtr(Node(T)), z: T, + d: SharedPtr(Node(T))) -> Node(T) { + return {.color = Color.Red, + .lhs = MakeShared({.color = Color.Black, .lhs = a, .value = x, .rhs = b}), + .value = y, + .rhs = MakeShared({.color = Color.Black, .lhs = c, .value = z, .rhs = d})}; +} + +fn Node(T:! Type).balance[addr me: Self*]() { + match (*me) { + .Black(.PtrTo(.Red(.PtrTo(.Red(a: auto, x: T, b: auto)), y: T, c: auto)), z: T, d: auto) => { + *me = MakeBalanced(a, x, b, y, c, z, d); + } + .Black(.PtrTo(.Red(a: auto, x: T, .PtrTo(.Red(b: auto, y: T, c: auto)))), z: T, d: auto) => { + *me = MakeBalanced(a, x, b, y, c, z, d); + } + .Black(a: auto, x: T, .PtrTo(.Red(.PtrTo(.Red(b: auto, y: T, c: auto)), z: T, d: auto))) => { + *me = MakeBalanced(a, x, b, y, c, z, d); + } + .Black(a: auto, x: T, .PtrTo(.Red(b: auto, y: T, .PtrTo(.Red(c: auto, z: T, d: auto))))) => { + *me = MakeBalanced(a, x, b, y, c, z, d); + } + default => {} + }; +} +``` From 4f5b78e79e9ffe27e519ed158f23cb3406d47968 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Thu, 1 Dec 2022 15:09:30 -0800 Subject: [PATCH 23/27] Re-add sharper template dependence rule. --- proposals/p2188.md | 35 +++++++++++++++++++++++++++++------ 1 file changed, 29 insertions(+), 6 deletions(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 423371eb804f2..f9d70d4bf2a17 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -484,12 +484,12 @@ is compared using `==`. ### Templates -If the type of the scrutinee of a pattern involves a template parameter, any -checking of the type of the scrutinee against the type of the pattern is -deferred until the template parameter's value is known. During instantiation, -patterns that are not meaningful due to a type error are instead treated as not -matching. This includes cases where an `==` fails because of a missing `EqWith` -implementation. 
+Any checking of the type of the scrutinee against the type of the pattern that +cannot be performed because the type of the scrutinee involves a template +parameter is deferred until the template parameter's value is known. During +instantiation, patterns that are not meaningful due to a type error are instead +treated as not matching. This includes cases where an `==` fails because of a +missing `EqWith` implementation. ``` fn TypeName[template T:! Type](x: T) -> String { @@ -503,6 +503,29 @@ fn TypeName[template T:! Type](x: T) -> String { } ``` +Cases where the match is invalid for reasons not involving the template +parameter are rejected when type-checking the template: + +``` +fn MeaninglessMatch[template T:! Type](x: T*) { + match (*x) { + // ✅ OK, `T` could be a tuple. + case (_: auto, _: auto) => {} + default => {} + } + match (x->y) { + // ✅ OK, `T.y` could be a tuple. + case (_: auto, _: auto) => {} + default => {} + } + match (x) { + // ❌ Error, tuple pattern cannot match value of non-tuple type `T*`. + case (_: auto, _: auto) => {} + default => {} + } +} +``` + ### Guards We allow `case`s within a `match` statement to have _guards_. These are not part From 85a11356a4d12c684e978ade76a72d9dcac4850f Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Fri, 2 Dec 2022 17:20:57 -0800 Subject: [PATCH 24/27] Apply suggestions from code review Co-authored-by: josh11b --- proposals/p2188.md | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index f9d70d4bf2a17..85a8dd96c3b15 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -549,14 +549,14 @@ For consistency, this facility is also available for `default` clauses, so that Some definitions: -- A pattern _P_ is _useful_ in the context of a set of patterns _C_ if _P_ can - match any value that no pattern in _C_ matches. 
+- A pattern _P_ is _useful_ in the context of a set of patterns _C_ if there exists + a value that _P_ can match that no pattern in _C_ matches. - A set of patterns _C_ is _exhaustive_ if it matches all possible values. Equivalently, _C_ is exhaustive if the pattern `_: auto` is not useful in the context of _C_. - A pattern _P_ is _refutable_ if there are values that it does not match, - that is, if the pattern `_` is useful in the context of {_P_}. Equivalently, - the pattern _P_ is _refuatble_ if the set of patterns {_P_} is not + that is, if the pattern `_: auto` is useful in the context of {_P_}. Equivalently, + the pattern _P_ is _refutable_ if the set of patterns {_P_} is not exhaustive. - A set of patterns _C_ is _overlapping_ if there exists any value that is matched by more than one pattern in _C_. @@ -583,6 +583,7 @@ fn IsEven(n: u8) -> bool { ``` fn IsTrue(b: bool) -> bool { + // Considered exhaustive. match (b) { case false => { return false; } case true => { return true; } @@ -670,11 +671,10 @@ We will diagnose the following situations: - We do not yet have an approved design for overloaded functions, but it is anticipated that declaration order will be used in that case too. - For a set of `impl`s that match a given `impl` lookup, argument - deduction is used rather than pattern matching, but overlapping `impl`s - with the same type structure are an error unless either an `impl` is - provided to cover the overlap or a `match_first` declaration is used to - order the `impl`s. (This is a pre-existing rule and is unchanged by this - proposal.) + deduction is used rather than pattern matching, but `impl`s with the + same type structure are an error unless a `match_first` declaration + is used to order the `impl`s. (This is a pre-existing rule and is + unchanged by this proposal.) 
## Rationale @@ -1222,7 +1222,7 @@ impl command as Printable { stream->Print("Fire a laser with {0} intensity.\n", intensity); } case .rotate(degrees: f64) => { - stream->Print("Rotate by {9} degrees.\n", degrees); + stream->Print("Rotate by {0} degrees.\n", degrees); } } } From a0803c0f808b9179e3495fc0abf71d85121b47d6 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Fri, 2 Dec 2022 17:33:48 -0800 Subject: [PATCH 25/27] Review comments. --- proposals/p2188.md | 38 ++++++++++++++++++++++++++++---------- 1 file changed, 28 insertions(+), 10 deletions(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 85a8dd96c3b15..0e0c3ed6edd22 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -549,15 +549,15 @@ For consistency, this facility is also available for `default` clauses, so that Some definitions: -- A pattern _P_ is _useful_ in the context of a set of patterns _C_ if there exists - a value that _P_ can match that no pattern in _C_ matches. +- A pattern _P_ is _useful_ in the context of a set of patterns _C_ if there + exists a value that _P_ can match that no pattern in _C_ matches. - A set of patterns _C_ is _exhaustive_ if it matches all possible values. Equivalently, _C_ is exhaustive if the pattern `_: auto` is not useful in the context of _C_. - A pattern _P_ is _refutable_ if there are values that it does not match, - that is, if the pattern `_: auto` is useful in the context of {_P_}. Equivalently, - the pattern _P_ is _refutable_ if the set of patterns {_P_} is not - exhaustive. + that is, if the pattern `_: auto` is useful in the context of {_P_}. + Equivalently, the pattern _P_ is _refutable_ if the set of patterns {_P_} is + not exhaustive. - A set of patterns _C_ is _overlapping_ if there exists any value that is matched by more than one pattern in _C_. @@ -672,15 +672,18 @@ We will diagnose the following situations: is anticipated that declaration order will be used in that case too. 
- For a set of `impl`s that match a given `impl` lookup, argument deduction is used rather than pattern matching, but `impl`s with the - same type structure are an error unless a `match_first` declaration - is used to order the `impl`s. (This is a pre-existing rule and is - unchanged by this proposal.) + same type structure are an error unless a `match_first` declaration is + used to order the `impl`s. (This is a pre-existing rule and is unchanged + by this proposal.) ## Rationale - [Software and language evolution](/docs/project/goals.md#software-and-language-evolution) - The `, _` syntax for struct patterns enables a style where adding a struct member is not a breaking change. + - The requirement that matches be exhaustive makes it easier to add new + cases to a choice type, by requiring the compiler to detect places where + the new value is not handled. - [Code that is easy to read, understand, and write](/docs/project/goals.md#code-that-is-easy-to-read-understand-and-write) - Pattern syntax makes it easier to match complex values. - Modeling pattern syntax after expressions eases the burden of learning a @@ -832,6 +835,17 @@ type deduction syntax and pattern matching syntax separate for now: - In type deduction, deduced values are specified separately and an expression written in terms of those bindings describes the type. +There are some contexts where there is no syntactic location for introducing +deduced values, such as in `case` labels. For syntactic consistency, such cases +should be addressed by adding `forall` syntax, if there is motivation to support +deduction: + +``` +match (templated_value) { + case forall [template T:! 
Type] (p: T*) => { heap.Delete(p); } +} +``` + ### Introducer syntax for expression patterns We could have some separate introducer syntax to distinguish expression patterns @@ -912,6 +926,9 @@ construct, where the intended parse would be: let ((n: i32) if n > 5) = 20 else { return 0; }; ``` +See also a +[related Discord discussion](https://discord.com/channels/655572317891461132/748959784815951963/981691016040030279). + ### Treat expression patterns as exhaustive if they cover all possible values We could do more work to treat a set of expression patterns as being exhaustive, @@ -932,8 +949,9 @@ This change in rules has some disadvantages and problems: - Expression patterns are matched with `==`, and for an arbitrary `==` it is not computable in general to tell whether a given set of values is exhaustive, so we would only be able to apply this in some subset of cases. -- This restriction is straightforward to work around by replacing the final - value match with a `default` or `_`. + +This restriction is straightforward to work around by adding a final unreachable +`default`, or by replacing the final value match with a `default` or `_`. ### Allow non-exhaustive `match` statements From f832e408c453cdb178f4c2be43e8df4677b442b2 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Tue, 6 Dec 2022 14:31:19 -0800 Subject: [PATCH 26/27] Updates based on review comments. 
--- proposals/p2188.md | 72 +++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 71 insertions(+), 1 deletion(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 0e0c3ed6edd22..7d56fe4716de7 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -48,6 +48,7 @@ SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception - [Matching classes with struct patterns](#matching-classes-with-struct-patterns) - [Matching by reference](#matching-by-reference) - [Type deduction](#type-deduction) + - [Match expressions](#match-expressions) - [Examples](#examples) - [Examples from P0095R1](#examples-from-p0095r1) - [Figure 1. Declaration of a command data structure](#figure-1-declaration-of-a-command-data-structure) @@ -878,6 +879,46 @@ fn MatchTupleLike(t: TupleLike) { } ``` +This would also permit the same, or overlapping, syntax to be used in +expressions and patterns with different meanings, for example: + +``` +fn F() { + // Here, `[...]` is an array type. + var array: [i32; 5]; + match (&array) { + // Here, `[...]` could introduce deduced parameters. + case [T:! Type] x: T* => {} + } +} +``` + +Similarly, if we added a `&PATT` pattern to match the address of a value, a +construct like `&n` would be ambiguous without an introducer: + +``` +var n: i32 = 5; +fn MatchPointerOrPointee(p: i32*) { + match (p) { + case is &n => { + Print("given pointer to n"); + } + // Here, the pattern `&PATT` could match the address of + // a value that matches `PATT`. + case &(is n) => { + Print("given pointer to i32 whose value equals the value of n"); + } + case &(m: i32) => { + Print("given pointer to value {0}", m); + } + } +} +``` + +Such an introducer would also denote the portions of a pattern that are +evaluated at runtime rather than at compile time, benefitting Carbon's goals of +readability and of predictable performance. + However, this would also introduce additional ceremony for the common case where part of a pattern is a specific value. 
This could be mitigated by permitting certain kinds of value as patterns without an introducer, such as numeric @@ -891,7 +932,16 @@ as matching values whose type is `T`. `==` _expression_ has been suggested, but that would imply the opposite operand order of that in this proposal -- _scrutinee_ `==` _expression_ rather than _expression_ `==` _scrutinee_ -- which would compare struct fields in a surprising order that diverges from the order -of comparison for a struct pattern. +of comparison for a struct pattern. A prefix `==` may also be visually +surprising. + +For now, we do not add such an introducer, but this decision is expected to be +revisited by a future proposal. + +See: + +- [2022-09-16 Discord discussion in #syntax](https://discord.com/channels/655572317891461132/709488742942900284/1020559500597538886) +- [2022-12-05 open discussion](https://docs.google.com/document/d/1tEt4iM6vfcY0O0DG0uOEMIbaXcZXlNREc2ChNiEtn_w/edit#bookmark=id.utvzosvvfg80) ### Allow guards on arbitrary patterns @@ -1111,6 +1161,26 @@ of the two may be quite similar or identical. We will need a proposal to explore type deduction and describe its functioning. +### Match expressions + +As demonstrated in the [switching an enum](#figure-3-switching-an-enum) example +below, it would be valuable to have an expression `match` syntax in addition to +the statement `match` syntax. We could follow the same approach as for `if` +statements, and say that a `match` that appears at the start of a statement is a +statement `match` and any other `match` is an expression `match`. + +As candidate syntax, braced `case` bodies could be replaced by an expression +followed by a comma. 
For example: + +``` +let opengl_color: Vec3 = match (c) { + case .red => Vec3.Make(1.0, 0.0, 0.0), + case .yellow => Vec3.Make(1.0, 1.0, 0.0), + case .green => Vec3.Make(0.0, 1.0, 0.0), + case .blue => Vec3.Make(0.0, 0.0, 1.0) +}; +``` + ## Examples These examples are translations of examples in WG21 paper From acba0cd4c877a6e7b5481917eab4b501f345c490 Mon Sep 17 00:00:00 2001 From: Richard Smith Date: Wed, 7 Dec 2022 14:04:47 -0800 Subject: [PATCH 27/27] Add another benefit of `is` from the weekly sync meeting. --- proposals/p2188.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/proposals/p2188.md b/proposals/p2188.md index 7d56fe4716de7..d286afdc97576 100644 --- a/proposals/p2188.md +++ b/proposals/p2188.md @@ -917,7 +917,9 @@ fn MatchPointerOrPointee(p: i32*) { Such an introducer would also denote the portions of a pattern that are evaluated at runtime rather than at compile time, benefitting Carbon's goals of -readability and of predictable performance. +readability and of predictable performance, and would also add syntactic +separation between the parts of a pattern that have full exhaustiveness checking +and the parts that do not. However, this would also introduce additional ceremony for the common case where part of a pattern is a specific value. This could be mitigated by permitting