-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Add match/in statements #2144
Conversation
It's not really clear to me from the RFC as written, so: Is the only benefit of this new syntax the ability to have multiple patterns per-arm using the pipe operator? If so, why does that require a new kind of match block? #1882 essentially proposed adding the pipe operator to the existing match block. If I understand that comment thread correctly, it was closed not because of some fundamental problem with that idea, but because the RFC required a lot more work and wasn't a high priority. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't help but feel that this RFC is a bit poorly written (no offense). It confuses terms, uses inaccurate language and contains grammatical mistakes. This can be fixed, of course.
It also feels like it is missing a proper reason to complicate the language with non-obvious syntax like this, only containing a single example that works almost just as well today.
text/0000-overlapping_match.md
Outdated
# Summary | ||
[summary]: #summary | ||
|
||
This feature is to facilitate the writing and using of `match` statements in a whole new way. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"a whole new way" makes it sound like an advertisement and doesn't convey any information the reader might be interested in. It's also grammatically questionable.
text/0000-overlapping_match.md
Outdated
|
||
Benefits of this syntax: | ||
1. No new keywords need to be used. This is good thing since it means for a relatively small | ||
addition there would be no breaks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"there would be no breaks"? What's that supposed to mean?
text/0000-overlapping_match.md
Outdated
|
||
Basic Syntax: | ||
```rust | ||
match val in { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the only difference to the existing match
is a small keyword in the first line, I feel like it's too hard to see what kind of match block you're in when editing something in an arm.
I also think in
as a distinguishing keyword is far too meaningless for such a large change in semantics. It's not obvious at all what's happening.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I chose in
because it was the best choice from all the keywords already in the language. Also, as I mentioned in the list of benefits for/in
is used for doing something with every element in an iterator so match/in
is used for doing something with every branch.
text/0000-overlapping_match.md
Outdated
match val in { | ||
pat | pat => expr, | ||
pat => expr | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This deviation from the _ => ...
syntax used by "regular" match is a bit unfortunate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be a bit unfortunate but if you consider the meaning of _
it makes sense. The _
pattern means to match everything.
text/0000-overlapping_match.md
Outdated
|
||
Meaning of parts: | ||
1. The `else` is used in a similar sort of vein to that of the `_` pattern in normal matches and | ||
could include several of the same warnings. The expression enclosed within this is only executed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"several of the same warnings" which warning are you referring to?
text/0000-overlapping_match.md
Outdated
|
||
Edge cases: | ||
1. If the `_` pattern in present in any of the contained matches and the `else` block is also | ||
present then a `unreachable_code` lint on the code within the `else` block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another missing verb. Perhaps "is emitted"? Also it feels weird that it's not an unreachable pattern warning, but it's not a pattern, so it seems technically correct...
text/0000-overlapping_match.md
Outdated
|
||
Implementation Assumptions: | ||
1. Assuming that a `match` statement is currently implemented similar to a long chain of | ||
`if/else if` statements. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not, how would it be implemented in this way? if
only works on booleans. match
is built into the language and lowered into MIR later in the compiler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will rewrite this. I chose to use the word if
because it signified a check and a jump
text/0000-overlapping_match.md
Outdated
`if/else if` statements. | ||
|
||
Implementation: | ||
1. This can be implemented as if it was a list of `if` statements. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, but it could be implemented as a list of matches.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would that work? I do not see an implementation via matches
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take a look at the macro I posted below
text/0000-overlapping_match.md
Outdated
1. This can be implemented as if it was a list of `if` statements. | ||
2. To cover the `else` case the location to jump to at the end after checking all the branches | ||
can be stored, initially set to the start of the `else` block but if it enters any of the | ||
branches then it is set to immediately after the `else` block. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just storing a bool is sufficient.
text/0000-overlapping_match.md
Outdated
[how-we-teach-this]: #how-we-teach-this | ||
|
||
This should be called `match/in` statements since that is the combination of keywords that are | ||
used similar to `for/in` statements. This idea would be best presented as a continuation of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of these are actually statements, they're expressions.
@Ixrec No I am not proposing just allowing multiple arms use the pipe operator since that is currently already allowed. This allows multiple arms to be executed. |
@Nokel81 That RFC proposed allowing |
FWIW this is reasonably easy to implement with a macro (with many caveats and without lints, of course) |
@jonas-schievink Yes I know that it can be implemented with a macro but as you said it doesn't have all the lints or exhaustiveness checking the main two benefits of adding it as a fully supported control flow structure |
text/0000-overlapping_match.md
Outdated
``` | ||
into | ||
```rust | ||
match cmp.compare(&array[left], &array[right]) in { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The semantic equality of these two versions implies that match ... in
, where a pattern is used in more than one match arm, has well-defined semantics for the order of match arms such that the statements of the first mention of pattern happens-before the statements of the second mention. This might be intuitive, or not at all - and such a divergence from the semantics of match
should be syntactically clearer, in my opinion.
I feel this proposal is too close to the switch statements of C/C++/Java which, as far as I know, are confusing to newcomers. I think this is because intuitively, humans think of choice as exclusive - so "A or B" is parsed as "A xor B", which is why a lot of introductory books in discrete mathematics and logic has to clarify "or vs xor" (example: Logic in Computer Science, Huth & Ryan, p.4).
I think the price of repetition is worth the potential unreadability of the proposed match ... in
construct.
text/0000-overlapping_match.md
Outdated
addition there would be no code breaks of existing code with this change. | ||
2. The `in` seems to imply that it sort of like an "iterator" of statements and will go through | ||
each of them in turn. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Semantically, I consider this a big addition - which is why I'd like a more visible syntactic difference where this RFC to be accepted. Therefore, I consider the slight variation in syntax to be a downside. I expand on this further down.
Instead of in
, I'd recommend using a terminal that more clearly indicates that this is top-down and more crucially that multiple branches may be taken. Perhaps the syntax might be: match inclusive <expr> { <patterns> }
, or match topdown <expr> { <patterns> }
.
- I don't agree that
in
implies this nearly strong enough for this to be instantly understandable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is fair and I am pretty sure that adding an additional keyword here wouldn't break any code since if a variable was called what ever it was that would be able to be checked but that is definitely something to consider
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, adding the additional "keyword" should work w/o backwards incompatibility since there's no (iirc) parsing rule "<expr> <expr>".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a correct DFA adding that state would be rather complicated but I agree is do able
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... DFA for what? Lexing? The language is not regular...
Do note that parsing Rust
is already at least context sensitive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not actually looked that deep into Rust I didn't consider it to be that high order, though tbh I don't know why
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust's lexical grammar is already context-sensitive due to r##"strings like this"##
. The actual grammar after tokenization makes do with a fixed lookahead (I forgot how much exactly, something in the order of 5 tokens or so).
Not a single real programming language is regular (not counting assembly and machine code). Fun fact: Even the simple-looking Lua grammar requires infinite lookahead to parse and is thus harder to parse than Rust.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More details here: https://stackoverflow.com/a/43693150/1063961 and here https://stackoverflow.com/a/43693194/1063961
|
||
This should not be done because it increases the size of language and might not be used by | ||
everyone. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see some discussion of the readability of inclusive top-down multiple-branch match
constructs.
Sorry to invoke Wadler's Law on you, but I believe you have to really consider the readability of the syntax w.r.t. its semantics and possible confusion with the normal semantics of |
@Centril Since I do believe strongly in this proposal I will consider a more obvious syntax. However, I also would not like to have to wait for an epoch for a new keyword so maybe having a keyword with Would something like this be easier to differentiate: @all_branches
match x {
pat => expr
...
} else {
expr
} |
@Nokel81 I don't think you have to wait for an epoch for On with the 🚲 🏠 ... =) |
I think that |
Wouldn't a 'fallthrough' statement work just as well and better indicate the desired behavior? |
Do you mean |
Sorry about the confusion, I was editing my post to elaborate as you were asking that question. |
Thank you for the clarification. However I don't think that is an alternative to this proposal because fallthrough seems to indicate not checking (sort of like how C switch statements fall through automatically) and that is not what I want, the pattern matching is a key part of the idea |
@Techcable That might imply that it is like C/C++/Javas fall-through which is not exactly what this RFC proposes. The only commonality is that this RFC and C style switch fall-through can execute the statements of multiple match arms - but in completely different ways. This is not "go to the next case even if the pattern did not match" but rather "execute all matching paths starting with the first one mentioned lexically". This RFC proposes something much less worse than C style fall-throughs. @Nokel81 Another issue you have to clarify in this RFC is how to deal with the scenario when you have multiple patterns being valid but the type of their last expression is other than How do you deal with the following? let x = match topdown expr {
A | B => 1,
C | B => 2,
}; What is the value of Also - I think I'm slightly more in favour of Btw... what is the most important part of this RFC...? The |
@Centril I need to add that to the file. The However, having the last branch's value being chosen also doesn't seem that strange since Rust code is sequential in nature to begin with and some people might want a return value with some side effect. |
There was previously discussion on this in #2003 where folks produced some macro to do roughly this. Also, the one example provided by the RFC can be handled using the ordering
I think If you want to avoid the disjointness checks of Also, if you do want exhaustiveness but not disjointness checks, like this RFC proposes, then you might want unusual exhaustiveness checks that this RFC cannot provide, like asking that two arms get entered or something more subtle. In such cases, you could provide the desired assurance with assertions and tests, but one could imagine more flexible tools that prove specific code invariants hold at compile time. tl;dr We have a mechanism to drop exhaustiveness and disjointness checks via |
Also, if you have complex flow control, then frequently nested functions or closures produce more readable code, partially because they preserve disjointness, and they provide far more flexibility regardless.
In particular, we should observe that this approach supports order of execution impossible with the proposed RFC, meaning the RFC is truly quite a niche construct. |
@burdges Yes I know of the rough macro but even with a macro exhaustiveness checks are not possible. Currently there is no way to have something more subtle and I would disagree that extending Also something as subtle as only entering two branches I would consider part of what you call "complex flow control" and that would able to be done with this RFC and a label for breaking. |
text/0000-overlapping_match.md
Outdated
sensitive language. And adding such a keyword increases the perceptual area of the new syntax | ||
so as to make it clear which type of match is being used. | ||
2. The word `fallthrough` is used because it implies that after a branch is finished then the | ||
control falls through to the check of the next branch. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think fallthrough
implies that it will execute the statements / yield the expressions of the next match arm irrespective of whether the it's pattern holds or not, i.e: readers will expect C++ style switch
behavior.
Instead I'd like to add match many
to the 🚲 🏠
text/0000-overlapping_match.md
Outdated
`else` would be marked as unreadable. | ||
|
||
Edge cases: | ||
1. If the `_` pattern in present in any of the contained matches and the `else` block is also | ||
present then a `unreachable_code` lint is emitted on the code within the `else` block | ||
2. Since the main reason for using a `match` is the exhaustiveness checks as long as there isn't | ||
an `else` block then the compiler will output an error for `non-exhaustive patterns` if not all | ||
branches of the `match/in` are exhaustive. | ||
branches of the `match/fallthrough` are exhaustive. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another possibility would be to use !
(never type) pattern instead of else
to make the syntax more terse.
This would work because let value: T = panic!();
where T
is any type is valid - i.e: bottom inhabits every type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So would it be the following (for clarification):
match many x {
_ if x % 5 == 0 => expr,
! => expr
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely see the appeal of it, since as you say it is much more terse, but I think that it would be more confusing since it is in the area of pat/expr of the expression but does not follow the same "rules" as the other patterns since it will only be executed if no other patterns are executed sort of how the else
block of an if
statement is only executed if none of the other if
s or else if
s are executed
@burdges I agree completely on extending When it comes to exhaustiveness checking, couldn't you simply reuse the current checks of As you argue, this RFC is probably too niche, but it might be worth it to tests its use in nightly and see if it becomes popular...? We could make this an experimental RFC perhaps? Technically implementing this RFC should not be too complicated. As long as it is clearly marked syntactically that the construct is different from an ordinary |
@Centril I think that that would be a good idea but I think it should be a new RFC (I can write it up). However with the exhaustiveness checking, yes it might be possible to reuse the current checks but then there has to be some sort of syntax for what |
… like a C-style `switch` statement and `match all` impies that every branch is automatically matched
What happens to data that gets moved out of a pattern more than once? enum Foo {
Bar(String),
Baz(String),
Quux(String),
}
let foo = Foo::Baz("example".to_string());
match foo in {
Bar(s) | Baz(s) => …,
Quux(s) | Baz(s) => …,
} // ??? |
@pthariensflame A workable semantic would be to simply throw a borrowck error on the second pattern indicating that a move occurred on the first pattern. This fits well with the principle of least surprise. It would still be possible to use a by-ref pattern to avoid the move. |
Btw, our current fn main() {
let x: Result<&str, &str> = Ok("Hello, world!");
match x {
Ok(x) | Err(x) => println!("{}", x)
}
} EDIT: nevermind, I got it. And I don’t see the whole point of that. Use |
Having every branch being checked and have the possibility to execute if matched even if a previous branch has matched |
@rfcbot fcp close This RFC proposes a form of match in which more than one arm can execute (a kind of fall through). This seems like something that would potentially be useful from time to time, but not often enough to merit addition as a major feature, and match blocks can already be a source of stumbling blocks for learners. Moreover, it could readily be implemented as a procedural macro. Therefore, I move to close the RFC. Thanks @Nokel81 for the suggestion, in any case! |
Team member @nikomatsakis has proposed to close this. The next step is review by the rest of the tagged teams: No concerns currently listed. Once these reviewers reach consensus, this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
The final comment period is now complete. |
1 similar comment
The final comment period is now complete. |
Closing as per @nikomatsakis's summary. Thanks again, @Nokel81! |
This is an RFC for adding overlapping match statements using the syntax of
match/in
.Edit: rendered