Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pattern matching #213

Closed
wants to merge 53 commits into from
Closed

Pattern matching #213

wants to merge 53 commits into from

Conversation

cottand
Copy link

@cottand cottand commented May 9, 2020

Proposal

The goal of this proposal is to bring full blown pattern matching to Kotlin.

Pattern matching is a tried and tested feature of many languages, such as Haskell, Rust, Scala, Ruby, and in the near future, Java.

In a nutshell, it is a quick and intuitive way of checking whether an object is of some type, and to extract and check its contents. The propsal aims to do this through the existing is and destructuring componentN() semantics.

Pattern matching is a major addition to a language, and as such several aspects of the proposal are up to debate. Discussion and contributions are welcome.

Here is a short example of what Kotlin with pattern matching could look like:

sealed class Download
data class App(val name: String, val developer: Person) : Download()
data class Movie(val title: String, val director: Person) : Download()
val download: Download = // ...

// Without pattern matching:
val result = when(download) {
  is App -> {
    val (name, dev) = download
    when(dev) {
      is Person -> 
        if(dev.name == "Alice") "Alice's app $name" else "Not by Alice"
      else -> "Not by Alice"
    }
  }
  is Movie -> {
    val (title, diretor) = download
    if (director.name == "Alice") {
      "Alice's movie $title"
    } else {
      "Not by Alice"
    }
  }
}

// With pattern matching:
val result = when(download) {
  is App(name, Person("Alice", _)) -> "Alice's app $name"
  is Movie(title, Person("Alice", _)) -> "Alice's movie $title"
  is App, Movie -> "Not by Alice"
}

For more details, you can read a rendered version of the proposal here.

cottand added 28 commits May 7, 2020 23:42
@He-Pin
Copy link

He-Pin commented May 9, 2020

With case not is, or match

@cottand
Copy link
Author

cottand commented May 9, 2020

I purposefully aimed to not introduce a new keyword (like match ), because that would be a breaking change

@cottand
Copy link
Author

cottand commented May 23, 2020

Good remark. I think the expected behaviour would be to only call each deconstructor once. The main case for this is avoiding an extra function call every time and benefitting from smart casts as usual.

I don't see a scenario where the alternative would be more desirable.

@elizarov
Copy link
Contributor

elizarov commented May 25, 2020

Here's a really short personal review of this proposal (disclaimer: it does not necessarily represent the opinion of the Kotlin team on the pattern matching):

👍 Being able to combine type test and guard expression in when expression (e.g. is JsonArray where { it.isEmpty() } is great. I'm not a fan of this specific syntax (and it is breaking, since where is not a hard keyword), but it is a good start anyway. It is a simple, helpful, and nice feature that can be moved to a separate proposal.

👉 Use of positional destructuring needs a careful review from a tooling perspective. is Customer(name, age, addr) looks clear, but we already have complains about the existing positional destructuring in Kotlin that it can cause subtle problems when refactoring the order of parameters. These concerns have to be addressed before doubling down on position destructuring and making it even more widely spread in the language.

👎 Introducing a new age declaration by Person(age) when age is "an undefined identifier" sets a dangerous precedent. Nowhere in Kotlin declaration/usage depends on such non-local information as whether a name "is already defined is scope" (whatever it means). Scopes in Kotlin can contain tons of available declarations coming from a plethora of different files. While it is definitely easy for the compiler to track, it would be impossible for a human reading the code to figure out what this piece of code does (check the person has a previously computed age or destruct a person's age into a newly introduced name). Even if you produce a version of code that actually works and does what it supposed to do, it becomes extremely easy to break without getting any compilation error by unrelated changes in other files. I do not think that this kind of approach is compatible with Kotlin's vision of a reliable programming language for large-scale software development.

@Kroppeb
Copy link

Kroppeb commented May 25, 2020

👍 Being able to combine type test and guard expression in when expression (e.g. is JsonArray where { it.isEmpty() } is great. I'm not a fan of this specific syntax (and it is breaking, since where is not a hard keyword), but it is a good start anyway. It is a simple, helpful, and nice feature that can be moved to a separate proposal.

I would suggest to go with if instead of where and also allow it to be used in a case such as this

when(val item = getItem()){
	is Book -> ...
	is Bottle -> ...
	if(item.canBreak()) -> ...
	// or potentially
	if item.canBreak ->
	else -> ...
}

Now, if you have a long list of is Type in your when, and realise you need a slightly different boolean expression, we need to remove our subject and test individually. This syntax would allow adding arbitrary boolean expressions to a when with subject

👎 Introducing a new age declaration by Person(age) when age is "an undefined identifier" sets a dangerous precedent. Nowhere in Kotlin declaration/usage depends on such non-local information as whether a name "is already defined is scope" (whatever it means). Scopes in Kotlin can contain tons of available declarations coming from a plethora of different files. While it is definitely easy for the compiler to track, it would be impossible for a human reading the code to figure out what this piece of code does (check the person has a previously computed age or destruct a person's age into a newly introduced name).

Yeah, this would be weird behaviour. I thought we were planning to use Person(val age) to extract a value, but idk whether that hasn't been updated in the proposal yet, or if this needed more discussion.

@cottand
Copy link
Author

cottand commented May 25, 2020

Thanks for your inputs, @elizarov , @Kroppeb !

Being able to combine type test and guard expression in when expression (e.g. is JsonArray where { it.isEmpty() } is great. I'm not a fan of this specific syntax (and it is breaking, since where is not a hard keyword), but it is a good start anyway. It is a simple, helpful, and nice feature that can be moved to a separate proposal.

I am not married to this syntax, and would also be completely fine with moving it into a new proposal. I do believe though that both guards and pattern matching in general should be designed closely in consideration of each other so they are nice to use if both ever make it into the language.

Use of positional destructuring needs a careful review from a tooling perspective. is Customer(name, age, addr) looks clear, but we already have complains about the existing positional destructuring in Kotlin that it can cause subtle problems when refactoring the order of parameters. These concerns have to be addressed before doubling down on position destructuring and making it even more widely spread in the language.

I agree, and it was also discussed in Slack that Kotlin could potentially suffer from matches like is (_, _, _, _, age) for large data classes (some see this as a problem in Scala).

Potentially, this could be solved by matching on named properties. The proposal quickly takes a look at the idea but could not come up with nice syntax. Suggestions are welcome!

Introducing a new age declaration by Person(age) when age is "an undefined identifier" sets a dangerous precedent. Nowhere in Kotlin declaration/usage depends on such non-local information as whether a name "is already defined is scope" (whatever it means). Scopes in Kotlin can contain tons of available declarations coming from a plethora of different files. While it is definitely easy for the compiler to track, it would be impossible for a human reading the code to figure out what this piece of code does (check the person has a previously computed age or destruct a person's age into a newly introduced name).

Yeah, this would be weird behaviour. I thought we were planning to use Person(val age) to extract a value, but idk whether that hasn't been updated in the proposal yet, or if this needed more discussion.

This concern was raised early when this proposal was submitted, and I think 2 solutions exist: explicitly declaring new variables through val (like @Kroppeb suggests) or just not allow matching on defined identifiers (as in that file would not compile at all) thus only matching on literals (this is what Haskell does).

This is discussed in the Design Decisions section (in particular here). I am personally against having to type val every time in favour of equality checks on identifiers with guards only, but again, I am not married to the syntax and think it does need further discussion.

In examples:

val age = 2
when (person) {
is Person(_, age) -> ...
...

would not be valid, much like one cannot write:

val age = 3
...
val age = 2.3

I would like to note that this is at present valid Kotlin:

val a : String.() -> Unit = {
  val length = 3.0f
  val x : Float = length
}

In short, the destructuring this proposal suggests would be similar to the semantics of delcaring something new through val.

I would suggest to go with if instead of where and also allow it to be used in a case such as this

when(val item = getItem()){
	is Book -> ...
	is Bottle -> ...
	if(item.canBreak()) -> ...
	// or potentially
	if item.canBreak ->
	else -> ...
}

I fail to see how this syntax combines a type test and a guard. Could you please provide another example? For example, an equivalent of:

when(val x = getSomeObject()) {
    is Bottle where { it.canBreak() } -> // x is a breakable Bottle
    is Bottle -> // x is an unbreakable Bottle
    is Book -> // x is not a bottle at all
    else ->
}

where only a Bottle has canBreak() (and not a Book).

Or did I misunderstand and you meant something along the lines of:
(using my example above)

when(val x = getSomeObject()) {
    is Bottle if (x.canBreak()) -> // x is a breakable Bottle
                  else ->    // x is an unbreakable Bottle
    is Book -> // x is not a bottle at all
    else -> 
}

If this is the case, I do personally like not having to write the type test for each match. It also acts more like Haskell pattern matching: a type match is performed, then a series of guards, then the next type match.

@Kroppeb
Copy link

Kroppeb commented May 25, 2020

I fail to see how this syntax combines a type test and a guard. Could you please provide another example? For example, an equivalent of:

when(val x = getSomeObject()) {
    is Bottle where { it.canBreak() } -> // x is a breakable Bottle
    is Bottle -> // x is an unbreakable Bottle
    is Book -> // x is not a bottle at all
    else ->
}

where only a Bottle has canBreak() (and not a Book).

Or did I misunderstand and you meant something along the lines of:
(using my example above)

when(val x = getSomeObject()) {
    is Bottle if (x.canBreak()) -> // x is a breakable Bottle
                  else ->    // x is an unbreakable Bottle
    is Book -> // x is not a bottle at all
    else -> 
}

If this is the case, I do personally like not having to write the type test for each match. It also acts more like Haskell pattern matching: a type match is performed, then a series of guards, then the next type match.

The example I write was not about pattern matching.
My suggested syntax would be

when(val x = getSomeObject()) {
     is Bottle if (x.canBreak()) -> // x is a breakable Bottle
     is Bottle -> // x is an unbreakable Bottle
     is Book -> // x is not a bottle at all
     else ->
 }

However what I was talking about was having the type test optional and just having a guard alone also be valid, so you could rewrite the following

val x = getSomeObject()
when{
	x is Book -> // use book
	isTrash(x) -> // throw away
	x is Chair -> // use chair
	x is Table -> // use table
}

into

when(val x = getSomeObject()){
	is Book -> // use book
	if(isTrash(x)) -> // throw away
	is Chair -> // use chair
	is Table -> // use table
}

* Add subsection on 'if' guards as proposed by @Kroppeb

* Remove additional else in chained guards

* Add haskell source and format

Also make the arrow Option example exhaustive
@cottand
Copy link
Author

cottand commented Jun 17, 2020

As per @elizarov 's and @Kroppeb 's suggestions, I decided to include an alternative if
syntax for guards in the proposal. If it proves popular or someone from the Kotlin team
backs it, I am happy with replacing the where syntax altoghether and change all exmaples
accordingly.

The same goes for other 'alternatives' of the proposal (like writing nested is and
explicitly writing val, etc).

Personally, I am happy with this suggestion as it allows chained guards, and looks more
like the already familiar if structure.

However what I was talking about was having the type test optional and just having a guard alone also be valid

Additionally, I think this proposal is mainly concerned with pattern matching and this suggestion
could deserve its own separate discussion. But again, if it proves popular I could inlcude it.

@Wlodas
Copy link

Wlodas commented Jun 26, 2020

Relating to the original proposal:

val result = when(download) {
  is App(name, Person("Alice", _)) -> "Alice's app $name"
  is Movie(title, Person("Alice", _)) -> "Alice's movie $title"
  is App, Movie -> "Not by Alice"
}

This probably won't work as expected, because name and title variables are just read from outer scope. Full example:

val name = "someName"
val title = "someTitle"

val result = when(download) {
  is App(name, Person("Alice", _)) -> "Alice's app $name"
  is Movie(title, Person("Alice", _)) -> "Alice's movie $title"
  is App, Movie -> "Not by Alice"
}

To indicate that we want to extract data and not just match variable from outer scope we should somehow declare "new" variables. For example like this:

val result = when(download) {
  is App(val name, Person("Alice", _)) -> "Alice's app $name"
  is Movie(val title, Person("Alice", _)) -> "Alice's movie $title"
  is App, Movie -> "Not by Alice"
}

Another thing I don't like in current implementation is how declared subject variable is just smart-casted, but you can't declare new variable name for each case. This is how I would like it to work more less:

val result = when(download) {
  is val app: App -> "App: $app"
  is val movie: Movie -> "Movie: $movie"
}

or without val keyword:

val result = when(download) {
  is app: App -> "App: $app"
  is movie: Movie -> "Movie: $movie"
}

@cottand
Copy link
Author

cottand commented Jun 26, 2020

Hi @Wlodas !
All the remarks you make are addressed in the proposal already, so please read carefully before posting. I'll go through them anyway:

This probably won't work as expected, because name and title variables are just read from outer scope

As defind in the Semantics section, this is a compile error (conflicting declarations). It's clear and will avoid unexpected mistakes.

To indicate that we want to extract data and not just match variable from outer scope we should somehow declare "new" variables.

Discussed as an alternative to current syntax.

Another thing I don't like in current implementation is how declared subject variable is just smart-casted, but you can't declare new variable name for each case.

I assume that waht you want is to 'name' a match. This is briefly discussed here, and is called an as-pattern or binder. In the example you provide, you don't even extract new variables, you simply rename the download variable. I personally think smart casting covers this use case, and that the specific example you give is not particularily tied to pattern matching.

@LifeIsStrange
Copy link

The Java pattern matching proposal has recently evolved, if you're interested for inspiration:
https://mail.openjdk.java.net/pipermail/amber-spec-experts/2020-August/002318.html

@vlsi
Copy link

vlsi commented Aug 21, 2020

Do you think regexp destructuring is a case for pattern matching?

For instance, suppose there's a regexp for version parsing:

Regex("(?<major>\d+).(?<minor>\d+)(?:.(?<patch>\d+))")

it would be nice to "pattern match" regexp outcomes to major: Int, minor: Int, and patch: Int? variables somehow.

@cottand
Copy link
Author

cottand commented Aug 21, 2020

Hi @vlsi

I think that, as the proposal stands right now, something like this would be possible:

val pattern = """(?<major>\d+).(?<minor>\d+)(?:.(?<patch>\d+))"""

val comment = when (pattern.toRegex().matchEntire("1.2.3")?.destructured) {
  null -> "not a match!"
  is ("0", _, _) -> "some experimental stuff"
  is (major, minor, patch) -> "major version number is $major"
}

But specifics (like matching on something nullable and smart casting afterwards, syntax) would need to be discussed some more. Let me know what you think and whether this is what you had in mind.

@vlsi
Copy link

vlsi commented Aug 21, 2020

I would like if the matching was implemented by name rather than by position.

For instance:

val pattern = """(?<major>\d+).(?<minor>\d+)(?:.(?<patch>\d+))"""

val (major: Int, minor: Int, patch: Int?) by pattern.toRegex().matchEntire("1.2.3")

// The order should not matter, so the regex groups should be recognized by name
val (minor: Int, patch: Int?, major: Int) by pattern.toRegex().matchEntire("1.2.3")

@cottand
Copy link
Author

cottand commented Aug 21, 2020

Matching by name is currently not part of the proposal.

I feel like what you suggested here

// The order should not match
val (minor: Int, patch: Int?, major: Int) by pattern.toRegex().matchEntire("1.2.3")

...has to do more with tying delegates and destructuring, rather than matching inside of when (which is the scope of the KEEP).

@vlsi
Copy link

vlsi commented Aug 21, 2020

What do you think if typing inside when was shifted towards destructuring by name?

For instance:

val pattern = """(?<major>\d+).(?<minor>\d+)(?:.(?<patch>\d+))"""

when (pattern.toRegex().matchEntire("1.2.3")) {
    // Named destructuring
    (major: Int, minor: Int, patch: Int?) -> ...
    // constructor-like destructuring
    MatchResult(_) -> w
    else -> null  
}

@cottand
Copy link
Author

cottand commented Aug 21, 2020

We discussed this a bit with @rnett but we could not come up with nice syntax. What you propose

 (major: Int, minor: Int, patch: Int?) -> ...

is fine for destructuring only, but how would you write that for a nested pattern? For example a Pair<Option<Int>, Option<String>>. Even then, I still think it would not achieve the effect you are looking for, since you say you also want a match to fail if the position and the name don't match (if I understood this reply correctly).

Additionally, I'm not sure we could destructure regex by name: you'd still not have properties named like the groups in the regex.

I suggest you make a PR detailing how you envision that with added details :)

@hfhbd
Copy link

hfhbd commented Apr 26, 2021

FYI: This would be the Swift code, which also uses let/val as variable declaration. But Swift has no auto casting, so you will need a title, name variable in the case. https://docs.swift.org/swift-book/LanguageGuide/ControlFlow.html#ID129

struct Person: Equatable {
    let name: String
}

enum Download: Equatable {
    case app(name: String, developer: Person)
    case movie(title: String, director: Person)
}

let download: Download = .app(name: "App", developer: Person(name: "Alice"))
        
switch download {
    case .app(let name, Person(name: "Alice")):
        print("Alice's app \(name)")
    case .movie(let title, Person(name: "Alice")):
        print("Alice's movie \(title)")
    default:
        print("Not by Alice")
}

You can use where and move let to the front too:

case .movie(let title, let person):
    print("\(person.name)'s movie \(title)")    
// you can move the let to the front
case let .movie(title, person):              
    print("\(person.name)'s movie \(title)")
// using with another where clause
case let .movie(title, person) where title.count <= 3: 
    print("\(person.name)'s movie with a short title: \(title)")

@elizarov
Copy link
Contributor

I've reopened https://youtrack.jetbrains.com/issue/KT-186 language design request for Pattern Matching and added a "wish list" for features and qualities that a good Kotlin-style design for pattern matching should strive to achieve. That issue might be a better place to discuss what pattern matching in Kotlin should and should not do since it does not propose any particular design, but just establishes some basics and calls for a more open-ended discussion, consistently with the Contribution Guidelines for the Kotlin language as spelled out in this KEEP's readme.

@cottand
Copy link
Author

cottand commented Apr 30, 2021

Thanks for re-opening @elizarov

I had been meaning to discuss this proposal's shortcomings (in particular nominal destructuring). I posted over at https://youtrack.jetbrains.com/issue/KT-186

@LifeIsStrange
Copy link

LifeIsStrange commented May 20, 2021

@elizarov Java 17 is officially getting pattern matching support in switch:
https://mail.openjdk.java.net/pipermail/jdk-dev/2021-May/005579.html
(release in 4 months)
However proper deconstruction pattern support will not come until OpenJDK 18

@elizarov
Copy link
Contributor

Let's move all further discussion to https://youtrack.jetbrains.com/issue/KT-186 Closing this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.