Early version of RFC: Pattern matching ? #245
Comments
I have nothing in particular to add, except for the fact that pattern matching is often done on algebraic data types. Nim today has almost algebraic data types, via variant objects, but:
If pattern matching lands in stdlib, it would be nice if at least the second of these issues is solved. If I remember correctly there is another RFC for that. Other than this, I would be happy with just putting gara in the stdlib, as it seems the most complete option today |
Ideally, pattern matching should work well with AST manipulation, since recognizing structural patterns in the AST tends to be 50% of a macro's job. That being said, I've only ever needed the kind of pattern matching provided by something like py-good. I can't recall ever having needed it anywhere else. |
Additional arguments in favor of pattern matching in stdlib
A lot of popular programming languages (almost all, with the exception of Go and C, which position themselves as "simple" languages with very few functional features) either already support pattern matching or have existing proposals for it.
I'm not a big functional programmer myself, but I think Nim is already a really good functional programming language: closures, sum types, optional types etc. Support for custom operators and UFCS makes the transition even more seamless. Pattern matching is an important addition to the list of features that would allow Nim to position itself as a functional programming language. Use
|
More programming languages using Pattern Matching:
|
@haxscramper cool! but please take a look at impl-s of patty/ast-pattern-matching/gara . They do some of those, so this might be cool. Also please keep in mind the goal of zero overhead : expanded code should be usually similar to hand-written condition/extraction(not always happening in those libs tho) Awesome! |
@haxscramper thanks and Also: can you write down a bit more detailed test spec/proposal here or in your own rfc : that was one of the goals of the rfc in my mind, to discuss a more official dsl |
After some thinking I came to the conclusion that it is not necessary to specifically annotate captured variables with a prefix, since it really differs from any other Nim DSL in stdlib. I started with Capture and submatch can be done using infix Although I think this is a really slippery slope because there will be a lot of different behaviors of the match depending on the written expression (e.g. One thing I think is really necessary is to make it look closer to pattern matching in a functional language and not enum case-of v2.0. Default Nim convention for enum naming is to have
More examples
Create value to compare against

macro e(body: untyped): untyped =
  case body[0]:
    of ForStmt([$ident, _, $expr]):
      quote do:
        9
    of ForStmt([$ident, Infix([%ident(".."), $rbegin, $rend]),
                $body]):
      quote do:
        `rbegin` + `rend`
    else:
      quote do:
        90

let a = e:
  for i in 10 .. 12:
    echo i

assertEq a, 22

I would prefer to write
Iflet macro

macro ifLet2(head: untyped, body: untyped): untyped =
  case head[0]:
    of Asgn([$lhs is Ident(), $rhs]):
      quote do:
        let expr = `rhs`
        if expr.isSome():
          let `lhs` = expr.get()
          `body`
    else:
      head[0].assertNodeKind({nnkAsgn})
      head[0][0].assertNodeKind({nnkIdent})
      head[0].raiseCodeError("Expected assgn expression")

ifLet2 (nice = some(69)):
  echo nice

ifLet2 (`nice` = some(69)):
  echo nice

It uses some of my own node kind assertions, but these can be removed when the design is finalized. Although, due to the complexity of pattern matching, I think having something like this to show errors would be nice. But that is a topic for another RFC/PR.
Non-goals
Other goals
If the case statement macro is moved to the 'stable' part of the language (i.e. no longer hidden behind the experimental switch), this can be done quite easily, I think.
Some comments on other libraries
Patty
I don't think matching non-tuple types by position only is a good idea, as it makes matches really fragile in case new fields are added (which is much more likely to happen than with a regular anonymous tuple).
ast-pattern-matching
Supports only Nim nodes |
@alehander92 just an addition to what I wrote above in the RFC comment for pattern matching: I think it is better to first collect as many potential ideas for syntax and features as possible and then sort through them to determine whether each one is really necessary. But there are several things I'm really certain about
|
The only thing I really care about is supporting if conditions with the pattern like so:

case n.kind
of foo and n.len == 4:
  actionA
else:
  actionB

This might require a language extension as a macro can only support it well via code duplication. |
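For reference, a minimal sketch of what the guard above has to turn into with today's `case` statement when written by hand. The `Kind` enum, the `Node` object and the string results are made up for illustration and do not come from any library; the only point is that the fallback action ends up written twice.

```nim
type
  Kind = enum
    foo, bar
  Node = object
    kind: Kind
    len: int

proc classify(n: Node): string =
  # Hand-written equivalent of:
  #   of foo and n.len == 4: actionA
  #   else: actionB
  case n.kind
  of foo:
    if n.len == 4:
      result = "actionA"
    else:
      result = "actionB" # fallback, first copy
  else:
    result = "actionB"   # fallback, second copy

doAssert classify(Node(kind: foo, len: 4)) == "actionA"
doAssert classify(Node(kind: foo, len: 3)) == "actionB"
doAssert classify(Node(kind: bar, len: 4)) == "actionB"
```

A macro that rewrites the guarded branch has to produce the same shape, which is where the duplication of `actionB` comes from.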
@Araq, This is already supported thanks to UFCS - I just generate access path to the field in object, without differentiating between fields and function calls

macro e(body: untyped): untyped =
  expandMacros:
    case body:
      of Bracket(len: in {3 .. 6}):
        newLit(expr.toStrLit().strVal() & " matched")
      else:
        newLit("not matched")

echo e([2,3,4])
echo e([3, 4])

Generated code block:

let expr = body
if kind(expr) == nnkBracket and contains({3..6}, len(expr)):
  newLit(strVal(toStrLit(expr)) & " matched")
else:
  newLit("not matched") |
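The UFCS point can be checked in isolation with plain Nim: a field and a one-argument proc are both reachable through the same `x.name` syntax, so generated code such as `len(expr)` does not need to know which of the two it is dealing with. `Counter`, `count` and `doubled` below are hypothetical names used only for this sketch.

```nim
type
  Counter = object
    count: int          # a real field

proc doubled(c: Counter): int =
  # A proc that reads like a field thanks to UFCS.
  c.count * 2

let c = Counter(count: 4)

# Field access and proc call use the same dotted syntax.
doAssert c.count == 4
doAssert c.doubled == doubled(c)

# The same membership test the generated code uses for `len: in {3 .. 6}`.
doAssert contains({3 .. 6}, c.count)
doAssert c.doubled notin {3 .. 6}
```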
@haxscramper thanks!
sounds good
iirc it seemed not hard to implement, but this is not a top goal indeed
makes sense: I'll look at the bigger comment later
yep! |
please keep in mind that probably nim nodes should work in a similar way to other variant types(even if nim nodes are defined as a more special type): we don't want to overspecialize for one case ! Also, exhaustiveness checking would be useful even if not for all possible situations 👍 |
There is zero specific distinction for Nim node in a way I currently implement it - I have a little hack that just takes Exhaustiveness check is possible in theory but only for simpler patterns and it would make things more complicated as it would be necessary to either write custom checker or somehow lift |
iirc exhaustiveness checking isn't really that hard . Also we want to take advantage of generation of case if possible: pattern matching should be zero overhead. Of course those are not top priorities, so probably they don't need to be implemented now |
basically for compound patterns you need to compose the simpler checks and if you already have those: this might not be too hard, but this is just old brainstorming, so I might be very wrong about it |
tl;dr - maybe I'm missing something, but to me it seems like extremely difficult task. Whether or not it is possible to check for exhaustiveness is a question that should be delayed until after we have finalized set of supported features. For example set checks |
disclaimer: don't really look hard into this: maybe you're right and it's not so useful for Nim ! hm, I might not understand all the details .. but this means that it might be good to design the dsl in a way that it makes it a bit more prone to exhaustiveness checking. but what's the problem with set / array checks? we would have just several data structures to check for, I guess and yeah, as it's not only about the shape, the analysis can always be a bit more conservative, but a useful thing would be to just autogenerate a case which is not being handled as a warning: this is out of scope tho, so I am just discussing ideas |
More todo
|
Replacing special prefixes with keywords?
# With special keywords like `add`, `until`, `incl`
[any @leading, until @middle is "d", any @trailing]
[any, Patt(), any]
# With prefix annotations like `*`
[*@leading, @middle is "d", *@trailing]
[.._, Patt(), .._]

Variadic match would use And just in general would you rather prefer DSL to introduce keywords or work with special symbols like We will still have And |
Match expression
Tuple matching

case (true, false):
  of (true, true) | (false, false): 3
  else: 2

Object/named tuple matching
Matching of fields is performed using To match case object you can either use Object matches can contain either
Matching optionas (
Thanks to UFCS I can treat functions as regular fields, making it possible to have patterns like

macro e(body: untyped): untyped =
  case body:
    of Bracket([Bracket(len: in {1 .. 3})]):
      newLit("Nested bracket !")
    of Bracket(len: in {3 .. 6}):
      newLit(expr.toStrLit().strVal() & " matched")
    else:
      newLit("not matched")

echo e([2,3,4])
echo e([[1, 3, 4]])
echo e([3, 4])
Sequence matching
Sequence elements can be either matched exactly (for example Right now I have two ideas for expression syntax - first one relies on use of symbols like
I'm more in favor of keyword-heavy syntax because
Set matching
Key-value pairs matching
case %{"hello" : %"world"}:
  of {"999": _}: "nice"
  of {"hello": _}: "000"
  else: "discard" |
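As a point of comparison, here is roughly what the key-value match above amounts to when written by hand with `std/json`. This is only a sketch of the intended semantics (the first branch whose key is present wins, and `_` accepts any value), not the code the macro actually generates.

```nim
import std/json

let node = %*{"hello": "world"}

# Branches are tried top to bottom; `_` accepts any value for the key.
let res =
  if node.hasKey("999"): "nice"
  elif node.hasKey("hello"): "000"
  else: "discard"

doAssert res == "000"
```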
@haxscramper Just to clarify: Is this different from |
@hlaaftana No, it is not different, and the comma version should work the same way it works in a regular case. So
of (true, false), (false, true):
of (true, false) | (false, true):
should be identical |
Another use case for pattern matching is input data validation. One of the main use cases for pattern matching is macro implementation. Use I withdraw my previous statement about support for custom predicates - it is not really that hard to implement, so there is no reason to avoid it, but it will be very useful for input validation.
I think pattern matching (data destructuring) should at least have all features of data synthesis (object construction, json input, array comprehension etc), so support for validation is a good thing to have.

{
  # Match pattern, otherwise execute `doError`
  "key": @a is ("2" | "3") isnot doError(),
  # execute regular code
  "key": @a is ("2" | "3") isnot (
    echo "Expected `2` or `3`"
  ),
  # Do nothing on fail
  "key": @a is ("2" | "3"),
  # Execute callback if match failed
  "key": "2" | "3" isnot doError(),
  # Check for match
  "key": "2" | "3",
  "key": _.isString() isnot (echo "Expected string for key"),
  "key": @a.isInt() isnot (echo "expected integer")
}

(
  fld: ForStmt([
    # First element must be an `nnkIdent("..")`
    == ident("..") isnot (
      echo "Expected infix `..` on ", it.lineInfoObj()
    ),
    # Second element should have kind `IntLit()` and will be bound
    # to injected variable `a`
    @a is IntLit() isnot (
      echo "Expected integer literal `..`", it.lineInfoObj()
    ),
    @a is (
      IntLit() or StrLit()
    ) isnot (
      echo "Expected either integer literal or string but found " &
        $it.lineInfoObj()
    )
  ])
)

https://github.com/Originate/lodash-match-pattern
|
Final draft of the syntax
To me it looks good enough, so most likely I will be implementing this or something really similar to it.
Supported structures for patterns
Similar to the flexible serialization RFC #247 and nimtrs, several kinds of collections are supported, which are differentiated based on pattern match syntax.
Element access
Where
It is possible to have mixed access for objects. Mixed object access via
Checks
Notation:
Examples
Variable binding
Match can be bound to a new variable. All variable declarations happen via
Bind order
Bind order: if a check evaluates to true, the variable is bound immediately, making it possible to use it in other checks. A variable is never rebound: once bound, it keeps the value of the first binding.
Bind variable type
Matching different things
Sequence matching
Input sequence:
Greedy patterns match until the end of a sequence and cannot be followed by anything else. For a sequence to match, it must either be completely matched by all subpatterns or have trailing
More use examples
Tuple matching
Input tuple:
Case object matching
Input AST

ForStmt
  Ident "i"
  Infix
    Ident ".."
    IntLit 1
    IntLit 10
  StmtList
    Command
      Ident "echo"
      IntLit 12

KV-pairs matching
Input json string

{"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {"value": "New", "onclick": "CreateNewDoc()"},
      {"value": "Open", "onclick": "OpenDoc()"},
      {"value": "Close", "onclick": "CloseDoc()"}
    ]
  }
}}
Option matching
Footnotes
1 it might be possible to generate
2 you can,
3 even though it opens possibilities for
4 consecutiveGreedy
5 Trailing
6 actually you can also use
7 First match for
8 Binding of variables to a pattern that contains an alternative is only supported if the variable occurs once in the expression. Reason: there is no backtracking for match. First tuple element matches to
9 if your |
@alehander92 initial implementation as described in the draft is mostly complete (sets and predicate calls (
You can try current version by installing In addition to syntax described above I would also like to introduce two operators

template `:=`*(lhs, rhs: untyped): untyped = assertMatch(rhs, lhs)
template `?=`*(lhs, rhs: untyped): untyped = matches(rhs, lhs)

Which would allow sequence and nested tuple unpacking, similar to #194, unpack and definesugar module. rust-like

(@a, (@b, @c), @d) := (1, (2, 3), 4)
[all @head] := [1,2,3]

if (Some(@x) ?= some("hello")) and
   (Some(@y) ?= some("world")):
  assertEq x, "hello"
  assertEq y, "world"
else:
  discard |
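For readers unfamiliar with the `?=` form, the `if` above is meant to behave like the following hand-written check with `std/options`. This is a sketch of the semantics, under the assumption that `Some(@x)` succeeds exactly when the option holds a value; it is not the expansion produced by the library, and the names `first`, `second` and `matched` are only illustrative.

```nim
import std/options

let first = some("hello")
let second = some("world")

var matched = false
if first.isSome and second.isSome:
  # Bindings become available only after both checks succeed.
  let x = first.get
  let y = second.get
  doAssert x == "hello"
  doAssert y == "world"
  matched = true

doAssert matched
```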
Implementation is completed (feature-wise) in nim-lang/fusion#33. Some additional notes in this comment, but everything else is complete (as far as I'm concerned) |
This bug should be closed, pattern matching already exists and works. |
I'm not sure if it's already too late to change stuff, but here are some opinions (
Don't get me wrong, I love pattern matching and I appreciate the effort you make, but I think we can do better (especially since this is in |
It's not too late but further progress needs my upcoming "let/var inside expressions" RFC. IMHO. |
I think this RFC can be closed, because there nothing else to add to it. "let expressions" should be discussed in a separate RFC (when it comes out), and particular details of pattern matching implementation should be discussed in |
Is it possible to make capturing matched values to variables not so ugly?

const b = 2
case some_seq
of [a, (b)]: ... # b is constant and = 2
of [a, b]: ... # b = some_seq[1]
else: ...

instead of

const b = 2
case some_seq
of [@a, b]: ... # b is constant and = 2
of [@a, @b]: ... # b = some_seq[1]
else: ... |
Current implementation won't be changed, but when/if we get let expressions this part of the syntax could be revised. Drop the requirements for |
A bit late to the party. It's strange that Elixir/Erlang was not mentioned among the other languages, as it has one of the most powerful and elegant pattern matching capabilities: https://elixir-lang.org/getting-started/pattern-matching.html and https://elixir-lang.org/getting-started/case-cond-and-if.html |
Prolog also wasn't mentioned, and it seems like Elixir adopted a large portion of the list operations from its

import fusion/matching
import std/[macros, options]

{.experimental: "caseStmtMacros".}

macro matchCall(procs: untyped): untyped =
  var patterns: seq[tuple[pattern: NimNode, funcName: string]]
  result = newStmtList()

  var topParams: tuple[name: string, arg0, type0Name, returnType: NimNode]
  for idx, pr in pairs(procs):
    pr.assertMatch:
      ProcDef:
        Ident(strVal: @name)
        @trmTemplate # Term rewriting template
        @genParams # Generic params
        FormalParams:
          @returnType
          all @arguments
        @pragmas
        _ # Reserved
        @implementation

    topParams.name = name

    arguments[0].assertMatch: # FIXME handles only one-argument functions
      IdentDefs:
        @arg0Name
        CurlyExpr[@type0Name, @pattern]
        _

    topParams.returnType = returnType
    topParams.type0Name = type0Name
    topParams.arg0 = arg0Name

    let funcName = name & "impl" & $idx

    result.add nnkProcDef.newTree(
      ident(funcName),
      trmTemplate,
      genParams,
      nnkFormalParams.newTree(@[
        returnType, nnkIdentDefs.newTree(arg0Name, type0Name, newEmptyNode())
      ]),
      pragmas,
      newEmptyNode(),
      implementation
    )

    patterns.add((pattern, funcName))

  var dispatchImpl = nnkCaseStmt.newTree(topParams.arg0)
  for (patt, funcName) in patterns:
    dispatchImpl.add nnkOfBranch.newTree(
      patt, nnkReturnStmt.newTree(newCall(funcName, topParams.arg0)))

  result.add nnkProcDef.newTree(
    ident(topParams.name),
    newEmptyNode(),
    newEmptyNode(),
    nnkFormalParams.newTree(@[
      topParams.returnType,
      nnkIdentDefs.newTree(topParams.arg0, topParams.type0Name, newEmptyNode())
    ]),
    newEmptyNode(),
    newEmptyNode(),
    dispatchImpl
  )

  echo result.repr

matchCall:
  proc pattern(a: NimNode{Infix[Ident(strVal: == "+"), .._]}): NimNode =
    echo "Matches plus"
    result = newEmptyNode()

  proc pattern(a: NimNode{Infix[Ident(strVal: == "-"), .._]}): NimNode =
    echo "Matches minus"
    result = newEmptyNode()

macro usesPattern(body: untyped): untyped =
  return pattern(body)

usesPattern(12 + 2)
usesPattern(2 - 2) |
Pattern matching for Nim
Pattern matching can be a good addition to the stdlib/fusion. It seems to be becoming more mainstream:
This is an early version of an RFC: we need to decide what to do before having a more complete document.
We already have several pattern matching libraries :
A first question is, what do we want:
Motivation
TODO
Many more examples in gara, ast_pattern_matching and patty's docs: TODO when a general approach is decided on
Constraints
Research
It would be useful to research a bit more Rust, Haskell, OCaml, Python, Ruby, Scala, Erlang, Prolog, Elixir
We can also compare several possible designs like the Python folks did in their preparation for the feature.
Next steps
Future directions
Not sure
Disclaimer
I am the author of one of the libs, gara, so I might be biased.
pinging @krux02 and @andreaferretti, who probably have something to say on the topic