Add proposal for isomorphic pattern matching in response to opencyphe…

= CIP2017-01-18 - Isomorphic Matching Semantics
:numbered:
:toc:
:toc-placement: macro
:source-highlighter: codemirror

*Author:* Stefan Plantikow <stefan[email protected]>

This proposal is a response to CIR-2017-174.

=== Proposal: Add new uniqueness modes

It is proposed to add the capability to select one of three uniqueness modes for a uniqueness scope:

* `MATCH ALL`: Impose no uniqueness requirements on candidate matches
* `MATCH UNIQUE RELATIONSHIPS`: Only consider candidate matches that are relationship-unique
* `MATCH UNIQUE NODES`: Only consider candidate matches that are node-unique

The default uniqueness mode used by `MATCH` (without a further specification of the preferred uniqueness mode) is relationship-unique matching.
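
As a rough illustration of the three modes, the following Python sketch models a candidate match as the lists of node and relationship identifiers it binds, and checks it against a mode. The `Uniqueness` enum, the `accepts` function and the list-based representation are hypothetical names introduced here and are not part of the proposal. The sketch also assumes that node-uniqueness only needs to inspect nodes.

```python
from enum import Enum


class Uniqueness(Enum):
    """Hypothetical names for the three proposed uniqueness modes."""
    ALL = "MATCH ALL"
    RELATIONSHIPS = "MATCH UNIQUE RELATIONSHIPS"
    NODES = "MATCH UNIQUE NODES"


def all_distinct(items):
    """True if no identifier occurs more than once."""
    return len(set(items)) == len(items)


def accepts(mode, nodes, relationships):
    """Decide whether a candidate match passes the given uniqueness mode."""
    if mode is Uniqueness.ALL:
        return True  # no uniqueness requirement at all
    if mode is Uniqueness.RELATIONSHIPS:
        return all_distinct(relationships)
    return all_distinct(nodes)  # Uniqueness.NODES (assumed: nodes only)


# Candidate match a -[r1]-> b -[r2]-> a: revisits node a, reuses no relationship.
nodes = ["a", "b", "a"]
relationships = ["r1", "r2"]

for mode in Uniqueness:
    print(mode.value, accepts(mode, nodes, relationships))
```

In this example the candidate passes `MATCH ALL` and the default relationship-unique mode, but is rejected under `MATCH UNIQUE NODES` because node `a` is bound twice.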

`MATCH ALL` does not reject any paths, not even paths containing cycles, and can therefore lead to infinite result sets for the whole query.
It is recommended that implementations generate at least a warning when static analysis cannot prove that a query terminates under the chosen uniqueness mode.

This approach to specifying uniqueness could conceivably be extended later by adding further ways to restrict uniqueness.

=== Proposal: Specifying the uniqueness mode of a subquery

Changing the uniqueness mode of a subquery recursively changes the default uniqueness mode for all `MATCH` clauses it contains, unless the mode is overridden again at a deeper level. Examples:

* `MATCH <uniqueness-modes> { MATCH ... } ...`
* `DO <uniqueness-modes> { MATCH ... } ...`
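
The inheritance rule can be sketched in Python as a walk over nested clauses in which the nearest enclosing explicit mode wins. The `Clause` class, the `effective_modes` function and the clause names are hypothetical and serve only to illustrate the rule; the proposal does not prescribe any such data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Default stated by the proposal: relationship-unique matching.
DEFAULT_MODE = "UNIQUE RELATIONSHIPS"


@dataclass
class Clause:
    """Hypothetical model of a clause or subquery with an optional explicit mode."""
    name: str
    mode: Optional[str] = None
    children: List["Clause"] = field(default_factory=list)


def effective_modes(clause: Clause, inherited: str = DEFAULT_MODE) -> Dict[str, str]:
    """Map each clause name to the uniqueness mode that applies to it."""
    mode = clause.mode or inherited  # an explicit mode overrides the inherited one
    result = {clause.name: mode}
    for child in clause.children:
        result.update(effective_modes(child, mode))
    return result


query = Clause("outer", "UNIQUE NODES", [
    Clause("inner_match"),                            # inherits UNIQUE NODES
    Clause("inner_all", "ALL", [Clause("nested_match")]),  # overrides, then inherits ALL
])

for name, mode in effective_modes(query).items():
    print(name, "->", mode)
```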

=== Proposal: Default uniqueness mode

Additionally, it is proposed that a conforming implementation should provide a pre-parser option for defining the default uniqueness mode used by regular pattern matching:

* `unique=nodes` for configuring node-uniqueness as the default for `MATCH`
* `unique=relationships` for configuring relationship-uniqueness as the default for `MATCH`

=== Proposal: Path classes

Graph theory has defined various classes of paths.
Cypher so far only supports a single notion of path.

To improve expressivity and to help prevent the generation of infinite result sets when working with non-unique matches, it is proposed to introduce additional predicates for testing paths:

* `open(p)`: true if the start and the end node of `p` are not the same node
* `closed(p)`: true if the start and the end node of `p` are the same node
* `trail(p)`: true if `p` contains no duplicate relationships
* `simple(p)`: true if `p` contains no duplicate relationships and either no duplicate nodes at all or the start node and the end node are the same node
* `trek(p)`: true if `p` contains two identical consecutive relationships
* `repetetive(p)`: true if `p` contains any closed subpath `q` of `size > 1` that is immediately repeated after itself in `p`
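
The predicates above can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the `Path` type is hypothetical, the functions carry an `is_` prefix (partly to avoid shadowing Python's built-in `open`), `size` is assumed to count relationships, and the closed case of `simple` is read as "the only repeated node is the shared start and end node".

```python
from typing import List, NamedTuple


class Path(NamedTuple):
    """Hypothetical path model: node ids n0..nk and the relationship ids between them."""
    nodes: List[str]
    rels: List[str]


def all_distinct(items):
    return len(set(items)) == len(items)


def is_open(p):
    return p.nodes[0] != p.nodes[-1]


def is_closed(p):
    return p.nodes[0] == p.nodes[-1]


def is_trail(p):
    return all_distinct(p.rels)


def is_simple(p):
    # Assumed reading: apart from a shared start/end node, no node repeats.
    if not is_trail(p):
        return False
    if all_distinct(p.nodes):
        return True
    return is_closed(p) and all_distinct(p.nodes[1:])


def is_trek(p):
    # Follows the wording of the list: two identical relationships in a row.
    return any(a == b for a, b in zip(p.rels, p.rels[1:]))


def is_repetetive(p):
    # Assumes size(q) counts the relationships of the closed subpath q.
    n = len(p.rels)
    for k in range(2, n // 2 + 1):        # length of q
        for i in range(n - 2 * k + 1):    # start offset of q in p
            closed = p.nodes[i] == p.nodes[i + k]
            repeated = p.rels[i:i + k] == p.rels[i + k:i + 2 * k]
            if closed and repeated:
                return True
    return False


line = Path(["a", "b", "c"], ["r1", "r2"])
triangle = Path(["a", "b", "c", "a"], ["r1", "r2", "r3"])
looped = Path(["a", "b", "a", "b", "a"], ["r1", "r2", "r1", "r2"])

print(is_simple(line), is_simple(triangle), is_repetetive(looped))
```

Here `line` is open and simple, `triangle` is closed and simple, and `looped` is repetetive because the closed subpath `r1, r2` is immediately followed by itself.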

Using `repetetive` makes it possible to ensure that variable-length path matching without uniqueness yields a finite result set:

[source, Cypher]
----
MATCH ALL p=(a)-[*]->(b), (b)-[*2..4]->(c) WHERE NOT repetetive(p)
RETURN p
----

Note that these functions naturally extend to lists.

Path predicates may be used to further restrict which paths are enumerated by pattern matching.
All uniqueness modes naturally correspond to default path classes:

* Non-uniqueness implies no restrictions on the path class.
* Relationship-uniqueness implies that all matched paths are trails.
* Node-uniqueness implies that all matched paths are simple paths.
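
The correspondence can be made concrete with a small Python sketch that enumerates walks in a toy graph and filters them per mode. The graph, the `walks` helper and the `MAX_LEN` bound are all hypothetical; the bound is needed only so that the unrestricted enumeration terminates on a graph that contains a cycle.

```python
# Toy directed graph: relationship id -> (source node, target node).
GRAPH = {
    "r1": ("a", "b"),
    "r2": ("b", "a"),
    "r3": ("b", "c"),
}
MAX_LEN = 4  # assumed bound; without it the cycle a -> b -> a never ends


def walks(start, max_len):
    """Yield (nodes, rels) for every directed walk from start with 1..max_len relationships."""
    stack = [([start], [])]
    while stack:
        nodes, rels = stack.pop()
        if rels:
            yield nodes, rels
        if len(rels) < max_len:
            for rel, (src, dst) in GRAPH.items():
                if src == nodes[-1]:
                    stack.append((nodes + [dst], rels + [rel]))


def all_distinct(items):
    return len(set(items)) == len(items)


all_walks = list(walks("a", MAX_LEN))                         # non-uniqueness
rel_unique = [w for w in all_walks if all_distinct(w[1])]     # trails
node_unique = [w for w in all_walks if all_distinct(w[0])]    # no repeated node

print(len(all_walks), len(rel_unique), len(node_unique))
```

In this graph every node-unique walk is also relationship-unique, and raising `MAX_LEN` only increases the number of unrestricted walks, which is the source of the termination concern for `MATCH ALL`.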

== Benefits to this proposal

Cypher is able to express more general classes of patterns.

== Caveats to this proposal

Non-uniqueness allows for non-terminating queries.

The proposal moderately increases language complexity.