Syntactic coverage #521

hvub · 2021-11-26T17:36:58Z

Builds on PR #520.

This PR adds a small tool that performs syntax coverage analysis of the TCK scenarios.

The current output is:

Rule                              Total  Queries
Cypher                             3752   3752
Statement                          3752   3752
Query                              3752   3752
RegularQuery                       3743   3735
Union                                16     12
SingleQuery                        3759   3735
SinglePartQuery                    3757   3733
MultiPartQuery                      972    972
UpdatingClause                     1353    318
ReadingClause                      1984   1493
Match                              1534   1268
Unwind                              414    216
Merge                               105     81
MergeAction                          28     25
Create                             1107    117
Set                                  88     85
SetItem                              90     85
Delete                               48     48
Remove                               33     33
RemoveItem                           35     33
InQueryCall                          36     35
StandaloneCall                       17     17
YieldItems                           34     33
YieldItem                            50     33
With                               1646    972
Return                             3633   3610
ProjectionBody                     5279   3624
ProjectionItems                    5278   3623
ProjectionItem                     7274   3616
Order                               341    338
Skip                                 46     46
Limit                               276    276
SortItem                            536    338
Where                              1239    831
Pattern                            2646   1367
PatternPart                        3302   1395
AnonymousPatternPart               3302   1395
PatternElement                     3302   1395
NodePattern                        6059   1403
PatternElementChain                2689    753
RelationshipPattern                2689    753
RelationshipDetail                 2525    657
Properties                          756    152
RelationshipTypes                  1862    290
NodeLabels                         1516    425
NodeLabel                          1561    425
RangeLiteral                        154    145
LabelName                          1561    425
RelTypeName                        1872    290
Expression                            0      0
OrExpression                        123     60
XorExpression                        87     32
AndExpression                       123     60
NotExpression                       181    153
ComparisonExpression               1417    689
AddOrSubtractExpression             393    227
MultiplyDivideModuloExpression      275    156
PowerOfExpression                     1      1
UnaryAddOrSubtractExpression        165    139
StringListNullOperatorExpression    404    338
ListOperatorExpression              167    137
StringOperatorExpression             32     29
NullOperatorExpression              210    172
PropertyOrLabelsExpression         1321    735
Atom                              32341   3679
Literal                           19445   2925
BooleanLiteral                     1022    263
ListLiteral                        1861    976
PartialComparisonExpression        1428    689
ParenthesizedExpression             294    179
RelationshipsPattern                 68     64
FilterExpression                    998    617
IdInColl                           1040    659
FunctionInvocation                 3254   1578
FunctionName                       3254   1578
ExistentialSubquery                  13     10
ExplicitProcedureInvocation          48     47
ImplicitProcedureInvocation           5      5
ProcedureResultField                 21     13
ProcedureName                        53     52
Namespace                          3307   1629
ListComprehension                   283    137
PatternComprehension                 16     16
PropertyLookup                     1383    754
CaseExpression                      251     75
CaseAlternative                     311     75
Variable                          18547   3501
NumberLiteral                     11144   2185
MapLiteral                         2767   1197
Parameter                            89     61
PropertyExpression                   79     73
PropertyKeyName                    8970   1727
IntegerLiteral                    10428   2100
DoubleLiteral                       811    264
SchemaName                        12403   1978
ReservedWord                         25     21
SymbolicName                      34904   3624
LeftArrowHead                       607    100
RightArrowHead                     1612    489
Dash                               5378    753

In the grammar, the rules for expressions are “drop-through”, so even a single literal causes every expression non-terminal to show up in the parse tree. With regard to coverage, that is not very informative. Hence, the numbers above do not count instances of rules that

end with “Expression”,
have fewer than 2 rule children in the parse tree,
have 0 terminal children in the parse tree, and
have fewer than 2 alternatives in the grammar.

With that the expression numbers look much more informative.
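To make the filter concrete, here is a minimal Python sketch of the four conditions. It is not the code in this PR: the `Node` class, the `is_drop_through` function, and the `alternatives` mapping (rule name to number of alternatives in the grammar) are hypothetical names chosen for illustration. It assumes an instance is skipped only when all four conditions hold at once.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical parse-tree node; not the tool's actual data structure."""
    rule: str                                          # grammar rule (non-terminal) name
    rule_children: list = field(default_factory=list)  # child nodes that are rules
    terminal_children: int = 0                         # number of terminal (token) children


def is_drop_through(node, alternatives):
    """Return True if this rule instance should be left out of the counts.

    `alternatives` maps a rule name to the number of alternatives that rule
    has in the grammar (an assumed input for this sketch).
    """
    return (
        node.rule.endswith("Expression")
        and len(node.rule_children) < 2
        and node.terminal_children == 0
        and alternatives[node.rule] < 2
    )


# Assumed alternative counts, for illustration only.
alternatives = {"AddOrSubtractExpression": 1}

# A pure pass-through instance: one rule child, no terminals.
pass_through = Node("AddOrSubtractExpression",
                    [Node("MultiplyDivideModuloExpression")])

# A real addition such as `1 + 2`: two rule children and one operator token.
addition = Node("AddOrSubtractExpression",
                [Node("MultiplyDivideModuloExpression"),
                 Node("MultiplyDivideModuloExpression")],
                terminal_children=1)
```

With these inputs, `is_drop_through(pass_through, alternatives)` is true and `is_drop_through(addition, alternatives)` is false, so only the real addition would be counted.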

The difference between the columns is as follows:

The first column counts all instances of a rule (non terminal) in the parse trees.
The second column counts each rule at most once per When executing query step. For instance, RETURN 123, 456 would have NumberLiteral 2 1. I do not know of any scenario with more than one When executing query step, so you can think of the second column as the number of scenarios whose tested query has at least one instance of the respective rule.
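The two columns can be sketched as two counters. This is an illustrative Python sketch, not the tool's implementation: `count_rules` is a hypothetical name, and each query is assumed to be already reduced to a flat list of the rule names found in its parse tree.

```python
from collections import Counter


def count_rules(queries):
    """Count rule instances in total and once per query.

    `queries` is an iterable of lists of rule names, one list per
    "When executing query" step (an assumed, simplified input format).
    """
    total = Counter()      # first column: every instance
    per_query = Counter()  # second column: at most one per query
    for rules in queries:
        total.update(rules)
        per_query.update(set(rules))
    return total, per_query


total, per_query = count_rules([
    # RETURN 123, 456 (rule list heavily simplified)
    ["Return", "NumberLiteral", "NumberLiteral"],
    # RETURN n (rule list heavily simplified)
    ["Return", "Variable"],
])
```

Here `total["NumberLiteral"]` is 2 while `per_query["NumberLiteral"]` is 1, matching the RETURN 123, 456 example; `Return` appears once in each query, so both of its counts are 2.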

I am still somewhat puzzled that there is practically no syntax that is not covered by a scenario query. However, potential reasons are:

Syntax is used in some scenarios that actually test something else; e.g., mathematical operations are likely to have counts for this reason.
A feature does not have special syntax; e.g., it is exposed through a built-in function or is a special combination of otherwise normal syntax.

However, this is coverage of syntax — not of semantics.

hvub added 3 commits January 13, 2022 10:29

add grammar dependency

51184c4

add syntax coverage counting tool

1d5f3f5

update license header to 2022

a8cc81f

hvub force-pushed the syntactic-coverage branch from ce1c558 to a8cc81f January 13, 2022 09:30
