Tests for duplicate named capture groups #3704
Comments
@Ms2ger and I split up the work in the tables above.
ptomato added a commit to ptomato/test262 that referenced this issue on Oct 28, 2022:
Parse-time syntax for RegExp literals is already tested. These two files test runtime RegExp compilation, with respect to duplicate named capture groups. See: tc39#3704
ptomato added a commit to ptomato/test262 that referenced this issue on Oct 28, 2022:
These tests should cover the full functionality of the .groups object (and the .indices.groups object, in the case of the /d flag) for RegExp.p.exec and String.p.match, where DNCG stands for duplicate named capture group:
- Matched DNCG has a result
- Unmatched DNCG is present and undefined
- DNCG matched in previous iteration but not in current iteration is treated as unmatched
- Iteration order of properties corresponds with source order
See: tc39#3704
Ms2ger pushed four commits carrying the same two commit messages on Nov 2, 2022 (two of them to ptomato/test262), and added six further commits that referenced this issue: four on Nov 3, 2022 and two on Nov 4, 2022.
Duplicate named capture groups reached Stage 3 in July 2022.
Summary
The proposal in a nutshell: (?<name>regexp) is the existing regexp syntax for a "named capture group". Prior to this proposal it was forbidden to have two named capturing groups with the same name in one regexp. Now, it is permissible to have capturing groups with the same name, as long as they are in separate alternatives within the regexp. In other words:
- (?<a>x)|(?<a>y) is allowed
- (?<a>x)(?<a>y) is still an error
Testing plan
I'll make some tables. The checkmarks (☑) show tests that already exist.
There are syntax changes and behavioural changes to test.
Syntax
The syntax tests are basically:
- Duplicate names in separate alternatives ((?<a>x)|(?<a>y)) should compile correctly
- Duplicate names in the same alternative ((?<a>x)(?<a>y)) should not compile

There are three entry points that compile RegExp syntax:
- RegExp literals (at parse time)
- new RegExp() (at runtime)
- RegExp.prototype.compile()

These tests are pretty simple. The tests for RegExp literals already exist. To me it makes sense to have similar ones for compiling a RegExp from a string. If we test it for new RegExp(), it's likely redundant for R.p.compile(), but it's a small amount of work.
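A minimal sketch of the two runtime syntax checks, written as plain JavaScript rather than in the test262 harness format. The helper names (compilesWithConstructor, compilesWithCompile) are hypothetical and only for illustration. The result for the separate-alternatives pattern depends on whether the engine running the code implements the proposal, so no expected output is claimed for it.

```javascript
// Hypothetical helper: true if `source` compiles via new RegExp(), false on SyntaxError.
function compilesWithConstructor(source, flags = "") {
  try {
    new RegExp(source, flags);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e; // anything else is unexpected
  }
}

// Hypothetical helper: same check through RegExp.prototype.compile().
// Returns undefined if the engine does not provide compile() at all.
function compilesWithCompile(source, flags = "") {
  if (typeof RegExp.prototype.compile !== "function") return undefined;
  try {
    /(?:)/.compile(source, flags);
    return true;
  } catch (e) {
    if (e instanceof SyntaxError) return false;
    throw e;
  }
}

const separateAlternatives = "(?<a>x)|(?<a>y)"; // should compile under the proposal
const sameAlternative = "(?<a>x)(?<a>y)";       // should not compile

const report = {
  constructorSeparate: compilesWithConstructor(separateAlternatives),
  constructorSame: compilesWithConstructor(sameAlternative),
  compileSeparate: compilesWithCompile(separateAlternatives),
  compileSame: compilesWithCompile(sameAlternative),
};
console.log(report);
```

In an engine that implements the proposal, the two "Separate" entries should be true and the two "Same" entries false; an engine without support reports false for all four.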
In table form:
Behaviour
Behaviour that should be tested:
- Basic functionality of a RegExp with duplicate named capture groups
- A backreference (\k<name> is the existing syntax for backreference to a named capture group) contains the reference to the alternative which actually matched (e.g. (?:(?<a>x)|(?<a>y))\k<a> matches xx, yy, not xy, yx)
- A backreference to a group that did not match in any alternative matches the empty string (e.g. ^(?:(?<a>x)|(?<a>y)|z)\k<a>$ matches xx, yy, and z, not zz)
- A backreference in an alternative that does not contain the group, as in (?<a>x)|\k<a>, is treated as if the capture group didn't match (i.e., the backreference matches unconditionally)
- A group inside a quantifier (+, {2}, etc.), which didn't match but did match on a previous iteration, is still treated as if it didn't match (e.g. ^(?:(?<a>x)|(?<a>y)|z){2}\k<a>$ matches xz, yz, not xzx, yzy)

The entry points where we could test behaviour of a RegExp with duplicate named capture groups:
I'd say that from a perspective of usefulness to engines, it would be diminishing returns to test all of these entry points. After all, an implementation using different regular expression engines for different entry points doesn't seem a likely failure mode. @Ms2ger and I discussed this and we think it makes sense to write tests for the first item (basic functionality) for all non-symbol methods, and pick one RegExp and one String method to test the other four items (RegExp.p.exec and String.p.match).
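The backreference behaviours listed above can be sketched as follows, using the same example patterns from this issue with RegExp.prototype.test. This is an illustration under the assumption that the engine implements the proposal. The helper names (tryCompile, checkBackreferences) are hypothetical, and the function returns null where duplicate named groups are not supported.

```javascript
// Hypothetical helper: compile at runtime, or return null if the engine rejects the syntax.
function tryCompile(source, flags = "") {
  try {
    return new RegExp(source, flags);
  } catch (e) {
    if (e instanceof SyntaxError) return null;
    throw e;
  }
}

function checkBackreferences() {
  const whichMatched = tryCompile("(?:(?<a>x)|(?<a>y))\\k<a>");
  const noneMatched = tryCompile("^(?:(?<a>x)|(?<a>y)|z)\\k<a>$");
  const previousIteration = tryCompile("^(?:(?<a>x)|(?<a>y)|z){2}\\k<a>$");
  if (!whichMatched || !noneMatched || !previousIteration) return null;

  return {
    // Backreference refers to the alternative that actually matched.
    xx: whichMatched.test("xx"),
    yy: whichMatched.test("yy"),
    xy: whichMatched.test("xy"),
    yx: whichMatched.test("yx"),
    // Backreference to a group that did not match behaves as the empty string.
    z: noneMatched.test("z"),
    zz: noneMatched.test("zz"),
    // A match from a previous quantifier iteration does not carry over.
    xz: previousIteration.test("xz"),
    yz: previousIteration.test("yz"),
    xzx: previousIteration.test("xzx"),
    yzy: previousIteration.test("yzy"),
  };
}

// This pattern has no duplicate names, so it compiles in engines without the proposal too.
const unconditional = new RegExp("(?<a>x)|\\k<a>").test("");

const behaviour = checkBackreferences();
console.log({ unconditional, behaviour });
```

According to the list above, a conforming engine should report true for xx, yy, z, xz and yz, and false for xy, yx, zz, xzx and yzy.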
In table form:
Results on match objects
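On returned match objects, there are some additional things that should be tested. For each named capture group, there's a property with the same name on the match object's groups object, and if the RegExp had the d flag, on the match object's indices.groups object as well.

For each of these objects (match.groups, match.indices.groups) we should test:
- A matched duplicate named capture group has a result
- An unmatched duplicate named capture group is present and undefined
- A group matched in a previous iteration but not in the current one is treated as unmatched
- Iteration order of properties corresponds with source order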
The entry points that return match objects that can be influenced by RegExps with duplicate named capture groups:
Same here, I think it's sufficient to test this functionality for RegExp.p.exec and String.p.match.
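As a rough sketch of what such checks could look at, the snippet below reads match.groups and match.indices.groups through RegExp.prototype.exec and String.prototype.match. The patterns and the helper names (compileOrNull, inspectMatchObjects) are made up for illustration and are not taken from the actual tests. The function returns null in engines that do not support duplicate named capture groups.

```javascript
// Hypothetical helper: compile at runtime, or return null if the engine rejects the syntax.
function compileOrNull(source, flags = "") {
  try {
    return new RegExp(source, flags);
  } catch (e) {
    if (e instanceof SyntaxError) return null;
    throw e;
  }
}

function inspectMatchObjects() {
  // The d flag adds the `indices` property to match objects.
  const re = compileOrNull("(?<a>x)|(?<a>y)|z", "d");
  // Illustrative pattern where b appears before a in source order.
  const ordered = compileOrNull("(?<b>x)(?<a>y)|(?<a>z)(?<b>w)", "d");
  if (!re || !ordered) return null;

  const matched = re.exec("y");   // second alternative matches
  const unmatched = "z".match(re); // neither group named a participates
  const orderedMatch = ordered.exec("zw");

  return {
    matchedValue: matched.groups.a,
    matchedIndices: matched.indices.groups.a,
    unmatchedPresent: "a" in unmatched.groups,
    unmatchedValue: unmatched.groups.a,
    unmatchedIndicesPresent: "a" in unmatched.indices.groups,
    unmatchedIndices: unmatched.indices.groups.a,
    // Per the list above, key order should correspond with source order.
    groupKeys: Object.keys(orderedMatch.groups),
    indicesKeys: Object.keys(orderedMatch.indices.groups),
  };
}

const observed = inspectMatchObjects();
console.log(observed);
```

The quantifier case (a group matched in a previous iteration but not the current one) could be covered the same way by wrapping the alternation in a repeated group such as (?:...){2} and inspecting groups after the match.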
In table form:
[Table: four rows for groups and four rows for indices.groups; cell contents not preserved.]
cc @bakkot