Add regular expression type to language #1132

Closed
jclark opened this issue Jun 26, 2022 · 3 comments
Labels: Area/Lang (Relates to the Ballerina language specification), Type/Improvement (Enhancement to language design)

Comments

@jclark
Collaborator

jclark commented Jun 26, 2022

This is part of #130.

@jclark
Collaborator Author

jclark commented Jun 26, 2022

This depends to some extent on whether we do #1098 first. If we do, then regular expressions would be a special case of it. But in either case, most things work the same:

  • the regular expression type (which I will refer to as RegExp) is a subtype of anydata (we need camel case: JavaScript uses RegExp; neither RegEx nor Regex seems great to me)
  • not a subtype of json
  • RegExp behaves similarly to a new basic type (although #1098, "Support for well-known datatypes like date/time", will allow other things in that basic type)
  • users can refer to RegExp type using string:RegExp
  • RegExp is a subtype of readonly
  • RegExp does not have storage identity (so is like string in this respect)
  • == equality is based on the original string from which the regexp was parsed
  • === is the same as ==
  • ordering is not defined (or should we define it based on string?)
  • there is a langlib module lang.regexp (could instead call it lang.re); method call syntax r.foo() when r has type RegExp calls regexp:foo(r) similarly to other basic types
  • regexp:RegExp also refers to this new basic type (probably string:RegExp is defined as regexp:RegExp)
  • RegExp values can be created using a regexp-constructor-expr that uses backtick syntax
    • tag is TBD: could be regexp, re or r; current preference is for re
    • this expression is a const expression (so const R = re`x|y`; needs to work without requiring type to be specified)
    • this uses a two-phase parse, like with xml constructor
    • regexp syntax errors are detected at compile time
    • second phase is parsed against the grammar given in #1125 ("Specify regular expression syntax and semantics")
    • insertions are allowed at the same point that an atom is allowed; conceptually insertions happen after parsing; performing insertions must not lead to an invalid regular expression
    • insertions are wrapped in a non-capturing group (?:re)
    • XXX handling of flags for insertions is TBD: should they apply to inserted regexp or not?
  • RegExp values can also be created completely dynamically using fromString
  • toString on RegExp type works analogously to how it works on xml type
  • fromJsonWithType and toJson handle RegExp type analogously to xml type
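
As a rough illustration of why insertions are wrapped in a non-capturing group, the sketch below uses Python's `re` module rather than Ballerina and approximates an insertion by textual substitution. The proposal says insertions conceptually happen after parsing, so this is only an analogy. The `${r}` placeholder handling and the `insert` helper are hypothetical names for illustration, not part of the proposal.

```python
import re


def insert(template: str, inserted: str) -> str:
    """Hypothetical helper: replace the ${r} placeholder with the
    inserted regexp, wrapped in a non-capturing group (?:...)."""
    return template.replace("${r}", f"(?:{inserted})")


# Naive textual substitution: the alternation leaks into the surrounding atoms.
naive = "a${r}b".replace("${r}", "x|y")  # "ax|yb"

# Wrapped substitution: the inserted regexp behaves as a single atom.
wrapped = insert("a${r}b", "x|y")  # "a(?:x|y)b"

# The naive form means "ax" or "yb", not "a", then x-or-y, then "b".
naive_matches_axb = re.fullmatch(naive, "axb") is not None
naive_matches_ax = re.fullmatch(naive, "ax") is not None

# The wrapped form keeps the intended meaning.
wrapped_matches_axb = re.fullmatch(wrapped, "axb") is not None
wrapped_matches_ax = re.fullmatch(wrapped, "ax") is not None
```

Without the group, an inserted `x|y` would change the meaning of the surrounding expression; with it, the insertion occupies exactly the position of one atom, which matches the rule that insertions are only allowed where an atom is allowed.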

@lasinicl
Contributor

lasinicl commented Aug 3, 2022

  • this expression is a const expression (so const R = re`x|y`; needs to work without requiring type to be specified)

Constant expressions are limitedly supported by the compiler ATM.
ballerina-platform/ballerina-lang#13944

@jclark
Collaborator Author

jclark commented Aug 3, 2022

Constant expressions are limitedly supported by the compiler ATM.

That's why I added this bullet point...

jclark self-assigned this Sep 13, 2022
jclark added the Type/Improvement (Enhancement to language design) and Area/Lang (Relates to the Ballerina language specification) labels Sep 13, 2022
jclark added this to the Swan Lake Update 3 milestone Sep 13, 2022
jclark added the status/inprogress (Fixes are in the process of being added) label Sep 13, 2022
jclark added a commit that referenced this issue Sep 13, 2022
jclark added a commit that referenced this issue Oct 7, 2022
jclark added a commit that referenced this issue Oct 8, 2022
jclark closed this as completed in 1d4b83e Oct 10, 2022
jclark removed the status/inprogress (Fixes are in the process of being added) label Oct 10, 2022
jclark added a commit that referenced this issue Oct 10, 2022