Commit

Add Regex link to all regex functions. (#10825)
Adds links to Regex documentation for all regex functions.
jdunkerley authored Aug 15, 2024
1 parent 422fa8c commit d6ca3ea
Showing 10 changed files with 101 additions and 40 deletions.
@@ -120,7 +120,7 @@ type Decimal
`Float`, or `Text` value to a `Decimal`. `dec` does not attach a
warning.

! Error and Warning Conditions in creating and using `Decimal`s
! Error Conditions

- If a `Text` argument is incorrectly formatted, a `Number_Parse_Error`
is thrown.
@@ -198,7 +198,7 @@ type Decimal
`Float`, or `Text` value to a `Decimal`. `dec` does not attach a
warning.

! Error and Warning Conditions in creating and using `Decimal`s
! Error Conditions

- If a `Text` argument is incorrectly formatted, a `Number_Parse_Error`
is thrown.
@@ -260,7 +260,7 @@ type Decimal
`Float`, or `Text` value to a `Decimal`. `dec` does not attach a
warning.

! Error and Warning Conditions in creating and using `Decimal`s
! Error Conditions

- If a `Text` argument is incorrectly formatted, a `Number_Parse_Error`
is thrown.
@@ -330,7 +330,7 @@ type Decimal
`Float`, or `Text` value to a `Decimal`. `dec` does not attach a
warning.

! Error and Warning Conditions in creating and using `Decimal`s
! Error Conditions

- If a `Text` argument is incorrectly formatted, a `Number_Parse_Error`
is thrown.
@@ -243,14 +243,17 @@ Text.characters self =
Find the regular expression `pattern` in `self`, returning the first match
if present or `Nothing` if not found.

For details on Enso's Regex syntax, see the [Help Documentation](https://help.enso.org/docs/using-enso/regular-expressions).

Arguments:
- pattern: The pattern to match `self` against.
- case_sensitivity: Specifies if the text values should be compared case
sensitively.

If an empty regex is used, `find` throws an `Illegal_Argument` error.
! Error Conditions

If a non-default locale is used, `find` throws an `Illegal_Argument` error.
- If an empty regex is used, `find` throws an `Illegal_Argument` error.
- If a non-default locale is used, `find` throws an `Illegal_Argument` error.

> Example
Find the first substring matching the regex.
@@ -262,7 +265,7 @@ Text.characters self =
## This matches `aBc` @ character 11
"aabbbbccccaaBcaaaa".find "a[ab]c" Case_Sensitivity.Insensitive
Text.find : Text -> Case_Sensitivity -> Match | Nothing ! Regex_Syntax_Error | Illegal_Argument
Text.find self pattern=".*" case_sensitivity=Case_Sensitivity.Sensitive =
Text.find self pattern:(Regex|Text)=".*" case_sensitivity=Case_Sensitivity.Sensitive =
case_insensitive = case_sensitivity.is_case_insensitive_in_memory
compiled_pattern = Regex.compile pattern case_insensitive=case_insensitive
compiled_pattern.match self
@@ -273,14 +276,17 @@ Text.find self pattern=".*" case_sensitivity=Case_Sensitivity.Sensitive =
Finds all the matches of the regular expression `pattern` in `self`,
returning a Vector. If not found, will be an empty Vector.

For details on Enso's Regex syntax, see the [Help Documentation](https://help.enso.org/docs/using-enso/regular-expressions).

Arguments:
- pattern: The pattern to match `self` against.
- case_sensitivity: Specifies if the text values should be compared case
sensitively.

If an empty regex is used, `find_all` throws an `Illegal_Argument` error.
! Error Conditions

If a non-default locale is used, `find_all` throws an `Illegal_Argument` error.
- If an empty regex is used, `find` throws an `Illegal_Argument` error.
- If a non-default locale is used, `find` throws an `Illegal_Argument` error.

> Example
Find the substring matching the regex.
@@ -303,14 +309,17 @@ Text.find_all self pattern=".*" case_sensitivity=Case_Sensitivity.Sensitive =

Checks if the whole text in `self` matches a provided `pattern`.

For details on Enso's Regex syntax, see the [Help Documentation](https://help.enso.org/docs/using-enso/regular-expressions).

Arguments:
- pattern: The pattern to match `self` against.
- case_sensitivity: Specifies if the text values should be compared case
sensitively.

If an empty regex is used, `match` throws an `Illegal_Argument` error.
! Error Conditions

If a non-default locale is used, `match` throws an `Illegal_Argument` error.
- If an empty regex is used, `find` throws an `Illegal_Argument` error.
- If a non-default locale is used, `find` throws an `Illegal_Argument` error.

> Example
Checks if whole text matches a basic email regex.
@@ -462,17 +471,19 @@ Text.tokenize self pattern="." case_sensitivity=Case_Sensitivity.Sensitive =
$n: the nth group
$<foo>: Named group `foo`

For details on Enso's Regex syntax, see the [Help Documentation](https://help.enso.org/docs/using-enso/regular-expressions).

Arguments:
- term: The `Text` or `Regex` to find.
- replacement: The text to replace matches with.
- case_sensitivity: Specifies if the text values should be compared case
sensitively.
- only_first: If True, only replace the first match.

If an empty regex is used, `replace` throws an `Illegal_Argument` error.
! Error Conditions

If a non-default locale is used with a regex, `replace` throws an
Illegal_Argument error.
- If an empty regex is used, `find` throws an `Illegal_Argument` error.
- If a non-default locale is used, `find` throws an `Illegal_Argument` error.

> Example
Replace letters in the text "aaa".
37 changes: 22 additions & 15 deletions distribution/lib/Standard/Base/0.0.0-dev/src/Data/Text/Regex.enso
@@ -39,23 +39,30 @@ type Regex
Compile the provided `expression` into a `Regex` that can be used for
matching.

For details on Enso's Regex syntax, see the [Help Documentation](https://help.enso.org/docs/using-enso/regular-expressions).

Arguments:
- expression: The text representing the regular expression that you want to
compile. Must be non-empty.
- case_insensitive: Enables or disables case-insensitive matching. Case
insensitive matching behaves as if it normalises the case of all input
text before matching on it.

If an empty regex is used, `compile` throws an Illegal_Argument error.
! Error Conditions

- If an empty regex is used, `find` throws an `Illegal_Argument` error.
compile : Text -> Boolean -> Regex ! Regex_Syntax_Error | Illegal_Argument
compile expression:Text case_insensitive:Boolean=False =
if expression == '' then Error.throw (Illegal_Argument.Error "Regex cannot be the empty string") else
options_string = if case_insensitive == True then "usgi" else "usg"
compile expression:(Regex|Text) case_insensitive:Boolean=False = case expression of
_ : Regex -> if case_insensitive == expression.case_insensitive then expression else
expression.recompile (if case_insensitive then Case_Sensitivity.In_Sensitive else Case_Sensitivity.Sensitive)
_ : Text ->
if expression == '' then Error.throw (Illegal_Argument.Error "Regex cannot be the empty string") else
options_string = if case_insensitive == True then "usgi" else "usg"

internal_regex_object = Panic.catch Syntax_Error (Prim_Text_Helper.compile_regex expression options_string) caught_panic->
Error.throw (Regex_Syntax_Error.Error (caught_panic.payload.message))
internal_regex_object = Panic.catch Syntax_Error (Prim_Text_Helper.compile_regex expression options_string) caught_panic->
Error.throw (Regex_Syntax_Error.Error (caught_panic.payload.message))

Regex.Value case_insensitive internal_regex_object
Regex.Value case_insensitive internal_regex_object

## PRIVATE
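
Review note: the hunk above lets `compile` take either `Text` or an existing `Regex`, and reuse the compiled value when the case-sensitivity flag already matches. As a rough illustration only, the same dispatch can be sketched in Python with the standard `re` module. This is an analogue of the idea, not Enso code; `compile_pattern` is a hypothetical name and `ValueError` stands in for `Illegal_Argument`.

```python
import re
from typing import Union


def compile_pattern(expression: Union[str, re.Pattern],
                    case_insensitive: bool = False) -> re.Pattern:
    """Hypothetical analogue: accept text or an already compiled pattern."""
    if isinstance(expression, re.Pattern):
        already_insensitive = bool(expression.flags & re.IGNORECASE)
        # Reuse the compiled pattern when the requested flag already matches.
        if already_insensitive == case_insensitive:
            return expression
        # Otherwise recompile the original source with the flag toggled.
        flags = expression.flags & ~int(re.IGNORECASE)
        if case_insensitive:
            flags |= int(re.IGNORECASE)
        return re.compile(expression.pattern, flags)
    # Text input: reject the empty pattern, mirroring the documented error.
    if expression == "":
        raise ValueError("Regex cannot be the empty string")
    return re.compile(expression, re.IGNORECASE if case_insensitive else 0)
```

Passing a compiled pattern back in with the same flag returns the same object, so callers that accept "text or regex" arguments can normalise their input once without paying for a second compilation.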

@@ -162,13 +169,13 @@ type Regex
ICON split
Splits the `input` text based on the pattern described by `self`.

This method will _always_ return a vector. If no splits take place, the
vector will contain a single element (equal to the original string).

Arguments:
- input: The text to split based on the pattern described by `self`.
- only_first: If true, only split at the first occurrence.

This method will _always_ return a vector. If no splits take place, the
vector will contain a single element (equal to the original string).

> Example
Split on the first instance of the pattern.
pattern = Regex.compile "cd"
@@ -236,11 +243,6 @@ type Regex
Replace all occurrences of the pattern described by `self` in the `input`
with the specified `replacement`.

Arguments:
- input: The text in which to perform the replacement(s).
- replacement: The literal text with which to replace any matches.
- only_first: If True, only replace the first match.

If this method performs no replacements it will return the `input` text
unchanged.

@@ -251,6 +253,11 @@ type Regex
$n: the nth group
$<foo>: Named group `foo`

Arguments:
- input: The text in which to perform the replacement(s).
- replacement: The literal text with which to replace any matches.
- only_first: If True, only replace the first match.

> Example
Replace letters in the text "aa".

@@ -26,19 +26,19 @@ type Named_Pattern
## Matches any character that is not in the ASCII range (0x00-0x7F).
Non_ASCII

## Matches a tab character.
## Matches any tab characters.
Tabs

## Matches any single alphabetic character (both lowercase and uppercase).
## Matches any alphabetic characters (both lowercase and uppercase).
Letters

## Matches any single digit.
## Matches any digits.
Numbers

## Matches any single punctuation character from the set: comma, period, exclamation mark, question mark, colon, semicolon, single quote, double quote, parenthesis.
## Matches any punctuation characters from the set: comma, period, exclamation mark, question mark, colon, semicolon, single quote, double quote, parenthesis.
Punctuation

## Matches any single character that is not an alphabetic character, digit, or whitespace.
## Matches any characters that are not an alphabetic character, digit, or whitespace.
Symbols

## PRIVATE
10 changes: 10 additions & 0 deletions distribution/lib/Standard/Database/0.0.0-dev/src/DB_Column.enso
@@ -1420,6 +1420,16 @@ type DB_Column
This method follows the exact replacement semantics of the
`Text.replace` method.

If regex is used the replacement string can contain references to groups
matched. The following syntaxes are supported:
$0: the entire match string
$&: the entire match string
$n: the nth group
$<foo>: Named group `foo`

The exact syntax of the regular expression is dependent on the database
engine.

Arguments:
- term: The term to find.
- replacement: The text to replace matches with.
10 changes: 10 additions & 0 deletions distribution/lib/Standard/Database/0.0.0-dev/src/DB_Table.enso
@@ -2906,6 +2906,16 @@ type DB_Table

This method follows the exact replacement semantics of `Text.replace`.

If regex is used the replacement string can contain references to groups
matched. The following syntaxes are supported:
$0: the entire match string
$&: the entire match string
$n: the nth group
$<foo>: Named group `foo`

The exact syntax of the regular expression is dependent on the database
engine.

Arguments:
- columns: Specifies columns by a name, index or regular expression to
match names, or a Vector of these.
9 changes: 9 additions & 0 deletions distribution/lib/Standard/Table/0.0.0-dev/src/Column.enso
@@ -1426,6 +1426,15 @@ type Column
This method follows the exact replacement semantics of the
`Text.replace` method.

If regex is used the replacement string can contain references to groups
matched. The following syntaxes are supported:
$0: the entire match string
$&: the entire match string
$n: the nth group
$<foo>: Named group `foo`

For details on Enso's Regex syntax, see the [Help Documentation](https://help.enso.org/docs/using-enso/regular-expressions).

Arguments:
- term: The term to find. Can be `Text`, `Regex`, or a `Column` of
strings.
@@ -5,6 +5,7 @@ import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
import Standard.Base.Errors.Unimplemented.Unimplemented
import Standard.Base.System.File.Generic.Writable_File.Writable_File
from Standard.Base.Metadata import make_single_choice, Widget
from Standard.Base.Widget_Helpers import make_regex_text_widget

import project.Errors.Invalid_JSON_Format
import project.Internal.Expand_Objects_Helpers
@@ -109,6 +110,8 @@ Table.from_objects value (fields : Vector | Nothing = Nothing) =
(with the column name taken from the group name if the group is named in the
regex).

For details on Enso's Regex syntax, see the [Help Documentation](https://help.enso.org/docs/using-enso/regular-expressions).

Arguments:
- pattern: The regular expression as either `Text` or `Regex` to search within the text.
- case_sensitivity: Specifies if the text values should be compared case
@@ -121,8 +124,9 @@ Table.from_objects value (fields : Vector | Nothing = Nothing) =
If the marked groups are named, the names will be used otherwise the column
will be named `Column <N>` where `N` is the number of the marked group.
(Group 0 is not included.)
@pattern make_regex_text_widget
Text.parse_to_table : Text | Regex -> Case_Sensitivity -> Boolean -> Problem_Behavior -> Table ! Type_Error | Regex_Syntax_Error | Illegal_Argument
Text.parse_to_table self (pattern : Text | Regex) case_sensitivity=Case_Sensitivity.Sensitive parse_values=True on_problems:Problem_Behavior=..Report_Warning =
Text.parse_to_table self (pattern : Text | Regex) case_sensitivity:Case_Sensitivity=..Sensitive parse_values=True on_problems:Problem_Behavior=..Report_Warning =
Parse_To_Table.parse_text_to_table self pattern case_sensitivity parse_values on_problems

## PRIVATE
17 changes: 15 additions & 2 deletions distribution/lib/Standard/Table/0.0.0-dev/src/Table.enso
@@ -94,8 +94,8 @@ polyglot java import org.enso.table.parsing.problems.ParseProblemAggregator

## Represents a column-oriented table data structure.
type Table
## GROUP Standard.Base.Constants
ICON data_output
## GROUP Standard.Base.Input
ICON data_input
Creates a new table from a vector of `[name, items]` pairs.

Arguments:
@@ -1451,6 +1451,8 @@ type Table
The new columns will be named with the name of the input column with a
incrementing number after.

For details on Enso's Regex syntax, see the [Help Documentation](https://help.enso.org/docs/using-enso/regular-expressions).

Arguments:
- column: The name or index of the column to tokenize the text of.
- pattern: The pattern used to find within the text.
@@ -1476,6 +1478,8 @@ type Table
together; otherwise the whole match is returned.
The values of other columns are repeated for the new rows.

For details on Enso's Regex syntax, see the [Help Documentation](https://help.enso.org/docs/using-enso/regular-expressions).

Arguments:
- column: The name or index of the column to tokenize the text of.
- pattern: The pattern used to find within the text.
@@ -2958,6 +2962,15 @@ type Table
This method follows the exact replacement semantics of the
`Text.replace` method.

If regex is used the replacement string can contain references to groups
matched. The following syntaxes are supported:
$0: the entire match string
$&: the entire match string
$n: the nth group
$<foo>: Named group `foo`

For details on Enso's Regex syntax, see the [Help Documentation](https://help.enso.org/docs/using-enso/regular-expressions).

Arguments:
- columns: The column(s) to replace values on.
- term: The term to find. Can be `Text`, `Regex`, or a `Column` of
3 changes: 0 additions & 3 deletions test/Base_Tests/src/Data/Text/Regex_Spec.enso
@@ -48,14 +48,11 @@ add_specs suite_builder =

group_builder.specify "passing a non-string should fail with a type error" <|
Test.expect_panic_with (Regex.compile 12) Type_Error
p = Regex.compile "[a-z]"
Test.expect_panic_with (Regex.compile p) Type_Error

suite_builder.group "Escape" group_builder->
group_builder.specify "should escape an expression for use as a literal" <|
Regex.escape "[a-z\d]+" . should_equal '\\[a-z\\d\\]\\+'


suite_builder.group "Pattern.matches" group_builder->
group_builder.specify "should return True when the pattern matches against the input" <|
pattern = Regex.compile "(.. .. )(?<letters>.+)()??(?<empty>)??"
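
Review note: several docstrings in this commit list the replacement references `$0`, `$&`, `$n` and `$<foo>`. To make those four forms concrete, here is a small Python sketch that translates them into the `\g<...>` templates understood by Python's `re.sub`. It is an illustration under stated assumptions, not Enso's implementation: `to_python_template` and `_REFERENCE` are hypothetical names, and escaping rules for a literal `$` are not modelled.

```python
import re

# Matches "$&", "$<name>", or "$n" (one or more digits, including "$0").
_REFERENCE = re.compile(r"\$(?:(&)|<([A-Za-z_][A-Za-z0-9_]*)>|(\d+))")


def to_python_template(replacement: str) -> str:
    """Hypothetical helper: rewrite $-style references as Python templates."""
    def convert(match: re.Match) -> str:
        whole, name, number = match.groups()
        if whole:
            return r"\g<0>"          # "$&" is the entire match
        if name:
            return rf"\g<{name}>"    # "$<foo>" is the named group foo
        return rf"\g<{number}>"      # "$n" is the nth group; "$0" is the match

    # Escape literal backslashes first so re.sub does not treat them as escapes.
    escaped = replacement.replace("\\", "\\\\")
    return _REFERENCE.sub(convert, escaped)


if __name__ == "__main__":
    # Swap two words using a numbered and a named reference.
    template = to_python_template("$2 $<first>")
    print(re.sub(r"(?P<first>\w+) (\w+)", template, "hello world"))
```

In this sketch `$0` and `$&` both map to the whole match, which mirrors the documentation's statement that the two forms are equivalent.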
