A library developer should be able to select matching names given a list (#3220)
radeusgd authored Jan 20, 2022
1 parent ed0e918 commit 107128a
Showing 8 changed files with 342 additions and 1 deletion.
@@ -0,0 +1,57 @@
from Standard.Base import all

import Standard.Base.Error.Warnings

## Specifies how to handle problems.
type Problem_Behavior
    ## UNSTABLE
       Ignore the problem and attempt to complete the operation.
    type Ignore

    ## UNSTABLE
       Report the problem as a warning and attempt to complete the operation.
    type Report_Warning

    ## UNSTABLE
       Report the problem as a dataflow error and abort the operation.
    type Report_Error

## UNSTABLE
   Attaches an error-value to the given value according to the expected problem
   behavior.

   If the problem behavior is set to Ignore, the value is returned as-is.
   If it is set to Report_Warning, the value is returned with the error-value
   attached as a warning.
   If it is set to Report_Error, the error-value is returned in the form of a
   dataflow error.

   TODO: the Warning_System argument is temporary, as the warning system is
   mocked until the real implementation is shipped. It will be removed soon.
attach_as_needed : Any -> Problem_Behavior -> Any -> Warning_System -> Any
attach_as_needed decorated_value problem_behavior ~payload warnings=Warnings.default =
    case problem_behavior of
        Ignore ->
            decorated_value
        Report_Warning ->
            warnings.attach decorated_value payload
        Report_Error ->
            case decorated_value of
                _ -> Error.throw payload

## UNSTABLE
   Attaches issues to the given value according to the expected problem
   behavior.

   If the problem behavior is set to Ignore, the value is returned as-is.
   If it is set to Report_Warning, the value is returned with the issues
   attached as warnings.
   If it is set to Report_Error, the first issue is returned in the form of a
   dataflow error.

   TODO: the Warning_System argument is temporary, as the warning system is
   mocked until the real implementation is shipped. It will be removed soon.
attach_issues_as_needed : Any -> Problem_Behavior -> Vector -> Warning_System -> Any
attach_issues_as_needed decorated_value problem_behavior issues warnings=Warnings.default =
    issues.fold decorated_value value-> issue->
        here.attach_as_needed value problem_behavior issue warnings=warnings
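
The following is a minimal usage sketch of `attach_as_needed`, not part of the commit. It assumes the module is importable under the path used by `Matching.enso` in this commit (`Standard.Base.Error.Problem_Behavior`); `Example_Problem` is a hypothetical payload type introduced only for illustration.

```enso
from Standard.Base import all
from Standard.Base.Error.Problem_Behavior as Problem_Behavior_Module import all

# Hypothetical payload type, used only in this sketch.
type Example_Problem

main =
    # Ignore: the decorated value is returned as-is.
    ignored = Problem_Behavior_Module.attach_as_needed 42 Ignore Example_Problem
    IO.println ignored

    # Report_Error: the payload is thrown as a dataflow error;
    # `catch` recovers the payload so that it can be inspected.
    failed = Problem_Behavior_Module.attach_as_needed 42 Report_Error Example_Problem
    IO.println failed.catch
```

Neither call touches the warning system, so the default `warnings` argument is left in place here.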
@@ -0,0 +1,26 @@
from Standard.Base import all

## PRIVATE
   A placeholder for reporting warnings. It should be replaced once the warning
   mechanism is designed and implemented.
type Warning_System
    type Warning_System (warning_callback : Any -> Nothing)

    ## UNSTABLE
       Attaches a warning to a value.

       If the warning argument holds a dataflow error, the error is also
       inherited by the decorated value.
    attach : Any -> Any -> Any
    attach decorated_value warning_payload =
        case decorated_value of
            _ ->
                case warning_payload of
                    _ ->
                        this.warning_callback warning_payload
                        decorated_value


default : Warning_System
default = Warning_System warning->
    IO.println "[WARNING] "+warning.to_display_text
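
As a sketch of how the placeholder can be swapped out, the snippet below builds a `Warning_System` that collects warnings in a vector builder instead of printing them, mirroring the approach taken in `Matching_Spec.enso`. It is not part of the commit; the names `collect_warning` and `collector` are illustrative only.

```enso
from Standard.Base import all
import Standard.Base.Error.Warnings

main =
    # Collect warnings in a builder instead of printing them.
    builder = Vector.new_builder
    collect_warning warning =
        builder.append warning
    collector = Warnings.Warning_System collect_warning

    # `attach` invokes the callback with the payload and returns the value.
    result = collector.attach "value" "something looked odd"
    IO.println result
    IO.println builder.to_vector
```

Routing warnings through a callback keeps the code under test independent of where the warnings end up, which is what lets the spec assert on them.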
157 changes: 157 additions & 0 deletions distribution/lib/Standard/Table/0.2.32-SNAPSHOT/src/Data/Matching.enso
@@ -0,0 +1,157 @@
from Standard.Base import all
import Standard.Base.Data.Locale
import Standard.Base.Data.Text.Regex as Regex_Module

from Standard.Base.Error.Problem_Behavior as Problem_Behavior_Module import all
from Standard.Base.Error.Warnings import all

## Strategy for matching names.
type Matching_Strategy
    ## UNSTABLE
       Exact name matching.

       A name is matched if it is equal to one of the provided criteria.
    type Exact (case_sensitivity : (True | Case_Insensitive) = True)

    ## UNSTABLE
       Regex-based name matching.

       A name is matched if its name matches the provided regular expression.
    type Regex (case_sensitivity : (True | Case_Insensitive) = True)

## UNSTABLE
   Specifies that the operation should ignore case.

   TODO: Since case-insensitive matching can be locale-dependent, this may be
   extended with a `locale` setting in the future.
type Case_Insensitive


## UNSTABLE
   An error indicating that some criteria did not match any names in the input.
type No_Matches_Found (criteria : Vector Text)

No_Matches_Found.to_display_text : Text
No_Matches_Found.to_display_text =
    "The criteria "+this.criteria.to_text+" did not match any names in the input."


## UNSTABLE
   Selects objects from an input list that match any of the provided criteria.

   Arguments:
   - objects: A list of objects to be matched.
   - criteria: A list of texts representing the matching criteria. Their meaning
     depends on the matching strategy.
   - reorder: Specifies whether to reorder the matched objects according to the
     order of the matching criteria.
     If `False`, the matched entries are returned in the same order as in the
     input.
     If `True`, the matched entries are returned in the order of the criteria
     matching them. If a single object has been matched by multiple criteria, it
     is placed in the group belonging to the first matching criterion on the
     list.
     If a single criterion's group has more than one element, their relative
     order is the same as in the input.
   - name_mapper: A function mapping a provided object to its name, which will
     then be matched with the criteria. It is set to the identity function by
     default, thus allowing the input to be a list of names to match. But it can
     be overridden to enable matching more complex objects.
   - matching_strategy: A `Matching_Strategy` instance specifying how to
     interpret the criteria.
   - on_problems: Specifies how to handle problems encountered while matching,
     such as criteria that did not match any names.
     By default, a warning is issued, but the operation proceeds.
     If set to `Report_Error`, the operation fails with a dataflow error.
     If set to `Ignore`, the operation proceeds without errors or warnings.
   - warnings: A `Warning_System` instance specifying how to handle warnings.
     This is a temporary workaround that allows the warning mechanism to be
     tested. Once the proper warning system is implemented, this argument will
     be removed, so user code should not rely on it.

   > Example
     Selects objects matching one of the provided patterns, preserving the input order.

         Matching.match_criteria ["foo", "foobar", "quux", "baz", "Foo"] [".*ba.*", "f.*"] matching_strategy=(Regex case_sensitivity=True) == ["foo", "foobar", "baz"]

   > Example
     Selects pairs whose first element matches the provided criteria, ordering
     the result according to the order of the criteria that matched them.

         Matching.match_criteria [Pair "foo" 42, Pair "bar" 33, Pair "baz" 10, Pair "foo" 0, Pair 10 10] ["bar", "foo"] reorder=True name_mapper=_.first == [Pair "bar" 33, Pair "foo" 42, Pair "foo" 0]
match_criteria : Vector Any -> Vector Text -> Boolean -> (Any -> Text) -> Matching_Strategy -> Problem_Behavior -> Warning_System -> Vector Any ! No_Matches_Found
match_criteria objects criteria reorder=False name_mapper=(x->x) matching_strategy=(Exact case_sensitivity=True) on_problems=Report_Warning warnings=Warnings.default = Panic.recover <|
    [objects, criteria, reorder, name_mapper, matching_strategy, on_problems, warnings] . each Panic.rethrow

    # match_matrix . at i . at j specifies whether objects.at i matches criteria.at j
    match_matrix = objects.map obj->
        criteria.map criterion->
            name = name_mapper obj
            here.match_single_criterion name criterion matching_strategy

    # Checks if the ith object is matched by any criterion.
    is_object_matched_by_anything : Integer -> Boolean
    is_object_matched_by_anything i =
        match_matrix.at i . any x->x

    # Checks if the ith criterion matches any objects.
    does_criterion_match_anything : Integer -> Boolean
    does_criterion_match_anything i =
        match_matrix.map (row -> row.at i) . any x->x

    # Selects object indices which satisfy the provided predicate.
    select_matching_indices : (Integer -> Boolean) -> Vector Integer
    select_matching_indices matcher =
        0.up_to objects.length . to_vector . filter matcher

    # Check consistency
    checked_criteria = criteria.map_with_index j-> criterion->
        has_matches = does_criterion_match_anything j
        Pair has_matches criterion
    unmatched_criteria = checked_criteria.filter (p -> p.first.not) . map .second

    selected_indices = case reorder of
        True ->
            nested_indices = 0.up_to criteria.length . map j->
                is_object_matched_by_this_criterion i =
                    match_matrix.at i . at j
                select_matching_indices is_object_matched_by_this_criterion
            nested_indices.flat_map x->x . distinct
        False ->
            select_matching_indices is_object_matched_by_anything

    result = selected_indices.map objects.at
    issues = if unmatched_criteria.is_empty then [] else [No_Matches_Found unmatched_criteria]
    Problem_Behavior_Module.attach_issues_as_needed result on_problems issues warnings=warnings


## UNSTABLE
   Checks if a name matches the provided criterion according to the specified
   matching strategy.

   Arguments:
   - name: A `Text` representing the name being matched.
   - criterion: A `Text` representing the matching criterion. It can be a simple
     name or a regular expression; its meaning depends on the value of
     `matching_strategy`.
   - matching_strategy: A `Matching_Strategy` instance specifying how the
     criterion should be interpreted.

   > Example
     Check if the provided name matches a regular expression.

         Matching.match_single_criterion "Foobar" "f.*" (Regex case_sensitivity=Case_Insensitive) == True
match_single_criterion : Text -> Text -> Matching_Strategy -> Boolean
match_single_criterion name criterion matching_strategy = case matching_strategy of
    Exact case_sensitivity -> case case_sensitivity of
        True ->
            name == criterion
        Case_Insensitive ->
            name.equals_ignore_case criterion
    Regex case_sensitivity ->
        insensitive = case case_sensitivity of
            True -> False
            Case_Insensitive -> True
        re = Regex_Module.compile criterion case_insensitive=insensitive
        re.matches name
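
To illustrate the effect of `reorder`, here is a short sketch built from the inputs used in `Matching_Spec.enso`; it is not part of the commit. The imports follow the spec file, and the `regex` binding sets `case_sensitivity` explicitly, as the spec does.

```enso
from Standard.Base import all
from Standard.Table.Data.Matching import all

main =
    names = ["foo", "bar", "baz", "zap"]
    criteria = [".*z.*", "b.*"]
    regex = Regex case_sensitivity=True

    # Matched names are kept in the order of the input list.
    in_input_order = Matching.match_criteria names criteria reorder=False matching_strategy=regex
    # Matched names are grouped by the first criterion that matched them.
    in_criteria_order = Matching.match_criteria names criteria reorder=True matching_strategy=regex

    IO.println in_input_order
    IO.println in_criteria_order
```

According to the spec in this commit, the first call selects "bar", "baz", "zap" and the second selects "baz", "zap", "bar": "baz" is matched by both criteria but is listed only once, under the first criterion that matches it.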
@@ -0,0 +1,3 @@
import Standard.Table.Error.Column_Missing

from Standard.Table.Error.Column_Missing export all
@@ -0,0 +1,10 @@
from Standard.Base import all

## An error indicating that no column matching the provided criterion has
   been found in the input.
type Column_Missing (criterion : Text)


Column_Missing.to_display_text : Text
Column_Missing.to_display_text =
    "The criterion ["+this.criterion+"] did not match any columns."
3 changes: 2 additions & 1 deletion distribution/lib/Standard/Test/0.2.32-SNAPSHOT/src/Main.enso
@@ -154,7 +154,8 @@ fail message = Panic.throw (Failure message)
Examples.throw_error . should_fail_with Examples.My_Error
 Any.should_fail_with : Any -> Assertion
 Any.should_fail_with matcher =
-    here.fail ("Expected an error " + matcher.to_text + " but none occurred.")
+    loc = Meta.get_source_location 1
+    here.fail ("Expected an error " + matcher.to_text + " but none occurred (at " + loc + ").")

## Expect a function to fail with the provided dataflow error.

2 changes: 2 additions & 0 deletions test/Table_Tests/src/Main.enso
@@ -2,6 +2,7 @@ from Standard.Base import all

 import Standard.Test
 
+import project.Matching_Spec
 import project.Model_Spec
 import project.Column_Spec
 import project.Csv_Spec
@@ -15,4 +16,5 @@ main = Test.Suite.run_main <|
     Json_Spec.spec
     Spreadsheet_Spec.spec
     Table_Spec.spec
+    Matching_Spec.spec
     Model_Spec.spec
85 changes: 85 additions & 0 deletions test/Table_Tests/src/Matching_Spec.enso
@@ -0,0 +1,85 @@
from Standard.Base import all

from Standard.Table.Data.Matching import all
from Standard.Table.Error as Error_Module import all
import Standard.Base.Error.Problem_Behavior
import Standard.Base.Error.Warnings
import Standard.Test

type Foo_Error

spec = Test.group 'Matching Helper' <|
    ## These are workarounds for #1600 - default arguments do not work properly
       on Atom constructors.

       Once this is fixed, the tests should be updated accordingly.
    exact = Exact case_sensitivity=True
    regex = Regex case_sensitivity=True
    Test.specify 'Should match a single name with a single exact criterion' <|
        Matching.match_single_criterion "foo" "foo" exact . should_equal True
        Matching.match_single_criterion "foo" "f.*" exact . should_equal False
        Matching.match_single_criterion "foo" "Foo" exact . should_equal False

    Test.specify 'Should correctly handle Unicode folding with exact matching' <|
        Matching.match_single_criterion '\u00E9' '\u0065\u{301}' exact . should_equal True
        Matching.match_single_criterion 'é' '\u00E9' exact . should_equal True
        Matching.match_single_criterion 'é' 'ę' exact . should_equal False

    Test.specify 'Should match a single name with a single regex criterion' <|
        Matching.match_single_criterion "foo" "foo" regex . should_equal True
        Matching.match_single_criterion "foo" "f.*" regex . should_equal True
        Matching.match_single_criterion "foo" "F.*" regex . should_equal False

    Test.specify 'Should support case-insensitive matching' <|
        Matching.match_single_criterion "foo" "F.*" (Regex case_sensitivity=Case_Insensitive) . should_equal True
        Matching.match_single_criterion "foo" "Foo" (Exact case_sensitivity=Case_Insensitive) . should_equal True

        Matching.match_single_criterion "foo" "fF.*" (Regex case_sensitivity=Case_Insensitive) . should_equal False
        Matching.match_single_criterion "foo" "Foos" (Exact case_sensitivity=Case_Insensitive) . should_equal False

        ## TODO this may not be how we want this to work, but this test is
           included to explicitly illustrate how the current implementation
           behaves in such corner cases
        Matching.match_single_criterion "β" "B" (Exact case_sensitivity=Case_Insensitive) . should_equal False

    Test.specify 'Should match a list of names with a list of criteria, correctly handling reordering' <|
        Matching.match_criteria ["foo", "bar", "baz"] ["baz", "foo"] reorder=True . should_equal ["baz", "foo"]
        Matching.match_criteria ["foo", "bar", "baz"] ["baz", "foo"] reorder=False . should_equal ["foo", "baz"]

    Test.specify 'Should allow multiple matches to a single criterion (Regex)' <|
        Matching.match_criteria ["foo", "bar", "baz", "quux"] ["b.*"] reorder=True matching_strategy=regex . should_equal ["bar", "baz"]
        Matching.match_criteria ["foo", "bar", "baz", "quux"] ["b.*", "foo"] reorder=False matching_strategy=regex . should_equal ["foo", "bar", "baz"]

    Test.specify 'Should include the object only with the first criterion that matched it, avoiding duplication' <|
        Matching.match_criteria ["foo", "bar", "baz", "zap"] [".*z.*", "b.*"] reorder=True matching_strategy=regex . should_equal ["baz", "zap", "bar"]
        Matching.match_criteria ["foo", "bar", "baz", "zap"] [".*z.*", "b.*"] reorder=False matching_strategy=regex . should_equal ["bar", "baz", "zap"]

    Test.specify 'Should correctly handle criteria which did not match anything' <|
        Matching.match_criteria ["foo", "bar", "baz"] ["baz", "unknown_column"] reorder=True on_problems=Problem_Behavior.Report_Error . should_fail_with No_Matches_Found
        result = Matching.match_criteria ["foo", "bar", "baz"] ["baz", "unknown_column_1", "unknown_column_2"] reorder=False on_problems=Problem_Behavior.Report_Error . catch
        result . should_equal <| No_Matches_Found ["unknown_column_1", "unknown_column_2"]

        warnings_builder = Vector.new_builder
        report_warning warning =
            warnings_builder.append warning
        warning_system = Warnings.Warning_System report_warning
        Matching.match_criteria ["foo", "bar", "baz"] ["baz", "unknown_column_1", "unknown_column_2"] reorder=True on_problems=Problem_Behavior.Report_Warning warnings=warning_system . should_equal ["baz"]
        reported = warnings_builder.to_vector
        reported.length . should_equal 1
        reported.first . should_equal <| No_Matches_Found ["unknown_column_1", "unknown_column_2"]

    Test.specify 'Should correctly work with complex object using a function extracting their names' <|
        pairs = [Pair "foo" 42, Pair "bar" 33, Pair "baz" 10, Pair "foo" 0, Pair 10 10]
        selected = [Pair "bar" 33, Pair "foo" 42, Pair "foo" 0]
        Matching.match_criteria pairs ["bar", "foo"] reorder=True name_mapper=_.first . should_equal selected

        Matching.match_criteria [1, 2, 3] ["2"] name_mapper=_.to_text . should_equal [2]

    Test.specify 'Should correctly forward errors' <|
        Matching.match_criteria (Error.throw Foo_Error) [] . should_fail_with Foo_Error
        Matching.match_criteria [] (Error.throw Foo_Error) . should_fail_with Foo_Error
        Matching.match_criteria [] [] (Error.throw Foo_Error) . should_fail_with Foo_Error
        Matching.match_criteria ["a"] ["a"] name_mapper=(_-> Error.throw Foo_Error) . should_fail_with Foo_Error
        Matching.match_criteria ["a"] ["a"] name_mapper=_.nonexistent_function . should_fail_with No_Such_Method_Error

main = Test.Suite.run_main here.spec
