Add cross_join support to Database Table #7234

Merged
merged 14 commits into from
Jul 10, 2023
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -508,6 +508,7 @@
- [Added `replace` to in-memory table. Changed replace for `Text`, in-memory
`Column`, and in-memory `Table` to take a `Regex` in addition to a `Text`.]
[7223]
- [Added `cross_join` support to database tables.][7234]

[debug-shortcuts]:
https://github.com/enso-org/enso/blob/develop/app/gui/docs/product/shortcuts.md#debug
@@ -735,6 +736,7 @@
[7166]: https://github.com/enso-org/enso/pull/7166
[7174]: https://github.com/enso-org/enso/pull/7174
[7223]: https://github.com/enso-org/enso/pull/7223
[7234]: https://github.com/enso-org/enso/pull/7234

#### Enso Compiler

Original file line number Diff line number Diff line change
@@ -14,6 +14,7 @@ import Standard.Table.Data.Expression.Expression
import Standard.Table.Data.Expression.Expression_Error
import Standard.Table.Data.Join_Condition.Join_Condition
import Standard.Table.Data.Join_Kind.Join_Kind
import Standard.Table.Data.Join_Kind_Cross.Join_Kind_Cross
import Standard.Table.Data.Match_Columns as Match_Columns_Helpers
import Standard.Table.Data.Report_Unmatched.Report_Unmatched
import Standard.Table.Data.Row.Row
@@ -1007,13 +1008,21 @@ type Table
table.join other on=[Join_Condition.Equals "A" "A", Join_Condition.Equals "B" "B"]
@on Widget_Helpers.make_join_condition_selector
join : Table -> Join_Kind -> Vector (Join_Condition | Text) | Text -> Text -> Problem_Behavior -> Table
join self right join_kind=Join_Kind.Left_Outer on=[Join_Condition.Equals self.column_names.first] right_prefix="Right " on_problems=Report_Warning =
join self right join_kind=Join_Kind.Left_Outer on=(default_join_condition self join_kind) right_prefix="Right " on_problems=Report_Warning =
self.join_or_cross_join right join_kind on right_prefix on_problems

## PRIVATE
Implementation of both `join` and `cross_join`.
join_or_cross_join : Table -> Join_Kind | Join_Kind_Cross -> Vector (Join_Condition | Text) | Text -> Text -> Problem_Behavior -> Table
join_or_cross_join self right join_kind on right_prefix on_problems =
can_proceed = if Table_Helpers.is_table right . not then Error.throw (Type_Error.Error Table right "right") else
same_backend = case right of
_ : Table -> True
_ -> False
join_conditions_ok = join_kind != Join_Kind_Cross.Cross || on == []
if same_backend . not then Error.throw (Illegal_Argument.Error "Currently cross-backend joins are not supported. You need to upload the in-memory table before joining it with a database one, or materialize this table.") else
True
if join_conditions_ok . not then Error.throw (Illegal_Argument.Error "Cross join does not allow join conditions") else
True
if can_proceed then
left = self
new_table_name = left.name + "_" + right.name
@@ -1051,6 +1060,7 @@ type Table
Join_Kind.Full -> SQL_Join_Kind.Full
Join_Kind.Left_Exclusive -> SQL_Join_Kind.Left
Join_Kind.Right_Exclusive -> SQL_Join_Kind.Right
Join_Kind_Cross.Cross -> SQL_Join_Kind.Cross

problem_builder.attach_problems_before on_problems <|
new_from = From_Spec.Join sql_join_kind left_setup.subquery right_setup.subquery on_expressions
@@ -1086,15 +1096,16 @@ type Table

? Result Ordering

Rows in the result are first ordered by the order of the corresponding
rows from the left table and then the order of rows from the right
table. This applies only if the order of the rows was specified (for
example, by sorting the table; in-memory tables will keep the memory
layout order while for database tables the order may be unspecified).
The ordering of rows in the resulting table is not specified.
cross_join : Table -> Integer | Nothing -> Text -> Problem_Behavior -> Table
cross_join self right right_row_limit=100 right_prefix="Right " on_problems=Report_Warning =
_ = [right, right_row_limit, right_prefix, on_problems]
Error.throw (Unsupported_Database_Operation.Error "Table.cross_join is not implemented yet for the Database backends.")
if check_db_table "right" right then
limit_problems = case right_row_limit.is_nothing.not && (right.row_count > right_row_limit) of
True ->
[Cross_Join_Row_Limit_Exceeded.Error right_row_limit right.row_count]
False -> []
on_problems.attach_problems_before limit_problems <|
self.join_or_cross_join right join_kind=Join_Kind_Cross.Cross on=[] right_prefix on_problems

## ALIAS Join By Row Position
Joins two tables by zipping rows from both tables together - the
@@ -2114,3 +2125,10 @@ check_db_table arg_name table =
False ->
Error.throw (Illegal_Argument.Error "Currently cross-backend operations are not supported. Materialize the table using `.read` before mixing it with an in-memory Table.")
True -> True

## By default, join on the first column, unless it's a cross join, in which
case there are no join conditions.
default_join_condition : Table -> Join_Kind | Join_Kind_Cross -> Vector Join_Condition
default_join_condition table join_kind = case join_kind of
Join_Kind_Cross.Cross -> []
_ -> [Join_Condition.Equals table.column_names.first]
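The checks that `cross_join` and `join_or_cross_join` perform in the diff above can be summarised with a small sketch. This is written in Python rather than Enso, and every name in it (`cross_join`, `left_rows`, `right_rows`, the warning text) is a hypothetical stand-in, not part of the Enso API. It assumes the default `Report_Warning` behaviour, where exceeding `right_row_limit` attaches a problem but the join still proceeds.

```python
from itertools import product


def cross_join(left_rows, right_rows, right_row_limit=100, on=()):
    """Hypothetical model of the cross join checks; returns (rows, warnings)."""
    # A cross join pairs every row with every row, so conditions are rejected.
    if on:
        raise ValueError("Cross join does not allow join conditions")

    warnings = []
    # A limit of None disables the check, mirroring `right_row_limit=Nothing`.
    if right_row_limit is not None and len(right_rows) > right_row_limit:
        warnings.append(
            f"right table has {len(right_rows)} rows, limit is {right_row_limit}"
        )

    # Cartesian product: each left row combined with each right row.
    rows = [list(l) + list(r) for l, r in product(left_rows, right_rows)]
    return rows, warnings
```

In this sketch the limit only produces a warning; the product is still computed, which is how I read the `attach_problems_before` call in the diff.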
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
from Standard.Base import all

type Join_Kind_Cross
## Cartesian product: each row of the left table is paired with each row of
the right table.
Cross

## PRIVATE
Returns the SQL representation of this join kind as text.
to_sql : Text
to_sql self = "CROSS JOIN"
Original file line number Diff line number Diff line change
@@ -1,25 +1,24 @@
from Standard.Base import all

import Standard.Base.Errors.Illegal_Argument.Illegal_Argument
import Standard.Base.Errors.Illegal_State.Illegal_State
import Standard.Table.Data.Join_Kind_Cross.Join_Kind_Cross
import Standard.Test.Extensions

from Standard.Database.Errors import Unsupported_Database_Operation
from Standard.Table import all hiding Table
from Standard.Table.Errors import all

from Standard.Database.Errors import Unsupported_Database_Operation

from Standard.Test import Test, Problems
import Standard.Test.Extensions

from project.Common_Table_Operations.Util import expect_column_names, run_default_backend


main = run_default_backend spec

spec setup =
prefix = setup.prefix
table_builder = setup.table_builder
materialize = setup.materialize
db_todo = if setup.is_database.not then Nothing else "Table.cross_join is still WIP for the DB backend."
Test.group prefix+"Table.cross_join" pending=db_todo <|
Test.group prefix+"Table.cross_join" <|
Test.specify "should allow to create a cross product of two tables in the right order" <|
t1 = table_builder [["X", [1, 2]], ["Y", [4, 5]]]
t2 = table_builder [["Z", ['a', 'b']], ["W", ['c', 'd']]]
@@ -76,16 +75,18 @@ spec setup =

Test.specify "should ensure 1-1 mapping even with duplicate rows" <|
t1 = table_builder [["X", [2, 1, 2, 2]], ["Y", [5, 4, 5, 5]]]
t2 = table_builder [["Z", ['a', 'a']]]
t2 = table_builder [["Z", ['a', 'b', 'a', 'b']]]

t3 = t1.cross_join t2
expect_column_names ["X", "Y", "Z"] t3
t3.row_count . should_equal 8
t3.row_count . should_equal 16
r = materialize t3 . rows . map .to_vector
r.length . should_equal 8
r.length . should_equal 16
r1 = [2, 5, 'a']
r2 = [1, 4, 'a']
expected_rows = [r1, r1, r2, r2, r1, r1, r1, r1]
r3 = [2, 5, 'b']
r4 = [1, 4, 'b']
expected_rows = [r1, r3, r1, r3, r2, r4, r2, r4, r1, r3, r1, r3, r1, r3, r1, r3]
case setup.is_database of
True -> r.should_contain_the_same_elements_as expected_rows
False -> r.should_equal expected_rows
@@ -149,4 +150,13 @@ spec setup =
r4 = [100, 4, 'b', 'd']
r5 = [100, 4, 'a', 'x']
expected_rows = [r0, r1, r2, r3, r4, r5]
r.should_equal expected_rows
case setup.is_database of
True -> r.should_contain_the_same_elements_as expected_rows
False -> r.should_equal expected_rows

if setup.is_database then
Test.specify "Cross join via a direct call to .join should not allow join conditions" <|
t1 = table_builder [["X", [1, 2]], ["Y", [4, 5]]]
t2 = table_builder [["Z", ['a', 'b']], ["W", ['c', 'd']]]
j = t1.join t2 join_kind=Join_Kind_Cross.Cross on=[Join_Condition.Equals 'X' 'Z']
j . should_fail_with Illegal_Argument
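The expected rows in the "duplicate rows" test can be checked by hand with a short sketch. It is in Python, not Enso, and the helper `same_elements` is a hypothetical stand-in for `should_contain_the_same_elements_as`. It assumes the in-memory backend yields left-major order (all right rows for the first left row, then the second, and so on), while the database backend guarantees only the same multiset of rows.

```python
from itertools import product

# Rows of t1 (columns X, Y) and values of t2 (column Z), as in the test.
t1 = list(zip([2, 1, 2, 2], [5, 4, 5, 5]))
t2 = ['a', 'b', 'a', 'b']

# Left-major cartesian product: 4 left rows x 4 right rows = 16 rows.
rows = [[x, y, z] for (x, y), z in product(t1, t2)]

r1 = [2, 5, 'a']
r2 = [1, 4, 'a']
r3 = [2, 5, 'b']
r4 = [1, 4, 'b']
expected_rows = [r1, r3, r1, r3, r2, r4, r2, r4, r1, r3, r1, r3, r1, r3, r1, r3]


def same_elements(a, b):
    """Order-insensitive comparison that still counts duplicates."""
    return sorted(a) == sorted(b)
```

Here `rows == expected_rows` corresponds to the in-memory assertion, and `same_elements(rows, expected_rows)` to the weaker database assertion, which still fails if a duplicate row is dropped.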