Sorting Tables #1471
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
from Base import all | ||
|
||
type Order_Rule | ||
## A rule used for sorting table-like structures. | ||
|
||
Arguments: | ||
- column: a value representing the underlying storage this rule is | ||
Review comment: Please wrap directly under the bullet, not at a new level of indentation. |
||
sorting by. This type does not specify the underlying | ||
representation of a column, assuming that the sorting engine | ||
defines its own column representation. | ||
- comparator: a function taking two elements of the underlying column | ||
and returning an `Ordering`. The function may be | ||
Review comment: "Underlying column" -> "data being sorted by" or similar. |
||
`Nothing`, in which case a natural ordering will be used. | ||
Note that certain table backends (such as database | ||
connectors) may choose to ignore this field. | ||
Review comment: Maybe instead What I'm trying to convey is that the backend should check this field and fail with an error if it cannot support the operation, instead of silently ignoring its value (which could be super-confusing). |
||
- order: specifies whether the table should be sorted in an ascending | ||
or descending order. The default value of `Nothing` delegates | ||
the decision to the sorting function. | ||
Review comment: This should say something about it being the |
||
- missing_last: whether the missing values should be placed at the | ||
beginning or end of the sorted table. Note that this | ||
argument is independent from `order`, i.e. missing | ||
values will always be sorted according to this rule, | ||
ignoring the ascending / descending setting. | ||
Review comment: I'd add here the note as above, about |
||
type Order_Rule column comparator=Nothing order=Nothing missing_last=Nothing | ||
Review comment: Explain what |
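The docstring above says that `missing_last` is independent of `order`: missing values go where the rule says, whether the sort is ascending or descending. The following is a minimal Java sketch of that behavior, not the PR's `OrderBuilder` code. The names `MissingLastSketch`, `ruleComparator`, and `sorted` are hypothetical, and `null` is assumed to stand in for a missing value.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class MissingLastSketch {
    // Hypothetical helper: builds the comparator for a single rule.
    static <T> Comparator<T> ruleComparator(
            Comparator<T> base, boolean ascending, boolean missingLast) {
        // The direction applies to non-missing values only.
        Comparator<T> directed = ascending ? base : base.reversed();
        // Null handling is wrapped around the directed comparator,
        // so reversing the order never moves the missing values.
        return missingLast
                ? Comparator.nullsLast(directed)
                : Comparator.nullsFirst(directed);
    }

    // Returns a sorted copy; the input list is left untouched.
    static List<Integer> sorted(
            List<Integer> values, boolean ascending, boolean missingLast) {
        List<Integer> copy = new ArrayList<>(values);
        copy.sort(ruleComparator(
                Comparator.<Integer>naturalOrder(), ascending, missingLast));
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> quantity = Arrays.asList(3, null, 1, 2);
        System.out.println(sorted(quantity, true, true));   // [1, 2, 3, null]
        System.out.println(sorted(quantity, false, true));  // [3, 2, 1, null]
        System.out.println(sorted(quantity, false, false)); // [null, 3, 2, 1]
    }
}
```

Wrapping the null handling outside the direction is the design choice that matters here: reversing a `nullsLast` comparator would instead move the missing values to the front.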
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,8 +2,10 @@ from Base import all | |
import Table.Io.Csv | ||
import Table.Data.Column | ||
import Base.System.Platform | ||
from Table.Data.Order_Rule as Order_Rule_Module import Order_Rule | ||
|
||
polyglot java import org.enso.table.data.table.Table as Java_Table | ||
polyglot java import org.enso.table.operations.OrderBuilder | ||
|
||
## Represents a column-oriented table data structure. | ||
type Table | ||
|
@@ -165,6 +167,107 @@ type Table | |
group by=Nothing = | ||
Aggregate_Table (this.java_table.group by) | ||
|
||
## Sorts the table according to the specified rules. | ||
|
||
Arguments: | ||
- by: specifies the columns used for reordering the table. This | ||
argument may be one of: | ||
- a text: the text is treated as a column name | ||
- a column: any column, whether or not it belongs to this | ||
table. Sorting by a column will result in reordering the | ||
rows of this table in a way that would result in sorting | ||
the given column. | ||
- an order rule: specifies both the sorting column and | ||
additional settings that take precedence over the | ||
global parameters of this sort operation. The `column` field | ||
of the rule may be a text or a column, with the semantics | ||
described above. | ||
- a vector of any of the above: this will result in | ||
a hierarchical sorting, such that the first rule is applied | ||
first, the second is used for breaking ties, etc. | ||
- order: specifies the default sort order for this operation. All the | ||
rules specified in the `by` argument will default to this | ||
setting, unless the rule specifies its own order. | ||
Review comment: Say that this is |
||
- missing_last: specifies the default placement of missing values when | ||
compared to non-missing ones. This setting may be | ||
overridden by the particular rules of the `by` argument. | ||
Review comment: This should contain more details (like for |
||
|
||
> Example | ||
Sorting `table` in ascending order by the value in column `'Quantity'` | ||
table.sort by='Quantity' | ||
|
||
> Example | ||
Sorting `table` in descending order by the value in column `'Quantity'`, | ||
placing missing values at the top of the table. | ||
table.sort by='Quantity' order=Sort_Order.Descending missing_last=False | ||
|
||
> Example | ||
Sorting `table` in ascending order by the value in column `'Quantity'`, | ||
using the value in column `'Rating'` for breaking ties. | ||
table.sort by=['Quantity', 'Rating'] | ||
|
||
> Example | ||
Sorting `table` in ascending order by the value in column `'Quantity'`, | ||
using the value in column `'Rating'` in descending order for breaking | ||
ties. | ||
table.sort by=['Quantity', Order_Rule 'Rating' (order=Sort_Order.Descending)] | ||
|
||
> Example | ||
Sorting `table` in ascending order by the value in an externally | ||
computed column, using the value in column `'Rating'` for breaking | ||
ties. | ||
quality_ratio = table.at 'Rating' / table.at 'Price' | ||
table.sort by=[quality_ratio, 'Rating'] | ||
|
||
> Example | ||
Sorting `table` in ascending order by the value in column | ||
`'position'`, using a custom comparator function. | ||
manhattan_comparator a b = (a.x.abs + a.y.abs) . compare_to (b.x.abs + b.y.abs) | ||
table.sort by=(Order_Rule 'position' comparator=manhattan_comparator) | ||
sort : Text | Column.Column | Order_Rule | Vector.Vector (Text | Column.Column | Order_Rule) -> Sort_Order -> Boolean -> Table | ||
sort by order=Sort_Order.Ascending missing_last=True = | ||
rules = this.build_java_order_rules by order missing_last | ||
fallback_cmp = here.comparator_to_java .compare_to | ||
mask = OrderBuilder.buildOrderMask rules.to_array fallback_cmp | ||
new_table = this.java_table.applyMask mask | ||
Table new_table | ||
|
||
## PRIVATE | ||
build_java_order_rules rules order missing_last = case rules of | ||
Text -> [this.build_java_order_rule rules order missing_last] | ||
Column.Column _ -> [this.build_java_order_rule rules order missing_last] | ||
Order_Rule _ _ _ _ -> [this.build_java_order_rule rules order missing_last] | ||
Vector.Vector _ -> rules.map (this.build_java_order_rule _ order missing_last) | ||
|
||
## PRIVATE | ||
build_java_order_rule rule order missing_last = | ||
order_bool = case order of | ||
Sort_Order.Ascending -> True | ||
_ -> False | ||
Review comment: Isn't it better to just add If for some weird reason we get something else here, won't it be more meaningful to fail with inexhaustive pattern match saying that the argument was unexpected instead of silently falling back to some default? |
||
case rule of | ||
Text -> | ||
column = this.at rule | ||
Review comment: Since |
||
OrderBuilder.OrderRule.new column.java_column Nothing order_bool missing_last | ||
Column.Column c -> | ||
OrderBuilder.OrderRule.new c Nothing order_bool missing_last | ||
Order_Rule col_ref cmp rule_order rule_nulls_last -> | ||
c = case col_ref of | ||
Text -> this.at col_ref . java_column | ||
Column.Column c -> c | ||
o = case rule_order of | ||
Nothing -> order_bool | ||
Sort_Order.Ascending -> True | ||
_ -> False | ||
nulls = case rule_nulls_last of | ||
Nothing -> missing_last | ||
_ -> rule_nulls_last | ||
java_cmp = case cmp of | ||
Nothing -> Nothing | ||
c -> here.comparator_to_java c | ||
OrderBuilder.OrderRule.new c java_cmp o nulls | ||
|
||
## PRIVATE | ||
comparator_to_java cmp x y = cmp x y . to_sign | ||
|
||
## Represents a table with grouped rows. | ||
type Aggregate_Table | ||
type Aggregate_Table java_table | ||
|
Review comment:
"The data by which x is being sorted" or similar.
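The `sort` method in the diff delegates to `OrderBuilder.buildOrderMask` and `applyMask`, whose Java sources are not shown on this page. As a rough illustration of the general idea (derive a permutation of row indices from a hierarchy of rules, then reorder each column by that permutation), here is a sketch under my own assumptions. `OrderMaskSketch`, `buildMask`, `applyMask`, and `demoMask` are hypothetical names and do not reflect the real `org.enso.table` API. The sample data mirrors the docstring example that sorts by `'Quantity'` ascending and breaks ties by `'Rating'` descending.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class OrderMaskSketch {
    // Illustrative sample columns (made up for this sketch).
    static final List<Integer> QUANTITY = Arrays.asList(10, 5, 10, 5);
    static final List<Double> RATING = Arrays.asList(4.0, 3.5, 4.5, 2.0);

    // Each rule compares two row indices. Earlier rules take priority;
    // later rules are consulted only to break ties.
    static int[] buildMask(int rowCount, List<Comparator<Integer>> rules) {
        Comparator<Integer> combined = (a, b) -> 0;
        for (Comparator<Integer> rule : rules) {
            combined = combined.thenComparing(rule);
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < rowCount; i++) {
            indices.add(i);
        }
        indices.sort(combined);
        int[] mask = new int[rowCount];
        for (int i = 0; i < rowCount; i++) {
            mask[i] = indices.get(i);
        }
        return mask;
    }

    // Reorders one column so that result[i] == column[mask[i]].
    static <T> List<T> applyMask(List<T> column, int[] mask) {
        List<T> result = new ArrayList<>(mask.length);
        for (int index : mask) {
            result.add(column.get(index));
        }
        return result;
    }

    // Quantity ascending, then Rating descending for ties.
    static int[] demoMask() {
        Comparator<Integer> byQuantityAsc =
                (a, b) -> QUANTITY.get(a).compareTo(QUANTITY.get(b));
        Comparator<Integer> byRatingDesc =
                (a, b) -> RATING.get(b).compareTo(RATING.get(a));
        return buildMask(
                QUANTITY.size(), Arrays.asList(byQuantityAsc, byRatingDesc));
    }

    public static void main(String[] args) {
        int[] mask = demoMask();
        System.out.println(Arrays.toString(mask));     // [1, 3, 2, 0]
        System.out.println(applyMask(QUANTITY, mask)); // [5, 5, 10, 10]
        System.out.println(applyMask(RATING, mask));   // [3.5, 2.0, 4.5, 4.0]
    }
}
```

Sorting indices rather than rows means the same mask can be applied to every column of the table, including columns that took no part in the comparison.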