Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sorting Tables #1471

Merged
merged 4 commits into from
Feb 11, 2021
Merged

Sorting Tables #1471

merged 4 commits into from
Feb 11, 2021

Conversation

kustosz
Copy link
Contributor

@kustosz kustosz commented Feb 11, 2021

Pull Request Description

Introduces sorting operations on tables.

Important Notes

Checklist

Please include the following checklist in your PR:

  • The documentation has been updated if necessary.
  • All code conforms to the Scala, Java, and Rust style guides.
  • All documentation and configuration conforms to the markdown and YAML style guides.
  • All code has been tested where possible.

Copy link
Contributor

@iamrecursion iamrecursion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you consider this API to be stable? If not, please tag the unstable parts of it with the new tag UNSTABLE.

## A rule used for sorting table-like structures.

Arguments:
- column: a value representing the underlying storage this rule is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"The data by which x is being sorted" or similar.

## A rule used for sorting table-like structures.

Arguments:
- column: a value representing the underlying storage this rule is
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please wrap directly under the bullet, not at a new level of indentation.

Comment on lines 11 to 12
- comparator: a function taking two elements of the underlying column
and returning an `Ordering`. The function may be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Underlying column" -> "data being sorted by" or similar.

Comment on lines 16 to 18
- order: specifies whether the table should be sorted in an ascending
or descending order. The default value of `Nothing` delegates
the decision to the sorting function.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should say something about it being the Ordering type in the standard library.

argument is independent from `order`, i.e. missing
values will always be sorted according to this rule,
ignoring the ascending / descending setting.
type Order_Rule column comparator=Nothing order=Nothing missing_last=Nothing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Explain what Nothing means for missing_last.

Comment on lines 188 to 190
- order: specifies the default sort order for this operation. All the
rules specified in the `by` argument will default to this
setting, unless specified in the rule.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say that this is Bases Ordering.

Comment on lines 191 to 193
- missing_last: specifies the default placement of missing values when
compared to non-missing ones. This setting may be
overriden by the particular rules of the `by` argument.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should contain more details (like for Order_Rule).

Copy link
Member

@radeusgd radeusgd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great.

I have some comments on docs etc.

Comment on lines 14 to 15
Note that certain table backends (such us database
connectors) may choose to ignore this field.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe instead choose to ignore it should say not support this field being set to anything other than Nothing?

What I'm trying to convey is that the backend should check this field and fail with an error if it cannot support the operation, instead of silently ignoring its value (which could be super-confusing).

Comment on lines 19 to 23
- missing_last: whether the missing values should be placed at the
beginning or end of the sorted table. Note that this
argument is independent from `order`, i.e. missing
values will always be sorted according to this rule,
ignoring the ascending / descending setting.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd add here the note as above, about Nothing delegating this decision to the sorting function.

Comment on lines 243 to 245
order_bool = case order of
Sort_Order.Ascending -> True
_ -> False
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it better to just add Sort_Order.Descending here?

If for some weird reason we get something else here, won't it be more meaningful to fail with inexhaustive pattern match saying that the argument was unexpected instead of silently falling back to some default?

_ -> False
case rule of
Text ->
column = this.at rule
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since at is used here in this way, I'd suggest changing its signature to Text -> Column ! UnknownColumnError and making at throw if column is not found instead of returning Nothing. Otherwise this code may fail in a strange way.

* Returns a new storage, ordered according to the rules specified in a mask.
*
* @param mask an order mask specifying the reordering
* @return a storage resulting from applying the reordering rules
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It returns an Index, not a Storage.

@kustosz kustosz merged commit 93b6680 into main Feb 11, 2021
@kustosz kustosz deleted the wip/mk/sort_table branch February 11, 2021 15:50
iamrecursion pushed a commit that referenced this pull request Feb 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
p-highest Should be completed ASAP
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants