Improvement to the Natural Order Sort #3276

Merged 11 commits on Feb 16, 2022
@@ -26,55 +26,72 @@ compare text1 text2 =
iter2 = BreakIterator.getCharacterInstance
iter2.setText text2

is_digit=c -> c>=48 && c<=57
## check if a single character is between '0' and '9'
ascii_code_zero = 48
ascii_code_nine = 57
is_digit=character -> character>=ascii_code_zero && character<=ascii_code_nine

## Find the end of a number and then return the substring, value and new prev and next values
get_number text prev next iter =
find_number text next iter =
## Find end of number and return pair of index and flag if reached end
loop text next iter =
new_next = iter.next
if (new_next == -1) then -next else
if (new_next == -1) then (Pair next True) else
substring = Text_Utils.substring text next new_next
c = Text_Utils.get_chars substring . at 0
if (is_digit c).not then next else
@Tail_Call find_number text new_next iter
character = Text_Utils.get_chars substring . at 0
if (is_digit character).not then (Pair next False) else
@Tail_Call loop text new_next iter

n = find_number text next iter
s = Text_Utils.substring text prev n.abs
pair = loop text next iter
substring = Text_Utils.substring text prev pair.first

## TODO [RW] Currently there is no `Integer.parse` method, so we
parse a decimal and convert it to an integer. Once
https://www.pivotaltracker.com/story/show/181176522 is
implemented, this should be changed to use `Integer.parse`.
d = Decimal.parse s . floor
implemented, this should be changed to use `Integer.parse`.
decimal = Decimal.parse substring . floor

i = if n < 0 then -1 else iter.current
[s, d, n, i]
next_index = if pair.second then -1 else iter.current
[substring, decimal, pair.first, next_index]


## Ordering: Nothing < Number < Text
order prev1 next1 prev2 next2 =
case (next1 == -1) of
True ->
if (next2 == -1) then Ordering.Equal else Ordering.Less
False ->
if (next2 == -1) then Ordering.Greater else
s1 = Text_Utils.substring text1 prev1 next1
c1 = Text_Utils.get_chars s1 . at 0

s2 = Text_Utils.substring text2 prev2 next2
c2 = Text_Utils.get_chars s2 . at 0

case (is_digit c1) of
True ->
if (is_digit c2).not then Ordering.Less else
a1 = get_number text1 prev1 next1 iter1
a2 = get_number text2 prev2 next2 iter2

if (a1.at 1) != (a2.at 1) then (a1.at 1).compare_to (a2.at 1) else
if (a1.at 0) != (a2.at 0) then (a1.at 0).compare_to (a2.at 0) else
@Tail_Call order (a1.at 2) (a1.at 3) (a2.at 2) (a2.at 3)
False ->
if (is_digit c2) then Ordering.Greater else
if s2 != s1 then s1.compare_to s2 else
case (Pair (next1 == -1) (next2 == -1)) of
Pair True True -> Ordering.Equal
Pair True False -> Ordering.Less
Pair False True -> Ordering.Greater
Pair False False ->
substring1 = Text_Utils.substring text1 prev1 next1
first_char_1 = Text_Utils.get_chars substring1 . at 0

substring2 = Text_Utils.substring text2 prev2 next2
first_char_2 = Text_Utils.get_chars substring2 . at 0

tmp = Pair (is_digit first_char_1) (is_digit first_char_2)
## ToDo: Move to case on second block
Appears to be an issue using a nested case statement on a pair
https://www.pivotaltracker.com/story/show/181280737
if (tmp.first && tmp.second.not) then Ordering.Less else
if (tmp.first.not && tmp.second) then Ordering.Greater else
case tmp.first.not of
True ->
text_comparison = substring1.compare_to substring2
if text_comparison != Ordering.Equal then text_comparison else
@Tail_Call order next1 iter1.next next2 iter2.next
False ->
parsed1 = get_number text1 prev1 next1 iter1
num_text1 = parsed1.at 0
value1 = parsed1.at 1

parsed2 = get_number text2 prev2 next2 iter2
num_text2 = parsed2.at 0
value2 = parsed2.at 1

value_comparison = value1.compare_to value2
if value_comparison != Ordering.Equal then value_comparison else
text_comparison = num_text1.compare_to num_text2
if text_comparison != Ordering.Equal then text_comparison else
@Tail_Call order (parsed1.at 2) (parsed1.at 3) (parsed2.at 2) (parsed2.at 3)

order 0 iter1.next 0 iter2.next
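
For readers who do not know Enso, the comparison rules in the hunk above (Nothing < Number < Text, numeric runs compared by value first and by their digits as text second) can be sketched in Python. This is a rough illustration, not the code from this PR: the names `natural_compare`, `read_number` and `natural_sort` are hypothetical, `int` stands in for the `Decimal.parse ... floor` workaround, and the sketch walks Python code points rather than the grapheme clusters that `BreakIterator.getCharacterInstance` produces.

```python
from functools import cmp_to_key


def is_digit(character: str) -> bool:
    # Same ASCII-only range check as the Enso helper ('0'..'9').
    return "0" <= character <= "9"


def read_number(text: str, start: int):
    # Scan to the end of the digit run beginning at `start`.
    # Returns the digit text, its integer value, and the index after the run.
    end = start
    while end < len(text) and is_digit(text[end]):
        end += 1
    digits = text[start:end]
    return digits, int(digits), end


def natural_compare(text1: str, text2: str) -> int:
    # Returns -1, 0 or 1, following the ordering Nothing < Number < Text.
    i = j = 0
    while True:
        done1, done2 = i >= len(text1), j >= len(text2)
        if done1 and done2:
            return 0
        if done1:
            return -1
        if done2:
            return 1

        char1, char2 = text1[i], text2[j]
        digit1, digit2 = is_digit(char1), is_digit(char2)
        if digit1 and not digit2:
            return -1
        if digit2 and not digit1:
            return 1

        if not digit1:
            # Both are text: compare one character, then advance both sides.
            if char1 != char2:
                return -1 if char1 < char2 else 1
            i += 1
            j += 1
        else:
            # Both are numbers: compare by value, then by the digit text
            # so that "01" and "1" are not reported as equal.
            num_text1, value1, i = read_number(text1, i)
            num_text2, value2, j = read_number(text2, j)
            if value1 != value2:
                return -1 if value1 < value2 else 1
            if num_text1 != num_text2:
                return -1 if num_text1 < num_text2 else 1


def natural_sort(items):
    # Convenience wrapper (hypothetical) around the comparator.
    return sorted(items, key=cmp_to_key(natural_compare))
```

Comparing by value before text is what places `a2` ahead of `a10`; the text tie-break only matters when two digit runs have the same value but different leading zeros.
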
2 changes: 1 addition & 1 deletion distribution/lib/Standard/Test/0.0.0-dev/src/Faker.enso
@@ -36,4 +36,4 @@ make_string template generator =
0.up_to template.length . each i->
a = template.at i
output.set_at i (a.at (generator.nextInt a.length))
Text_Utils.from_utf_8 output
Text_Utils.from_utf_8 output
6 changes: 5 additions & 1 deletion test/Benchmarks/src/Natural_Order_Sort.enso
@@ -18,7 +18,11 @@ main =
l = Faker.upper_case_letters
n = Faker.numbers
template = [l, l, l, n, n, n, n, n, l]
random_generator = Faker.make_generator 1644575867

## No specific significance to this constant, just fixed to make generated set deterministic
fixed_random_seed = 1644575867
random_generator = Faker.make_generator fixed_random_seed

unsorted = 0.up_to here.vector_size . map _->(Faker.make_string template random_generator)

Bench.measure (unsorted.sort by=Natural_Order.compare) "Natural Order" here.iter_size here.num_iterations
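
The effect of the fixed seed in the benchmark can be illustrated with a small Python sketch. It assumes nothing about how `Faker.make_generator` is implemented; Python's `random.Random` is used only as a stand-in, and `make_string` and `generate` here are hypothetical Python counterparts, not the Enso functions. The template mirrors `[l, l, l, n, n, n, n, n, l]`: three upper-case letters, five digits, one upper-case letter.

```python
import random
import string


def make_string(template, rng):
    # Pick one character from each alphabet in the template, in order.
    return "".join(rng.choice(alphabet) for alphabet in template)


def generate(count, seed):
    # A fresh generator per call, seeded identically, yields the same data set.
    rng = random.Random(seed)
    letters = string.ascii_uppercase
    digits = string.digits
    template = [letters] * 3 + [digits] * 5 + [letters]
    return [make_string(template, rng) for _ in range(count)]


# The value itself carries no meaning; it only pins the generated input.
fixed_random_seed = 1644575867

first_run = generate(5, fixed_random_seed)
second_run = generate(5, fixed_random_seed)
print(first_run == second_run)
```

With the seed pinned, every benchmark run sorts the same input, so timing differences between runs reflect the code under test rather than the data.
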