Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use insertion sort to sort short ByteString #123

Closed
wants to merge 1 commit into from

Conversation

watashi
Copy link

@watashi watashi commented Apr 22, 2017

The constant overhead of counting sort for sorting short ByteString is quite a lot and insertion sort performs much better.

Here is the benchmark results of comparing insertion sort and counting sort when the length is 22. Insertion sort still outperforms counting sort even for worst test case ("0ZYXWVUTSMLKJIHGFEDCBA").

benchmarking ZYXWVUTSRLKJIHGFEDCBA0/insertion
time                 269.2 ns   (225.5 ns .. 297.4 ns)
                     0.906 R²   (0.884 R² .. 0.932 R²)
mean                 221.8 ns   (201.5 ns .. 246.4 ns)
std dev              70.75 ns   (57.65 ns .. 81.58 ns)
variance introduced by outliers: 99% (severely inflated)

benchmarking ZYXWVUTSRLKJIHGFEDCBA0/counting
time                 760.5 ns   (722.7 ns .. 799.7 ns)
                     0.968 R²   (0.949 R² .. 0.982 R²)
mean                 793.8 ns   (750.9 ns .. 857.7 ns)
std dev              178.1 ns   (107.6 ns .. 249.6 ns)
variance introduced by outliers: 98% (severely inflated)

benchmarking 0ZYXWVUTSMLKJIHGFEDCBA/insertion
time                 659.8 ns   (624.7 ns .. 700.6 ns)
                     0.970 R²   (0.957 R² .. 0.982 R²)
mean                 684.7 ns   (650.9 ns .. 741.5 ns)
std dev              143.3 ns   (93.84 ns .. 213.2 ns)
variance introduced by outliers: 98% (severely inflated)

benchmarking 0ZYXWVUTSMLKJIHGFEDCBA/counting
time                 684.3 ns   (676.3 ns .. 695.1 ns)
                     0.995 R²   (0.992 R² .. 0.997 R²)
mean                 689.9 ns   (683.0 ns .. 702.2 ns)
std dev              30.65 ns   (21.07 ns .. 47.53 ns)
variance introduced by outliers: 61% (severely inflated)

benchmarking 0000000000000000000000/insertion
time                 183.7 ns   (172.8 ns .. 196.4 ns)
                     0.966 R²   (0.950 R² .. 0.983 R²)
mean                 187.5 ns   (177.9 ns .. 203.0 ns)
std dev              40.73 ns   (27.58 ns .. 58.50 ns)
variance introduced by outliers: 98% (severely inflated)

benchmarking 0000000000000000000000/counting
time                 632.5 ns   (597.0 ns .. 672.4 ns)
                     0.971 R²   (0.959 R² .. 0.983 R²)
mean                 644.8 ns   (614.4 ns .. 702.6 ns)
std dev              138.4 ns   (79.85 ns .. 230.4 ns)
variance introduced by outliers: 98% (severely inflated)

benchmarking 0101010101010101010101/insertion
time                 265.1 ns   (260.9 ns .. 269.6 ns)
                     0.995 R²   (0.992 R² .. 0.997 R²)
mean                 274.0 ns   (265.8 ns .. 309.9 ns)
std dev              47.83 ns   (11.83 ns .. 106.4 ns)
variance introduced by outliers: 97% (severely inflated)

benchmarking 0101010101010101010101/counting
time                 644.0 ns   (608.3 ns .. 688.9 ns)
                     0.961 R²   (0.938 R² .. 0.978 R²)
mean                 678.3 ns   (635.8 ns .. 748.3 ns)
std dev              175.4 ns   (119.1 ns .. 263.6 ns)
variance introduced by outliers: 99% (severely inflated)

benchmarking 0900908393591323446961/insertion
time                 354.7 ns   (335.8 ns .. 374.7 ns)
                     0.973 R²   (0.962 R² .. 0.983 R²)
mean                 355.2 ns   (337.7 ns .. 379.9 ns)
std dev              64.54 ns   (45.91 ns .. 87.50 ns)
variance introduced by outliers: 97% (severely inflated)

benchmarking 0900908393591323446961/counting
time                 639.1 ns   (632.9 ns .. 645.7 ns)
                     0.993 R²   (0.987 R² .. 0.997 R²)
mean                 686.7 ns   (646.0 ns .. 782.6 ns)
std dev              191.0 ns   (27.06 ns .. 337.4 ns)
variance introduced by outliers: 99% (severely inflated)

benchmarking MUHHTJUPIDXRVEHAXZOSPR/insertion
time                 331.3 ns   (313.2 ns .. 350.5 ns)
                     0.975 R²   (0.966 R² .. 0.985 R²)
mean                 338.5 ns   (323.5 ns .. 359.4 ns)
std dev              64.28 ns   (46.29 ns .. 90.01 ns)
variance introduced by outliers: 97% (severely inflated)

benchmarking MUHHTJUPIDXRVEHAXZOSPR/counting
time                 760.6 ns   (720.3 ns .. 809.1 ns)
                     0.962 R²   (0.932 R² .. 0.982 R²)
mean                 781.1 ns   (735.6 ns .. 847.3 ns)
std dev              183.7 ns   (130.1 ns .. 260.8 ns)
variance introduced by outliers: 98% (severely inflated)

@watashi
Copy link
Author

watashi commented Apr 22, 2017

@dcoutts
Copy link
Contributor

dcoutts commented Sep 1, 2017

Wow. I didn't think anyone used this function! @watashi can I ask you in what sort of use case is this useful and that the performance matters? I'm really curious!

In principle I have no objection (assuming tests are passing etc). I'll look at the code properly. Anyone else free to do so too of course.

Copy link
Member

@sjakobi sjakobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sort appears to be very well tested:

bytestring/tests/Properties.hs

Lines 1167 to 1177 in 95fe6bd

prop_sort1BB xs = sort xs == (P.unpack . P.sort . P.pack) xs
prop_sort2BB xs = (not (null xs)) ==> (P.head . P.sort . P.pack $ xs) == minimum xs
prop_sort3BB xs = (not (null xs)) ==> (P.last . P.sort . P.pack $ xs) == maximum xs
prop_sort4BB xs ys =
(not (null xs)) ==>
(not (null ys)) ==>
(P.head . P.sort) (P.append (P.pack xs) (P.pack ys)) == min (minimum xs) (minimum ys)
prop_sort5BB xs ys =
(not (null xs)) ==>
(not (null ys)) ==>
(P.last . P.sort) (P.append (P.pack xs) (P.pack ys)) == max (maximum xs) (maximum ys)

Of course I'm also curious what sort is used for, but that shouldn't be a reason to hold off merging any longer! :)

withForeignPtr input (\x -> insertionSort p (x `plusPtr` s) l)
where
-- threshold is where we find insertion sort still outperforms counting
-- sort even for worst case.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice to clarify, what is the worst case. Could you possibly commit benchmarks, which you used to derive this threshold?

@sjakobi sjakobi added this to the 0.10.12.0 milestone Jul 2, 2020
@sjakobi
Copy link
Member

sjakobi commented Jul 14, 2020

Ping @watashi. I'd like to include this PR in the next minor release. Could you address @Bodigrim's review above?

@sjakobi sjakobi modified the milestones: 0.10.12.0, Soon Aug 19, 2020
@Bodigrim
Copy link
Contributor

Superseded by #267, closing.

@Bodigrim Bodigrim closed this Aug 25, 2020
@Bodigrim Bodigrim modified the milestones: Soon, 0.11.0.0 Aug 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants