Summary
Large sort got several well deserved performance improvements and bug fixes. This time I've updated a bit out of my comfort zone of code readability to make it more performant. Nevertheless now it performs almost as fast as the native sort
command that's build in C
.
Performance Improvements
Most of the improvements has been done on the Merge operation. The strategy has always been to reduce the number of comparisons and operations. This time I also considered the what is the most likely scenario after merging a record.
There were two major areas of improvements:
- During the binary search I've reduced to do less comparisons and removed comparison that was not the most common case.
- While getting merging the stream reader records now it's opportunistic assumes that the next record is within the current stream thus it only re-index the reader only when necessary.
- Removed
array.length - x
for comparison using dedicate variables that get updated when it's necessary. - The last stream is now processed linearly without any re-index logic.
Bug Fixes
- There was a bug while deleting the temporary files that did not clean the temporary folder.
- Thanks to @cupix-eddy-lee who found and fixed a bug that affected the last stream synchronously as it didn't finish writing.
- Thanks to @ssideleau who found a bug while handling listening before exiting the process listening to
SIGKILL
instead ofSIGINT
andSIGTERM