-
-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Small performance enhancements for cut: bytes cutting #357
Conversation
Get rid of half filled heuristic and use unsafe array indexing because the indices should always be correct.
@@ -73,7 +75,8 @@ impl<R: Reader> BufReader<R> { | |||
if buffer_used == 0 { return bytes_consumed; } | |||
|
|||
for idx in range(self.start, self.end) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can this loop be for byte in self.buffer.slice_from(self.start, self.end)
, and thus be safe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just did a benchmark of the different solutions (with -b 1
):
- 0.073622189 seconds for the current pull, which uses unsafe indexing.
- 0.096472123 seconds for the current
uutils/coreutils
master, which uses safe indexing. - 0.093891360 seconds for
for (idx, byte) in self.buffer.slice(self.start, self.end).iter().enumerate()
Forgoing the index of the newline byte is not possible as it is needed to update the start of the filled buffer.
The choice is between "safety" and all out performance. The safety guaranteed by the compiler is only to fail
for invalid/out of bounds indices. This means it does an start <= end && end <= len
assert for every byte and just crashes if it fails. This does nothing to guarantee that the indexing in your implementation of the algorithm isn't off by one, but comes with a large runtime cost.
If you rather not have unsafe blocks in the utilities then the buffer.slice().iter().enumerate()
is a good solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's alright to use unsafe
code so long as:
- The code is in actuality safe (e.g. won't go out of bounds).
- The code is noticeably faster.
unsafe
is used sparingly.
This seems to fit those requirements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, however
- The code is in actuality safe (e.g. won't go out of bounds).
actually requires verification, and easily can be verified incorrectly: human error is killer.
For this code, the correct way to verify is ensuring that self.start
and self.end
are always in bounds. That's not obvious, you have to go and track down/follow all uses of the self.start
/self.end
through out this whole module, as they're not local to this function. (And is even worse below, where you need to be tracking max_segment_len
too.)
Furthermore, there's a pile of rules that need to be upheld inside unsafe
: breaking any of them is undefined behaviour and can lead to an arbitrarily broken program. unsafe
should be your absolute last resort when optimising a program.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In particular, I think rust-lang/rust#11751 may be relevant and will likely make my suggested iterator-based code faster.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
human error is killer
Yep, that's why we have Rust. :)
unsafe
should be your absolute last resort when optimising a program.
I realize that. However, sometimes it is necessary to drop down to unsafe
code to get as much speed as possible.
In particular, I think rust-lang/rust#11751 may be relevant and will likely make my suggested iterator-based code faster.
Hopefully this is the case. We can then switch to using the iterator-based code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there's the risk of optimising prematurely for issues that should really be fixed upstream; leaving coreutils with a significant amount of unsafe
code, even after Rust has fixed the issues, allowing equally/more performant safe code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At some point (probably soon after Rust hits 1.0) I'm planning to check all of the unsafe
s in the codebase to make sure that they're really necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do see your point, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@huonw I agree that using unsafe
here may be premature. I will keep an eye on rust-lang/llvm#14 and when it is integrated into rustc
I will check the performance again, and hopefully switch to the iterator.
Small performance enhancements for cut: bytes cutting
Small performance enhancements for cut: bytes cutting
Several small performance enhancements.
For very large, or infinite ranges,
uutils cut
bytes cutting is now 3 times faster thanGNU cut
.For relative performance improvement see issue #345.