Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Small performance enhancements for cut: bytes cutting #357

Merged
merged 1 commit into from
Jul 15, 2014
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
16 changes: 10 additions & 6 deletions cut/buffer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -35,9 +35,9 @@ impl<R: Reader> BufReader<R> {
}
}

#[inline]
fn read(&mut self) -> IoResult<uint> {
let buf_len = self.buffer.len();
let buffer_fill = self.buffer.mut_slice(self.end, buf_len);
let buffer_fill = self.buffer.mut_slice_from(self.end);

match self.reader.read(buffer_fill) {
Ok(nread) => {
Expand All @@ -53,9 +53,11 @@ impl<R: Reader> BufReader<R> {
if self.end == self.start {
self.start = 0;
self.end = 0;
}

if self.end <= 2048 { self.read() } else { Ok(0) }
self.read()
} else {
Ok(0)
}
}

pub fn consume_line(&mut self) -> uint {
Expand All @@ -73,7 +75,8 @@ impl<R: Reader> BufReader<R> {
if buffer_used == 0 { return bytes_consumed; }

for idx in range(self.start, self.end) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this loop be for byte in self.buffer.slice_from(self.start, self.end), and thus be safe?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just did a benchmark of the different solutions (with -b 1):

  • 0.073622189 seconds for the current pull, which uses unsafe indexing.
  • 0.096472123 seconds for the current uutils/coreutils master, which uses safe indexing.
  • 0.093891360 seconds for for (idx, byte) in self.buffer.slice(self.start, self.end).iter().enumerate()

Forgoing the index of the newline byte is not possible as it is needed to update the start of the filled buffer.

The choice is between "safety" and all out performance. The safety guaranteed by the compiler is only to fail for invalid/out of bounds indices. This means it does an start <= end && end <= len assert for every byte and just crashes if it fails. This does nothing to guarantee that the indexing in your implementation of the algorithm isn't off by one, but comes with a large runtime cost.

If you rather not have unsafe blocks in the utilities then the buffer.slice().iter().enumerate() is a good solution.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's alright to use unsafe code so long as:

  1. The code is in actuality safe (e.g. won't go out of bounds).
  2. The code is noticeably faster.
  3. unsafe is used sparingly.

This seems to fit those requirements.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, however

  1. The code is in actuality safe (e.g. won't go out of bounds).

actually requires verification, and easily can be verified incorrectly: human error is killer.

For this code, the correct way to verify is ensuring that self.start and self.end are always in bounds. That's not obvious, you have to go and track down/follow all uses of the self.start/self.end through out this whole module, as they're not local to this function. (And is even worse below, where you need to be tracking max_segment_len too.)

Furthermore, there's a pile of rules that need to be upheld inside unsafe: breaking any of them is undefined behaviour and can lead to an arbitrarily broken program. unsafe should be your absolute last resort when optimising a program.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In particular, I think rust-lang/rust#11751 may be relevant and will likely make my suggested iterator-based code faster.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

human error is killer

Yep, that's why we have Rust. :)

unsafe should be your absolute last resort when optimising a program.

I realize that. However, sometimes it is necessary to drop down to unsafe code to get as much speed as possible.

In particular, I think rust-lang/rust#11751 may be relevant and will likely make my suggested iterator-based code faster.

Hopefully this is the case. We can then switch to using the iterator-based code.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's the risk of optimising prematurely for issues that should really be fixed upstream; leaving coreutils with a significant amount of unsafe code, even after Rust has fixed the issues, allowing equally/more performant safe code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At some point (probably soon after Rust hits 1.0) I'm planning to check all of the unsafes in the codebase to make sure that they're really necessary.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do see your point, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@huonw I agree that using unsafe here may be premature. I will keep an eye on rust-lang/llvm#14 and when it is integrated into rustc I will check the performance again, and hopefully switch to the iterator.

if self.buffer[idx] == b'\n' {
// the indices are always correct, use unsafe for speed
if unsafe { *self.buffer.unsafe_ref(idx) } == b'\n' {
self.start = idx + 1;
return bytes_consumed + idx + 1;
}
Expand Down Expand Up @@ -108,7 +111,8 @@ impl<R: Reader> Bytes::Select for BufReader<R> {
};

for idx in range(self.start, self.start + max_segment_len) {
if self.buffer[idx] == b'\n' {
// the indices are always correct, use unsafe for speed
if unsafe { *self.buffer.unsafe_ref(idx) } == b'\n' {
let segment = self.buffer.slice(self.start, idx + 1);

self.start = idx + 1;
Expand Down