`ByteSlice::trim` (and the related trim methods) are not competitive with libstd's `str::trim` when the whitespace is ASCII.
The difference is as much as 50%, and is something I noticed as a dip in benchmarks when moving some code over to bstr.
I'm not an expert, but my understanding is that ASCII whitespace is much more common than non-ASCII whitespace in text from pretty much any script, so it's probably worth optimizing for.
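For concreteness, here is a minimal sketch of the two calls being compared (the input is just illustrative; the real benchmark inputs are in the gist linked below):

```rust
use bstr::ByteSlice;

fn main() {
    let s: &str = " \t hello \n ";
    let b: &[u8] = b" \t hello \n ";

    // Both trims agree on the result; the question is how quickly the
    // byte-slice version gets there when the whitespace is all ASCII.
    assert_eq!(s.trim(), "hello");
    assert_eq!(b.trim(), &b"hello"[..]);
}
```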
Here are two benchmarks that demonstrate the issue: https://gist.github.com/thomcc/d017dec2bf7fbfd017e4f34cfd4db6f8. It's a gist because it's a bit too long to work well as a code block; it also contains a diff you can apply to insert the benchmarks directly into bstr's existing benchmark code (the 2nd file in the gist).
The first (`trim/source-lines`) measures the time to trim a batch of source-code lines (specifically, every line in `ext_slice.rs`, chosen arbitrarily), and is close to the real use case where I saw the issue with bstr.
The second (`trim/large-ascii-padded`) is completely artificial: it trims a huge string that starts and ends with long runs of ASCII whitespace, with only a single non-whitespace character in the middle so that both `trim_start` and `trim_end` are exercised. It isolates the specific issue, which makes it a more focused benchmark, but it doesn't reflect a real use case.
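As a rough illustration of what that input looks like (the sizes here are made up; the actual construction is in the gist):

```rust
use bstr::ByteSlice;

/// Builds an input shaped like `trim/large-ascii-padded`: long runs of ASCII
/// whitespace on both ends, with a single non-whitespace byte in the middle
/// so both the start and end trims have work to do.
fn large_ascii_padded() -> Vec<u8> {
    // The padding size is illustrative; the gist uses its own numbers.
    let pad = b" \t\r\n".repeat(64 * 1024);
    let mut input = pad.clone();
    input.push(b'x');
    input.extend_from_slice(&pad);
    input
}

fn main() {
    let input = large_ascii_padded();
    assert_eq!(input.trim(), &b"x"[..]);
}
```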
The results show that on the existing benchmark (`trim/tiny`), std and bstr perform roughly the same, but std is substantially faster on the two new benchmarks.
As a quick experiment, I added an ASCII fast path to the forward whitespace scan, falling back to the existing Unicode matcher when a non-ASCII byte is hit:
```rust
pub fn whitespace_len_fwd(slice: &[u8]) -> usize {
    // Fast path: walk past leading ASCII whitespace a byte at a time.
    let mut i = 0;
    while i < slice.len() && slice[i].is_ascii_whitespace() {
        i += 1;
    }
    if i == slice.len() || slice[i].is_ascii() {
        // Reached the end, or the next byte is ASCII but not whitespace, so
        // the whitespace prefix ends here.
        i
    } else {
        // Next byte is non-ASCII: defer to bstr's existing Unicode
        // whitespace matcher, starting the search at `i`.
        WHITESPACE_ANCHORED_FWD.find_at(slice, i).unwrap_or(i)
    }
}
```
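For context, a helper like this would slot into the start-trimming path roughly like so (a hypothetical sketch, not bstr's actual internals):

```rust
// Hypothetical sketch: a forward whitespace-length helper like the one above
// composes with trimming by slicing off the computed prefix.
fn trim_start(slice: &[u8]) -> &[u8] {
    &slice[whitespace_len_fwd(slice)..]
}
```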
That change improved the new benchmarks above by about 30%, although it hurt the existing benchmark by around 10%.
I couldn't quite avoid off-by-one errors in the `_rev` version (and I'm not 100% certain I've avoided them in `_fwd`, to be honest; there are probably bugs in the transition between ASCII and Unicode). I'm not sure this is an ideal approach anyway, so I figured I'd report the issue rather than spend more time debugging it.
@thomcc Thanks for digging into this! I don't have the bandwidth to dive into it right now, but I will at least look at it when I get back around to releasing 1.0, in order to minimize context switching. If you do want to submit a PR with your current work, that sounds good to me given the 10% loss but 30% gain. More broadly, I certainly agree that optimizing for the ASCII case makes sense; I'd be happy to do that even if it makes the fully general Unicode case a good deal worse.