Improve .chars().count() #37888

bluss · 2016-11-19T23:12:53Z

Use a simpler loop to count the `char`s of a string: count the
number of non-continuation bytes. Use `count += <conditional>`, which the
compiler understands well and can apply loop optimizations to.
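
For readers who want to see the shape of the idea, here is a minimal sketch of counting non-continuation bytes with `count += <conditional>`. It is an illustration, not the code from this PR: the names `is_continuation_byte` and `count_chars` are hypothetical, and the only assumption is that the input is valid UTF-8 (which `&str` guarantees).

```rust
/// A UTF-8 continuation byte has the bit pattern 10xxxxxx.
/// (Hypothetical helper name, used only for this sketch.)
fn is_continuation_byte(b: u8) -> bool {
    (b & 0b1100_0000) == 0b1000_0000
}

/// Count the chars of a string by counting every byte that is NOT a
/// continuation byte. Each char contributes exactly one such byte
/// (its first byte), so the two counts are equal for valid UTF-8.
fn count_chars(s: &str) -> usize {
    let mut count = 0;
    for &b in s.as_bytes() {
        // Branch-free accumulation: the bool is cast to 0 or 1.
        count += !is_continuation_byte(b) as usize;
    }
    count
}
```

Calling `count_chars(s)` should agree with `s.chars().count()` for any `&str`. The loop body has no branches, which is the kind of shape a compiler can unroll or vectorize.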

Benchmark descriptions and results for two configurations:

ascii: ascii text
cy: cyrillic text
jp: japanese text
words ascii: counting each split_whitespace item from the ascii text
words jp: counting each split_whitespace item from the jp text

x86-64 rustc -Copt-level=3
 name               orig_ ns/iter      cmov_ ns/iter      diff ns/iter   diff % 
 count_ascii        1,453 (1755 MB/s)  1,398 (1824 MB/s)           -55   -3.79% 
 count_cy           5,990 (856 MB/s)   2,545 (2016 MB/s)        -3,445  -57.51% 
 count_jp           3,075 (1169 MB/s)  1,772 (2029 MB/s)        -1,303  -42.37% 
 count_words_ascii  4,157 (521 MB/s)   1,797 (1205 MB/s)        -2,360  -56.77% 
 count_words_jp     3,337 (1071 MB/s)  1,772 (2018 MB/s)        -1,565  -46.90%

x86-64 rustc -Ctarget-feature=+avx -Copt-level=3
 name               orig_ ns/iter      cmov_ ns/iter      diff ns/iter   diff % 
 count_ascii        1,444 (1766 MB/s)  763 (3343 MB/s)            -681  -47.16% 
 count_cy           5,871 (874 MB/s)   1,527 (3360 MB/s)        -4,344  -73.99% 
 count_jp           2,874 (1251 MB/s)  1,073 (3351 MB/s)        -1,801  -62.67% 
 count_words_ascii  4,131 (524 MB/s)   1,871 (1157 MB/s)        -2,260  -54.71% 
 count_words_jp     3,253 (1099 MB/s)  1,331 (2686 MB/s)        -1,922  -59.08%

I briefly explored a more involved blocked algorithm (looking at 8 or more bytes at a time),
but the code in this PR was always winning count_words_ascii in particular (counting
many small strings); this solution is an improvement without tradeoffs.
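
To make "blocked" concrete, the following is one possible sketch of looking at 8 bytes at a time. It is an assumption about what such an algorithm could look like, not the variant that was actually explored here; the name `count_chars_blocked` is hypothetical.

```rust
/// Count chars by examining 8 bytes per step (hypothetical sketch).
/// Continuation bytes (10xxxxxx) are counted and subtracted from the length.
fn count_chars_blocked(s: &str) -> usize {
    // One low bit set in every byte of the word.
    const LSB: u64 = 0x0101_0101_0101_0101;

    let bytes = s.as_bytes();
    let mut chunks = bytes.chunks_exact(8);
    let mut continuation = 0usize;

    for chunk in &mut chunks {
        let mut buf = [0u8; 8];
        buf.copy_from_slice(chunk);
        // Byte order does not matter, since we only count matching bytes.
        let w = u64::from_ne_bytes(buf);
        // For each byte: bit 7 set AND bit 6 clear, moved to the low bit.
        let marks = (w >> 7) & (!w >> 6) & LSB;
        continuation += marks.count_ones() as usize;
    }

    // Tail of fewer than 8 bytes: fall back to the simple per-byte loop.
    for &b in chunks.remainder() {
        continuation += ((b & 0xC0) == 0x80) as usize;
    }

    bytes.len() - continuation
}
```

The extra setup (chunking, tail handling) is a fixed cost per call, which is one plausible reason a blocked version can lose on many small strings such as the `count_words_ascii` case.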


rust-highfive · 2016-11-19T23:13:04Z

r? @brson

(rust_highfive has picked a reviewer for you, use r? to override)

bluss · 2016-11-19T23:13:42Z

Benchmark file https://gist.github.com/bluss/8da7bbb160050299292ab19ed36424ed

alexcrichton · 2016-11-20T19:38:53Z

@bors: r+

Nice wins!

bors · 2016-11-20T19:38:54Z

📌 Commit 5a3aa2f has been approved by alexcrichton

bors · 2016-11-20T23:06:53Z

⌛ Testing commit 5a3aa2f with merge fc2373c...


bors · 2016-11-21T02:31:45Z

llogiq · 2016-11-28T23:21:13Z

I'm curious – bytecount is much faster than anything else at counting bytes, and should be adaptable to this situation (count bytes lower than 128) without perf loss.

bluss · 2016-11-28T23:29:13Z

Go ahead and experiment. My comment was:

I briefly explored a more involved blocked algorithm (looking at 8 or more bytes at a time),
but the code in this PR was always winning count_words_ascii in particular (counting
many small strings); this solution is an improvement without tradeoffs.

I'm leaving the door open to such improvements, but I suggest looking out for the small-input case as well.

bluss · 2016-11-28T23:32:49Z

Oh, by the way, @llogiq, did you see this comment? I wanted to tell you, due to its possible application in bytecount, that it can be beneficial (it was to me) to use this kind of raw pointer solution instead of computing separate slice parts up front. (Edit: Oh, I now see why you couldn't possibly see that comment).

bluss · 2016-11-29T00:14:19Z

By the way, it's not counting just bytes lower than 128, but any (non-)continuation byte.
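
A small sketch of the difference, using hypothetical function names: bytes lower than 128 are only the ASCII bytes, while non-continuation bytes also include the leading byte of every multi-byte sequence.

```rust
/// Counts only ASCII bytes (values below 128). This undercounts chars
/// whenever the string contains multi-byte sequences.
fn count_below_128(s: &str) -> usize {
    s.bytes().filter(|&b| b < 128).count()
}

/// Counts every byte that is not a continuation byte (not 10xxxxxx).
/// This includes ASCII bytes and the leading bytes 11xxxxxx.
fn count_non_continuation(s: &str) -> usize {
    s.bytes().filter(|&b| (b & 0xC0) != 0x80).count()
}
```

For `"héllo"`, `é` is encoded as `0xC3 0xA9`: `count_below_128` gives 4, while `count_non_continuation` gives 5, which matches `"héllo".chars().count()`.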

str: Improve .chars().count()

5a3aa2f


rust-highfive assigned brson Nov 19, 2016

bors merged commit 5a3aa2f into rust-lang:master Nov 21, 2016

bluss deleted the chars-count branch November 21, 2016 10:05

brson added the relnotes label Nov 22, 2016

fflorent mentioned this pull request May 28, 2017

Calculating columns is slow fflorent/nom_locate#4

Closed

CAD97 mentioned this pull request Dec 20, 2017

Add method to count UTF-8 chars llogiq/bytecount#12

Closed
