-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unicode/utf8: optimize utf8.Valid with AVX2 instructions on AMD64 #63347
Comments
This is not an API change, so taking it out of the proposal process. |
The https://go.dev/wiki/AssemblyPolicy page is relevant here. |
Probably related to #47120 |
Thanks for refining the category @ianlancetaylor, and for pointing out the Assembly Policy (somehow I had not found it); the document is extremely useful to help answer out some of the questions I had:
Based on that example, it seems that the change would be worth it; ~900% increase in throughput is measurable and likely beneficial to any application that uses
The implementation was generated with Avo, I will look for examples of how/if Avo has been used in the stdlib (for example, I'm not sure if we should add the Avo generator to the stdlib code somewhere). If anyone has material to share on this topic it will be very useful.
This pretty much covers all the questions I had about how to shape the implementation and what kind of testing will be expected 👍
|
cc @golang/security |
@achille-roussel avo has already been used in crypto/internal for example. |
Thank @arl, this is exactly the example I was looking for 👍 |
Change https://go.dev/cl/535838 mentions this issue: |
It would be interesting to instrument utf8.IsValid to build a histogram of the lengths of strings passed to it, run a handful of applications that use IsValid, and multiply each bucket by its expected benefit according to the graph above. If all strings are short, the benefit will be negative, but it wouldn't take many longer strings to make the balance significantly positive. My guess is that this call in protobuf alone might justify the new implementation. |
That's a great suggestion! Note that the graph I posted above is a bit outdated, in the CL https://go-review.googlesource.com/c/go/+/535838 the implementation was improved and doesn't suffer regressions on small strings anymore. @pelletier do you have an updated graph that would be more representative of the latest code version? |
Here's an updated graph generated on my machine: If anyone is interested in reproducing the results, I've uploaded a small python script I've used to generate this plot: https://gist.github.com/pelletier/46320448776c9ba9163270130ef556aa For context, the graph initially posted on this issue showed the AVX2 version being slower for pure Go for inputs shorter than 32 bytes. This was caused because the initial implementation called the pure Go version of Technically the benchmark above reports the AVX2 version being slower for empty inputs, and I haven't benchmarked it on inputs of sizes 1 through 7. Here are the first few lines of the benchstat output:
|
UTF-8 validation is employed in many applications using text formats such as JSON.
An implementation of Ridiculously fast unicode (UTF-8) validation has been available in https://github.com/segmentio/asm/tree/main/utf8 since early 2022; authored by @pelletier and myself, it was published under MIT license.
The algorithm described in the research paper yields up to 10x throughput on UTF-8 validation of medium to large strings, as shown in this preliminary comparison:
The algorithm also scales much better than the current implementation of
utf8.Valid
as the size of input grows, as shown by this graph plotting the compute time for the two versions for input sizes of 0 to 400 bytes:The main trade-off here is complexity; while the code has been fuzz-tested against the current version of
utf8.Valid
, as well as successfully used on production workloads, a bug will always be possible in this implementation or on other platforms that would be added later on.I looked for prior discussions on improving the performance of
unicode/utf8
, but could not find much content on the topic. This proposal will likely set a precedent for porting the algorithm to other architectures, if accepted.Part of the requirements to integrate it with the standard library could be to expand test coverage in the
unicode/utf8
package to ensure the correctness of this implementation and hypothetical future additions for other architectures.The text was updated successfully, but these errors were encountered: