-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
unicode/utf8: potential performance improvement of UTF8 validation #69915
Comments
Related Issues and Documentation
(Emoji vote if this was helpful or unhelpful; more detailed feedback welcome in this discussion.) |
I don't see any API change here, so taking this out of the proposal process. |
You've gotten performance improvements but I don't see any description of the changes you made to get those performance improvements (unless the changes are all in that screenshot?). It would be nice to post a CL, together with some Benchmarks that demonstrate the improvements (if those benchmarks are not already present in the utf8 package). |
@randall77 See the linked repo, which contains the patched code and the benchmark results. The diff is hard to see (it's not even in the history of the repo), but it really is just this: @@ -473,16 +473,22 @@ func Valid(p []byte) bool {
return false // Short or invalid.
}
accept := acceptRanges[x>>4]
+ // The change is here. Instead of reusing `size` constants 2, 3 and 4
+ // are used which leads to performance improvement of up to 41%.
if c := p[i+1]; c < accept.lo || accept.hi < c {
return false
} else if size == 2 {
+ i += 2
} else if c := p[i+2]; c < locb || hicb < c {
return false
} else if size == 3 {
+ i += 3
} else if c := p[i+3]; c < locb || hicb < c {
return false
+ } else {
+ i += 4
}
- i += size
+
}
return true
}
@@ -519,16 +525,21 @@ func ValidString(s string) bool {
return false // Short or invalid.
}
accept := acceptRanges[x>>4]
+ // The change is here. Instead of reusing `size` constants 2, 3 and 4
+ // are used which leads to performance improvement of up to 41%.
if c := s[i+1]; c < accept.lo || accept.hi < c {
return false
} else if size == 2 {
+ i += 2
} else if c := s[i+2]; c < locb || hicb < c {
return false
} else if size == 3 {
+ i += 3
} else if c := s[i+3]; c < locb || hicb < c {
return false
+ } else {
+ i += 4
}
- i += size
}
return true
} |
I guess the speedup comes from https://cs.opensource.google/go/go/+/refs/tags/go1.23.2:src/unicode/utf8/utf8.go;l=508 size := int(x & 7) and the compiler can't know that 5,6,7 are never supposed to happen (vs hardcoding += 4) nor if si < RuneSelf { |
I think you should make a CL but also add comments to explain why it works / why +4 is the only possible case in the else also I wonder if you could benchmark that vs a switch (which would be more readable and equivalent I think) |
DFA implementation from #47120 looks much simpler than existing one. |
I'm not sure I understand how that change makes things any faster, but if there are benchmarks that demonstrate it I'm ok with it. An aside, feel free to ignore...
|
Proposal Details
I've been experimenting with different kinds of optimization techniques in order to speed up utf8.Valid and utf8.ValidString and I'd like to share my findings that appear to be low-hanging fruit, at least on ARM64:
The change is minimal but yields performance improvements of up to 41% on Apple M1 Max for all-UTF8 inputs while not showing any significant negative side-effects. Though on AMD Ryzen 7 5700x the geomean is significantly lower at just -1.43%, with some benchmarks even showing significant performance degradation⚠️ .
The test setup with benchmarks can be found here: https://github.com/romshark/utf8valid
Relation to Other Proposals
It might be related to #47120 but I haven't had the opportunity to dig into the details of Russ' proposal. At first glance it looks like Russ tackled exactly that problem and my proposal may just be a duplicate.
Compared to #63347 this proposal doesn't require changes of significant complexity, it's a 9-line code change.
The text was updated successfully, but these errors were encountered: