runtime: don't allocate for non-escaping conversions to interface{} #8618
Comments
This thread explains how this issue causes 2 allocs on every call to os.(*File).Write. |
Dmitry started this in CL 3503. Note that this requires improved escape analysis. |
CL https://golang.org/cl/35554 mentions this issue. |
For reference: until this is fixed, some of us use an ad-hoc printf-style mini-language to do text formatting in hot code paths without allocations. For example, if in fmt speak you have
    s := fmt.Sprintf("hello %d %s %x", 1, "world", []byte("data"))
the analog would be
    buf := xfmt.Buffer{}
    buf.S("hello ").D(1).C(' ').S("world").C(' ').Xb([]byte("data"))
    s := buf.Bytes()
It is a bit uglier, but it runs faster and without allocations.
Details: https://lab.nexedi.com/kirr/go123/commit/1aa677c8 |
Note that when the arguments are constants, they no longer allocate on tip, so this is a bit better than it was. |
@josharian thanks for the feedback. For reference, the above benchmark was for tip ( |
What is the status of this issue? Is anyone working on it? |
@bnjjj I'm not aware of anybody actively working on this. |
EncodeEscapedChar (which is called in EncodeSQLStringWithFlags) is pretty optimized, but for escaping a multibyte character it was using fmt.Fprintf, which means every multibyte character ended up on the heap due to golang/go#8618. This had a noticeable impact in changefeed benchmarking. This commit just hand-compiles the two formatting strings that were being used into reasonably efficient Go, eliminating the allocs. Benchmark encoding the first 10000 runes shows a 4x speedup:
Before: BenchmarkEncodeNonASCIISQLString-16 944 1216130 ns/op
After: BenchmarkEncodeNonASCIISQLString-16 3468 300777 ns/op
Release note: None
88425: colexec: use tree.DNull when projection is called on null input r=DrewKimball a=DrewKimball
Most projections skip rows for which one or more arguments are null, and just output a null for those rows. However, some projections can actually accept null arguments. Previously, we were using the values from the vec even when the `Nulls` bitmap was set for that row, which invalidates the data in the vec for that row. This could cause a non-null value to be unexpectedly concatenated to an array when an argument was null (nothing should be added to the array in this case). This commit modifies the projection operators that operate on datum-backed vectors to explicitly set the argument to `tree.DNull` in the case when the `Nulls` bitmap is set. This ensures that the projection is not performed with the invalid (and arbitrary) value in the datum vec at that index.
Fixes #87919
Release note (bug fix): Fixed a bug in `Concat` projection operators for arrays that could cause non-null values to be added to the array when one of the arguments was null.
88671: util: avoid allocations when escaping multibyte characters r=[miretskiy,yuzefovich] a=HonoreDB
EncodeEscapedChar (which is called in EncodeSQLStringWithFlags) is pretty optimized, but for escaping a multibyte character it was using fmt.Fprintf, which means every multibyte character ended up on the heap due to golang/go#8618. This had a noticeable impact in changefeed benchmarking. This commit just hand-compiles the two formatting strings that were being used into reasonably efficient Go, eliminating the allocs. Benchmark encoding the first 10000 runes shows a 4x speedup:
Before: BenchmarkEncodeNonASCIISQLString-16 944 1216130 ns/op
After: BenchmarkEncodeNonASCIISQLString-16 3468 300777 ns/op
Release note: None
Co-authored-by: DrewKimball <[email protected]>
Co-authored-by: Aaron Zinger <[email protected]>
FWIW, I was curious what the current gap is for avoiding allocations with the fmt print function arguments. As far as I could see, as of Go 1.21 there are ~6 reasons why the argument in a simple call like this is heap allocated:
    val := 1000
    fmt.Sprintf("%d", val)
I did an exploratory pass on some possible solutions a couple of months ago, which I recently sent as a stack of CLs. Some of the CLs are likely wrong, but part of my hope is that if someone helps sketch the contours of a problem, other people are frequently better able to jump in with solutions or their own explorations (whether those "other people" are experts, or just other people who are curious).
Brief summary of the ~6 reasons:
By the end of my first-cut stack (as of CL 528538), arguments to Sprintf like the Point struct here no longer get heap allocated:
    type Point struct {x, y int}
    p := Point{1, 2}
    fmt.Sprintf("%v", p)
Some of those CLs are more-or-less shots in the dark, but I tried to put some context in the CL descriptions and elsewhere in the hopes of other gophers jumping in with alternative solutions (or suggestions for corner cases to test, or questions, or ideas for improvement and so on...). The CLs all pass all.bash and pass the older TryBots (but hit some LUCI-specific issues with the new TryBots).
If you are new to the escape analysis code:
|
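For anyone who wants to check the two examples above on their own toolchain, here is a minimal sketch that counts allocations per call with testing.AllocsPerRun. The helper name `measure` and the package-level `sink` are hypothetical; the numbers it reports depend on the Go version and include the allocation for the returned string, so no expected output is given.

```go
package main

import (
	"fmt"
	"testing"
)

// Point matches the struct used in the example above.
type Point struct{ x, y int }

// sink keeps the results alive so the calls are not optimized away.
var sink string

// measure reports average allocations per call for the two Sprintf examples.
func measure() (intAllocs, pointAllocs float64) {
	val := 1000
	p := Point{1, 2}
	intAllocs = testing.AllocsPerRun(100, func() {
		sink = fmt.Sprintf("%d", val)
	})
	pointAllocs = testing.AllocsPerRun(100, func() {
		sink = fmt.Sprintf("%v", p)
	})
	return intAllocs, pointAllocs
}

func main() {
	a, b := measure()
	fmt.Printf("Sprintf(%%d, int): %.0f allocs/op\n", a)
	fmt.Printf("Sprintf(%%v, Point): %.0f allocs/op\n", b)
}
```

Building the same file with `go build -gcflags=-m` additionally prints the compiler's escape analysis decisions, which shows whether `val` and `p` are reported as escaping to the heap.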
Change https://go.dev/cl/524938 mentions this issue: |
Change https://go.dev/cl/528535 mentions this issue: |
Change https://go.dev/cl/530095 mentions this issue: |
Change https://go.dev/cl/530096 mentions this issue: |
Change https://go.dev/cl/530097 mentions this issue: |
Change https://go.dev/cl/524944 mentions this issue: |
Change https://go.dev/cl/524945 mentions this issue: |
Change https://go.dev/cl/528539 mentions this issue: |
Change https://go.dev/cl/524937 mentions this issue: |
Change https://go.dev/cl/529575 mentions this issue: |
@thepudds I noticed this problem 10 years ago, and I really appreciate your work on it today! |
This is part of a series of CLs that aim to reduce how often interface arguments escape for the print functions in fmt.
Currently, method values are one of two reasons reflect.Value.Interface always escapes its reflect.Value. Our later CLs modify behavior around method values, so we add some tests of function formatting (including method values) to help reduce the chances of breaking behavior later.
We also add in some allocation tests focused on interface arguments for the print functions. These currently do not show any improvements compared to Go 1.21.
These tests were originally in a later CL in our stack (CL 528538), but we split them out into this CL and moved them earlier in the stack.
Updates #8618
Change-Id: Iec51abc3b7f86a2711e7497fc2fb7a678b9f8f73
Reviewed-on: https://go-review.googlesource.com/c/go/+/529575
Reviewed-by: Carlos Amedee <[email protected]>
Auto-Submit: Ian Lance Taylor <[email protected]>
LUCI-TryBot-Result: Go LUCI <[email protected]>
Reviewed-by: Ian Lance Taylor <[email protected]>
Are there any interim brute-force solutions to this? I see one here. As another option, could we use generics? For example, instead of a variadic function, a family of fixed-arity functions:
    func SprintfN1[T1 any](format string, x1 T1) string { ... }
    func SprintfN2[T1, T2 any](format string, x1 T1, x2 T2) string { ... }
    ...
    func SprintfN10 // or any practical limit on the number of parameters
Each of these would do everything doPrint does, except that it has a concrete list of arguments taken by value, so no escaping would be incurred. This function is at the core of so many logging/tracing invocations in production systems (and is responsible for a good chunk of allocations) that having a brute-force but working solution could be acceptable until this problem is solved fundamentally. |
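A minimal sketch of the fixed-arity idea follows. It is deliberately far from "everything doPrint does": it only understands the `%v` verb and a handful of argument types, and the `Arg` constraint and `appendArg` helper are hypothetical names introduced for illustration. Whether the `any(x)` conversion inside the type switch stays off the heap depends on how the compiler instantiates the generic function, so treat this as a shape to experiment with rather than a guaranteed zero-allocation implementation.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Arg is a hypothetical constraint limiting this sketch to a few concrete types.
type Arg interface {
	int | int64 | string
}

// appendArg appends the default text form of x to dst.
func appendArg[T Arg](dst []byte, x T) []byte {
	switch v := any(x).(type) {
	case int:
		return strconv.AppendInt(dst, int64(v), 10)
	case int64:
		return strconv.AppendInt(dst, v, 10)
	case string:
		return append(dst, v...)
	}
	return dst
}

// SprintfN2 substitutes x1 and x2 for the first and second "%v" in format.
// Any further "%v" is reported as missing, loosely mimicking fmt.
func SprintfN2[T1, T2 Arg](format string, x1 T1, x2 T2) string {
	dst := make([]byte, 0, len(format)+16)
	for n := 0; ; n++ {
		i := strings.Index(format, "%v")
		if i < 0 {
			break
		}
		dst = append(dst, format[:i]...)
		format = format[i+2:]
		switch n {
		case 0:
			dst = appendArg(dst, x1)
		case 1:
			dst = appendArg(dst, x2)
		default:
			dst = append(dst, "%!v(MISSING)"...)
		}
	}
	dst = append(dst, format...)
	return string(dst)
}

func main() {
	fmt.Println(SprintfN2("hello %v %v", 1, "world"))
}
```

The trade-off is the one noted above: one function per arity, plus a reimplementation of verb parsing, in exchange for arguments that are passed by value with their concrete types.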