Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zap.Any: Reduce stack size with generics #1310

Merged
merged 1 commit into from
Aug 1, 2023

Conversation

abhinav
Copy link
Collaborator

@abhinav abhinav commented Jul 30, 2023

Yet another attempt at reducing the stack size of zap.Any,
borrowing from #1301, #1303, #1304, #1305, #1307, and #1308.

This approach defines a generic data type for field constructors
of a specific type. This is similar to the lookup map in #1307,
minus the map lookup, the interface match, or reflection.

type anyFieldC[T any] func(string, T) Field

The generic data type provides a non-generic method
matching the interface:

interface{ Any(string, any) Field }

Stack size:
The stack size of zap.Any following this change is 0xc0 (192 bytes).

% go build -gcflags -S 2>&1 | grep ^go.uber.org/zap.Any
go.uber.org/zap.Any STEXT size=5861 args=0x20 locals=0xc0 funcid=0x0 align=0x0

This is just 8 bytes more than #1305,
which is the smallest stack size of all other attempts.

Allocations:
Everything appears to get inlined with no heap escapes:

% go build -gcflags -m 2>&1 |
  grep field.go |
  perl -n -e 'next unless m{^./field.go:(\d+)}; print if ($1 >= 413)' |
  grep 'escapes'
[no output]

(Line 413 declares anyFieldC)

Besides that, the output of -m for the relevant section of code
consists of almost entirely:

./field.go:415:6: can inline anyFieldC[go.shape.bool].Any
./field.go:415:6: can inline anyFieldC[go.shape.[]bool].Any
./field.go:415:6: can inline anyFieldC[go.shape.complex128].Any
[...]
./field.go:415:6: inlining call to anyFieldC[go.shape.complex128].Any
./field.go:415:6: inlining call to anyFieldC[go.shape.[]bool].Any
./field.go:415:6: inlining call to anyFieldC[go.shape.bool].Any

Followed by:

./field.go:428:10: leaking param: key
./field.go:428:22: leaking param: value

Maintainability:
Unlike some of the other approaches, this variant is more maintainable.
The zap.Any function looks roughly the same.
Adding new branches there is obvious, and requires no duplication.

Performance:
This is a net improvement against master on BenchmarkAny's
log-go checks that log inside a new goroutine.

name                           old time/op    new time/op    delta
Any/string/field-only/typed      25.2ns ± 1%    25.6ns ± 2%     ~     (p=0.460 n=5+5)
Any/string/field-only/any        56.9ns ± 3%    79.4ns ± 0%  +39.55%  (p=0.008 n=5+5)
Any/string/log/typed             1.47µs ± 0%    1.49µs ± 4%   +1.58%  (p=0.016 n=4+5)
Any/string/log/any               1.53µs ± 2%    1.55µs ± 1%   +1.37%  (p=0.016 n=5+5)
Any/string/log-go/typed          5.97µs ± 6%    5.99µs ± 1%     ~     (p=0.151 n=5+5)
Any/string/log-go/any            10.9µs ± 0%     6.2µs ± 0%  -43.32%  (p=0.008 n=5+5)
Any/stringer/field-only/typed    25.3ns ± 1%    25.5ns ± 1%   +1.09%  (p=0.008 n=5+5)
Any/stringer/field-only/any      85.5ns ± 1%   124.5ns ± 0%  +45.66%  (p=0.008 n=5+5)
Any/stringer/log/typed           1.43µs ± 1%    1.42µs ± 2%     ~     (p=0.175 n=4+5)
Any/stringer/log/any             1.50µs ± 1%    1.56µs ± 6%   +4.20%  (p=0.008 n=5+5)
Any/stringer/log-go/typed        5.94µs ± 0%    5.92µs ± 0%   -0.40%  (p=0.032 n=5+5)
Any/stringer/log-go/any          11.1µs ± 2%     6.3µs ± 0%  -42.93%  (p=0.008 n=5+5)

name                           old alloc/op   new alloc/op   delta
Any/string/field-only/typed       0.00B          0.00B          ~     (all equal)
Any/string/field-only/any         0.00B          0.00B          ~     (all equal)
Any/string/log/typed              64.0B ± 0%     64.0B ± 0%     ~     (all equal)
Any/string/log/any                64.0B ± 0%     64.0B ± 0%     ~     (all equal)
Any/string/log-go/typed            112B ± 0%      112B ± 0%     ~     (all equal)
Any/string/log-go/any              128B ± 0%      128B ± 0%     ~     (all equal)
Any/stringer/field-only/typed     0.00B          0.00B          ~     (all equal)
Any/stringer/field-only/any       0.00B          0.00B          ~     (all equal)
Any/stringer/log/typed            64.0B ± 0%     64.0B ± 0%     ~     (all equal)
Any/stringer/log/any              64.0B ± 0%     64.0B ± 0%     ~     (all equal)
Any/stringer/log-go/typed          112B ± 0%      112B ± 0%     ~     (all equal)
Any/stringer/log-go/any            128B ± 0%      128B ± 0%     ~     (all equal)

name                           old allocs/op  new allocs/op  delta
Any/string/field-only/typed        0.00           0.00          ~     (all equal)
Any/string/field-only/any          0.00           0.00          ~     (all equal)
Any/string/log/typed               1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Any/string/log/any                 1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Any/string/log-go/typed            3.00 ± 0%      3.00 ± 0%     ~     (all equal)
Any/string/log-go/any              3.00 ± 0%      3.00 ± 0%     ~     (all equal)
Any/stringer/field-only/typed      0.00           0.00          ~     (all equal)
Any/stringer/field-only/any        0.00           0.00          ~     (all equal)
Any/stringer/log/typed             1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Any/stringer/log/any               1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Any/stringer/log-go/typed          3.00 ± 0%      3.00 ± 0%     ~     (all equal)
Any/stringer/log-go/any            3.00 ± 0%      3.00 ± 0%     ~     (all equal)

It causes a regression in "field-only"
which calls the field constructor and discards the result
without using it in a logger.
I believe this is acceptable because that's not a real use case;
we expect the result to be used with a logger.

@codecov
Copy link

codecov bot commented Jul 30, 2023

Codecov Report

Merging #1310 (66c34c2) into master (ae3953e) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #1310   +/-   ##
=======================================
  Coverage   98.08%   98.09%           
=======================================
  Files          50       50           
  Lines        3242     3249    +7     
=======================================
+ Hits         3180     3187    +7     
  Misses         53       53           
  Partials        9        9           
Files Changed Coverage Δ
field.go 100.00% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@abhinav abhinav requested review from prashantv and rabbbit July 30, 2023 17:50
field.go Show resolved Hide resolved
field.go Show resolved Hide resolved
Copy link
Contributor

@rabbbit rabbbit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice. I did wonder if generics could help here, but this is clearly beyond me.

For the fun of it, I'll try running this later on arm and amd to compare against previous numbers. I'll also try running the current head against gotip just to verify that this didn't change and that go1.22 doesn't change anything.

After Prashant suggestion,I also planned pulling the benchmarks into a separate PR and adding some other cases. Stay tuned.

rabbbit added a commit that referenced this pull request Jul 30, 2023
This is a prefactor for #1301, #1302, #1304, #1305, #1307, #1308 and #1310.

We're writing various approaches to reduce the stock size and it's
painful to keep copy-pasting the tests between PRs. This was suggested
in @prashantv in #1307.

The tests are mostly based on tests in #1303, but made "more generic",
as #1307 we might want to test across more than just a single type.
It does make the tests a bit harder to setup. Some of the setup is
inconvenient (we duplicate the value in both `typed` and `any` version
of the tests) but hopefully okay to understand. A fully non-duplicated
alternative would likely require something like #1310 itself.

For #1307 in particular a test against interface type would likely be
needed, so adding it here too.

The tests compare two code paths, with the same arguments, one using a
strongly typed method and second using `zap.Any`. We have:
- a simple "create field" case for a baseline
- a "create and log" case for a realistic case (we typically log the fields)
- a "create and log in a goroutine" case for the pathological case
  we're trying to solve for.
- a "create and long in goroutine in a pre-warmed system" that does the
  above, but before tries to affect the starting goroutine stack size
  to provide an realistic example.
  Without this, for any tests with 2+ goroutines, the cost of `zap.Any`
  is not visible, as we always end up expanding the stack even in the
  strongly typed methods.

The test results are:
```
goos: darwin
goarch: arm64
pkg: go.uber.org/zap
BenchmarkAny/string-typ-no-logger               166879518                6.988 ns/op           0 B/op          0 allocs/op
BenchmarkAny/string-typ-no-logger-12            167398297                6.973 ns/op           0 B/op          0 allocs/op
BenchmarkAny/string-any-no-logger               87669631                13.97 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string-any-no-logger-12            86760837                14.11 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string-typ-logger                   3059485               395.5 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string-typ-logger-12                3141176               379.7 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string-any-logger                   2995699               401.3 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string-any-logger-12                3071046               391.1 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string-typ-logger-go                 784323              1351 ns/op             146 B/op          2 allocs/op
BenchmarkAny/string-typ-logger-go-12             2000835               613.9 ns/op            96 B/op          2 allocs/op
BenchmarkAny/string-any-logger-go                 477486              2479 ns/op             117 B/op          2 allocs/op
BenchmarkAny/string-any-logger-go-12             1830955               680.0 ns/op           112 B/op          2 allocs/op
BenchmarkAny/string-typ-logger-go-stack           841566              1328 ns/op              96 B/op          2 allocs/op
BenchmarkAny/string-typ-logger-go-stack-12       2625226               479.6 ns/op            96 B/op          2 allocs/op
BenchmarkAny/string-any-logger-go-stack           486084              2493 ns/op             112 B/op          2 allocs/op
BenchmarkAny/string-any-logger-go-stack-12       2658640               667.9 ns/op           112 B/op          2 allocs/op
BenchmarkAny/stringer-typ-no-logger             147314238                8.034 ns/op           0 B/op          0 allocs/op
BenchmarkAny/stringer-typ-no-logger-12          157857937                7.436 ns/op           0 B/op          0 allocs/op
BenchmarkAny/stringer-any-no-logger             58872349                20.19 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer-any-no-logger-12          60532305                20.27 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer-typ-logger                 3094204               411.7 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer-typ-logger-12              3163489               383.7 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer-any-logger                 2981940               427.2 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer-any-logger-12              2777792               394.0 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer-typ-logger-go               911761              1335 ns/op              96 B/op          2 allocs/op
BenchmarkAny/stringer-typ-logger-go-12           2006440               605.2 ns/op            96 B/op          2 allocs/op
BenchmarkAny/stringer-any-logger-go               467934              2518 ns/op             112 B/op          2 allocs/op
BenchmarkAny/stringer-any-logger-go-12           1786076               683.1 ns/op           112 B/op          2 allocs/op
BenchmarkAny/stringer-typ-logger-go-stack         855794              1316 ns/op              96 B/op          2 allocs/op
BenchmarkAny/stringer-typ-logger-go-stack-12     2598783               434.5 ns/op            96 B/op          2 allocs/op
BenchmarkAny/stringer-any-logger-go-stack         473282              2474 ns/op             112 B/op          2 allocs/op
BenchmarkAny/stringer-any-logger-go-stack-12     2020183               651.9 ns/op           112 B/op          2 allocs/op
PASS
ok      go.uber.org/zap 53.516s
```
@rabbbit rabbbit mentioned this pull request Jul 30, 2023
rabbbit added a commit that referenced this pull request Aug 1, 2023
This is a prefactor for #1301, #1302, #1304, #1305, #1307, #1308 and
#1310.

We're writing various approaches to reduce the stock size and it's
painful to keep copy-pasting the tests between PRs. This was suggested
in @prashantv in #1307.

The tests are mostly based on tests in #1303, but made "more generic",
as #1307 we might want to test across more than just a single type. It
does make the tests a bit harder to setup. Some of the setup is
inconvenient (we duplicate the value in both `typed` and `any` version
of the tests) but hopefully okay to understand. A fully non-duplicated
alternative would likely require something like #1310 itself.

For #1307 in particular a test against interface type would likely be
needed, so adding it here too.

The tests compare two code paths, with the same arguments, one using a
strongly typed method and second using `zap.Any`. We have:
- a simple "create field" case for a baseline
- a "create and log" case for a realistic case (we typically log the
fields)
- a "create and log in a goroutine" case for the pathological case we're
trying to solve for.
- -a "create and long in goroutine in a pre-warmed system" that does the
above- we decided it's not worth the complication.

The test results are:
```
❯  go test -bench BenchmarkAny -benchmem -cpu 1
goos: darwin
goarch: arm64
pkg: go.uber.org/zap
BenchmarkAny/string/field-only/typed    161981473                7.374 ns/op           0 B/op          0 allocs/op
BenchmarkAny/string/field-only/any      82343354                14.67 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string/log/typed            2965648               416.1 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log/any              2920292               418.8 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log-go/typed         1000000              1158 ns/op             112 B/op          3 allocs/op
BenchmarkAny/string/log-go/any            553144              2152 ns/op             128 B/op          3 allocs/op
BenchmarkAny/stringer/field-only/typed  160509367                7.548 ns/op           0 B/op          0 allocs/op
BenchmarkAny/stringer/field-only/any    51330402                23.45 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer/log/typed          3221404               377.0 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log/any            2726443               393.5 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log-go/typed       1000000              1129 ns/op             112 B/op          3 allocs/op
BenchmarkAny/stringer/log-go/any          558602              2147 ns/op             128 B/op          3 allocs/op
PASS
ok      go.uber.org/zap 19.426s
```

On gotip:
```
❯  gotip test -bench BenchmarkAny -benchmem -cpu 1

goos: darwin
goarch: arm64
pkg: go.uber.org/zap
BenchmarkAny/string/field-only/typed    155084869                7.603 ns/op           0 B/op          0 allocs/op
BenchmarkAny/string/field-only/any      82740788                14.55 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string/log/typed            2800495               411.6 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log/any              2896258               411.7 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log-go/typed         1000000              1155 ns/op             112 B/op          3 allocs/op
BenchmarkAny/string/log-go/any            551599              2168 ns/op             128 B/op          3 allocs/op
BenchmarkAny/stringer/field-only/typed  159505488                7.578 ns/op           0 B/op          0 allocs/op
BenchmarkAny/stringer/field-only/any    51406354                23.78 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer/log/typed          3011210               388.6 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log/any            3010370               395.1 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log-go/typed       1000000              1161 ns/op             112 B/op          3 allocs/op
BenchmarkAny/stringer/log-go/any          553860              2161 ns/op             128 B/op          3 allocs/op
PASS
ok      go.uber.org/zap 19.391s
```

on amd64 (similar, 2x worse stack growth impact)
```
 % go test -bench BenchmarkAny -benchmem -cpu 1
goos: linux
goarch: amd64
pkg: go.uber.org/zap
cpu: AMD EPYC 7B13
BenchmarkAny/string/field-only/typed    47534053                25.23 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string/field-only/any      36913526                32.57 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string/log/typed            1693508               725.7 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log/any              1576172               765.3 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log-go/typed          516832              2343 ns/op             112 B/op          3 allocs/op
BenchmarkAny/string/log-go/any            243692              4404 ns/op             128 B/op          3 allocs/op
BenchmarkAny/stringer/field-only/typed  48735537                24.73 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer/field-only/any    26115684                47.24 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer/log/typed          1761630               677.5 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log/any            1646913               705.4 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log-go/typed        534187              2275 ns/op             112 B/op          3 allocs/op
BenchmarkAny/stringer/log-go/any          273787              4348 ns/op             128 B/op          3 allocs/op
PASS
ok      go.uber.org/zap 18.890s
```
@rabbbit
Copy link
Contributor

rabbbit commented Aug 1, 2023

I ran the new benchmarks:

❯  go test -bench BenchmarkAny -benchmem -cpu 1

goos: darwin
goarch: arm64
pkg: go.uber.org/zap
BenchmarkAny/string/field-only/typed    157439328                7.418 ns/op           0 B/op          0 allocs/op
BenchmarkAny/string/field-only/any      57899338                20.86 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string/log/typed            2960004               409.2 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log/any              2793056               436.6 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log-go/typed          967309              1228 ns/op             112 B/op          3 allocs/op
BenchmarkAny/string/log-go/any           1000000              1179 ns/op             128 B/op          3 allocs/op
BenchmarkAny/stringer/field-only/typed  160436690                7.584 ns/op           0 B/op          0 allocs/op
BenchmarkAny/stringer/field-only/any    36087421                33.24 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer/log/typed          3102452               386.9 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log/any            2901717               429.1 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log-go/typed       1000000              1162 ns/op             112 B/op          3 allocs/op
BenchmarkAny/stringer/log-go/any         1002906              1153 ns/op             128 B/op          3 allocs/op
PASS
ok      go.uber.org/zap 20.298s
❯  gotip test -bench BenchmarkAny -benchmem -cpu 1

goos: darwin
goarch: arm64
pkg: go.uber.org/zap
BenchmarkAny/string/field-only/typed    156370705                7.367 ns/op           0 B/op          0 allocs/op
BenchmarkAny/string/field-only/any      57934047                21.47 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string/log/typed            2716717               442.1 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log/any              2668359               465.0 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log-go/typed         1000000              1223 ns/op             112 B/op          3 allocs/op
BenchmarkAny/string/log-go/any            964839              1223 ns/op             128 B/op          3 allocs/op
BenchmarkAny/stringer/field-only/typed  157792915                7.782 ns/op           0 B/op          0 allocs/op
BenchmarkAny/stringer/field-only/any    35110689                34.05 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer/log/typed          3024481               410.7 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log/any            2727921               445.9 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log-go/typed       1000000              1138 ns/op             112 B/op          3 allocs/op
BenchmarkAny/stringer/log-go/any         1000000              1216 ns/op             128 B/op          3 allocs/op
PASS
ok      go.uber.org/zap 19.557s
pawel@go(pawel.devpod-us-va) ~/zap
 % go test -bench BenchmarkAny -benchmem -cpu 1
goos: linux
goarch: amd64
pkg: go.uber.org/zap
cpu: AMD EPYC 7B13
BenchmarkAny/string/field-only/typed    47630637                25.28 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string/field-only/any      25443076                47.87 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string/log/typed            1578100               738.2 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log/any              1540040               784.3 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string/log-go/typed          484922              2332 ns/op             112 B/op          3 allocs/op
BenchmarkAny/string/log-go/any            489837              2414 ns/op             128 B/op          3 allocs/op
BenchmarkAny/stringer/field-only/typed  48447813                24.75 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer/field-only/any    17729155                68.85 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer/log/typed          1693166               681.6 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log/any            1609806               788.9 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer/log-go/typed        522900              2318 ns/op             112 B/op          3 allocs/op
BenchmarkAny/stringer/log-go/any          502071              2356 ns/op             128 B/op          3 allocs/op
PASS
ok      go.uber.org/zap 19.148s

Which (1) seems comparable with all the other PRs (2) shows improvement vs #1311 (apart from the field-only Any as you pointed out yourself)

Copy link
Contributor

@rabbbit rabbbit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved, but I don't think I can rebase this for you, can I?

@abhinav
Copy link
Collaborator Author

abhinav commented Aug 1, 2023

Rebased!

@@ -410,6 +410,14 @@ func Inline(val zapcore.ObjectMarshaler) Field {
}
}

type anyFieldC[T any] func(string, T) Field
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment about why we're doing all of this

ideally we'd have an upstream issue linked which describes the compiler issue

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added more details. I don't think there's an upstream compiler issue yet. I believe there are plans to write one.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue now exists, adding a link in #1387

Yet another attempt at reducing the stack size of zap.Any,
borrowing from uber-go#1301, uber-go#1303, uber-go#1304, uber-go#1305, uber-go#1307, and 1308.

This approach defines a generic data type for field constructors
of a specific type. This is similar to the lookup map in uber-go#1307,
minus the map lookup, the interface match, or reflection.

    type anyFieldC[T any] func(string, T) Field

The generic data type provides a non-generic method
matching the interface:

    interface{ Any(string, any) Field }

Stack size:
The stack size of zap.Any following this change is 0xc0.

    % go build -gcflags -S 2>&1 | grep ^go.uber.org/zap.Any
    go.uber.org/zap.Any STEXT size=5861 args=0x20 locals=0xc0 funcid=0x0 align=0x0

This is just 8 bytes more than uber-go#1305,
which is the smallest stack size of all other attempts.

Allocations:
Everything appears to get inlined with no heap escapes:

    go build -gcflags -m 2>&1 |
      grep field.go |
      perl -n -e 'next unless m{^./field.go:(\d+)}; print if ($1 >= 413)' |
      grep 'escapes'

(Line 413 declares anyFieldC)

Besides that, the output of `-m` for the relevant section of code
consists of almost entirely:

    ./field.go:415:6: can inline anyFieldC[go.shape.bool].Any
    ./field.go:415:6: can inline anyFieldC[go.shape.[]bool].Any
    ./field.go:415:6: can inline anyFieldC[go.shape.complex128].Any
    [...]
    ./field.go:415:6: inlining call to anyFieldC[go.shape.complex128].Any
    ./field.go:415:6: inlining call to anyFieldC[go.shape.[]bool].Any
    ./field.go:415:6: inlining call to anyFieldC[go.shape.bool].Any

Followed by:

    ./field.go:428:10: leaking param: key
    ./field.go:428:22: leaking param: value

Maintainability:
Unlike some of the other approaches, this variant is more maintainable.
The `zap.Any` function looks roughly the same.
Adding new branches there is obvious, and requires no duplication.

Performance:
This is a net improvement on all BenchmarkAny calls
except "any-no-logger" which calls `zap.Any` and discards the result.

```
name                        old time/op    new time/op    delta
Any/str-no-logger-2           8.77ns ± 0%    8.75ns ± 1%     ~     (p=0.159 n=4+5)
Any/any-no-logger-2           54.1ns ± 0%    81.6ns ± 0%  +50.71%  (p=0.016 n=5+4)
Any/str-with-logger-2         1.38µs ± 3%    1.38µs ± 4%     ~     (p=0.841 n=5+5)
Any/any-with-logger-2         1.60µs ±22%    1.37µs ± 1%     ~     (p=0.151 n=5+5)
Any/str-in-go-2               3.41µs ± 1%    3.42µs ± 5%     ~     (p=0.905 n=4+5)
Any/any-in-go-2               5.98µs ± 1%    3.68µs ± 6%  -38.44%  (p=0.008 n=5+5)
Any/str-in-go-with-stack-2    3.42µs ± 2%    3.46µs ± 3%     ~     (p=0.421 n=5+5)
Any/any-in-go-with-stack-2    5.98µs ± 3%    3.65µs ± 3%  -38.95%  (p=0.008 n=5+5)

name                        old alloc/op   new alloc/op   delta
Any/str-no-logger-2            0.00B          0.00B          ~     (all equal)
Any/any-no-logger-2            0.00B          0.00B          ~     (all equal)
Any/str-with-logger-2          64.0B ± 0%     64.0B ± 0%     ~     (all equal)
Any/any-with-logger-2          64.0B ± 0%     64.0B ± 0%     ~     (all equal)
Any/str-in-go-2                88.5B ± 1%     88.0B ± 0%     ~     (p=0.429 n=4+4)
Any/any-in-go-2                88.0B ± 0%     88.0B ± 0%     ~     (all equal)
Any/str-in-go-with-stack-2     88.0B ± 0%     88.0B ± 0%     ~     (all equal)
Any/any-in-go-with-stack-2     88.0B ± 0%     88.0B ± 0%     ~     (all equal)

name                        old allocs/op  new allocs/op  delta
Any/str-no-logger-2             0.00           0.00          ~     (all equal)
Any/any-no-logger-2             0.00           0.00          ~     (all equal)
Any/str-with-logger-2           1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Any/any-with-logger-2           1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Any/str-in-go-2                 2.00 ± 0%      2.00 ± 0%     ~     (all equal)
Any/any-in-go-2                 2.00 ± 0%      2.00 ± 0%     ~     (all equal)
Any/str-in-go-with-stack-2      2.00 ± 0%      2.00 ± 0%     ~     (all equal)
Any/any-in-go-with-stack-2      2.00 ± 0%      2.00 ± 0%     ~     (all equal)
```

I believe this is acceptable because that's not a real use case;
we expect the result to be used with a logger.
@abhinav abhinav merged commit a93cd4a into uber-go:master Aug 1, 2023
@abhinav abhinav deleted the any-generic branch August 1, 2023 05:00
rabbbit added a commit that referenced this pull request Aug 2, 2023
This is a prefactor for #1301, #1302, #1304, #1305, #1307, #1308 and #1310.

We're writing various approaches to reduce the stack size and it's
painful to keep copy-pasting the tests between PRs. This was suggested
in @prashantv in #1307.

The tests are mostly based on tests in #1303, but made "more generic",
as #1307 we might want to test across more than just a single type.
It does make the tests a bit harder to setup. Some of the setup is
inconvenient (we duplicate the value in both `typed` and `any` version
of the tests) but hopefully okay to understand. A fully non-duplicated
alternative would likely require something like #1310 itself.

For #1307 in particular a test against interface type would likely be
needed, so adding it here too.

The tests compare two code paths, with the same arguments, one using a
strongly typed method and second using `zap.Any`. We have:
- a simple "create field" case for a baseline
- a "create and log" case for a realistic case (we typically log the fields)
- a "create and log in a goroutine" case for the pathological case
  we're trying to solve for.
- a "create and long in goroutine in a pre-warmed system" that does the
  above, but before tries to affect the starting goroutine stack size
  to provide an realistic example.
  Without this, for any tests with 2+ goroutines, the cost of `zap.Any`
  is not visible, as we always end up expanding the stack even in the
  strongly typed methods.

The test results are:
```
goos: darwin
goarch: arm64
pkg: go.uber.org/zap
BenchmarkAny/string-typ-no-logger               166879518                6.988 ns/op           0 B/op          0 allocs/op
BenchmarkAny/string-typ-no-logger-12            167398297                6.973 ns/op           0 B/op          0 allocs/op
BenchmarkAny/string-any-no-logger               87669631                13.97 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string-any-no-logger-12            86760837                14.11 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string-typ-logger                   3059485               395.5 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string-typ-logger-12                3141176               379.7 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string-any-logger                   2995699               401.3 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string-any-logger-12                3071046               391.1 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string-typ-logger-go                 784323              1351 ns/op             146 B/op          2 allocs/op
BenchmarkAny/string-typ-logger-go-12             2000835               613.9 ns/op            96 B/op          2 allocs/op
BenchmarkAny/string-any-logger-go                 477486              2479 ns/op             117 B/op          2 allocs/op
BenchmarkAny/string-any-logger-go-12             1830955               680.0 ns/op           112 B/op          2 allocs/op
BenchmarkAny/string-typ-logger-go-stack           841566              1328 ns/op              96 B/op          2 allocs/op
BenchmarkAny/string-typ-logger-go-stack-12       2625226               479.6 ns/op            96 B/op          2 allocs/op
BenchmarkAny/string-any-logger-go-stack           486084              2493 ns/op             112 B/op          2 allocs/op
BenchmarkAny/string-any-logger-go-stack-12       2658640               667.9 ns/op           112 B/op          2 allocs/op
BenchmarkAny/stringer-typ-no-logger             147314238                8.034 ns/op           0 B/op          0 allocs/op
BenchmarkAny/stringer-typ-no-logger-12          157857937                7.436 ns/op           0 B/op          0 allocs/op
BenchmarkAny/stringer-any-no-logger             58872349                20.19 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer-any-no-logger-12          60532305                20.27 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer-typ-logger                 3094204               411.7 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer-typ-logger-12              3163489               383.7 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer-any-logger                 2981940               427.2 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer-any-logger-12              2777792               394.0 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer-typ-logger-go               911761              1335 ns/op              96 B/op          2 allocs/op
BenchmarkAny/stringer-typ-logger-go-12           2006440               605.2 ns/op            96 B/op          2 allocs/op
BenchmarkAny/stringer-any-logger-go               467934              2518 ns/op             112 B/op          2 allocs/op
BenchmarkAny/stringer-any-logger-go-12           1786076               683.1 ns/op           112 B/op          2 allocs/op
BenchmarkAny/stringer-typ-logger-go-stack         855794              1316 ns/op              96 B/op          2 allocs/op
BenchmarkAny/stringer-typ-logger-go-stack-12     2598783               434.5 ns/op            96 B/op          2 allocs/op
BenchmarkAny/stringer-any-logger-go-stack         473282              2474 ns/op             112 B/op          2 allocs/op
BenchmarkAny/stringer-any-logger-go-stack-12     2020183               651.9 ns/op           112 B/op          2 allocs/op
PASS
ok      go.uber.org/zap 53.516s
```
rabbbit pushed a commit that referenced this pull request Aug 2, 2023
Yet another attempt at reducing the stack size of zap.Any,
borrowing from #1301, #1303, #1304, #1305, #1307, and #1308.

This approach defines a generic data type for field constructors
of a specific type. This is similar to the lookup map in #1307,
minus the map lookup, the interface match, or reflection.

    type anyFieldC[T any] func(string, T) Field

The generic data type provides a non-generic method
matching the interface:

    interface{ Any(string, any) Field }

**Stack size**:
The stack size of zap.Any following this change is 0xc0 (192 bytes).

    % go build -gcflags -S 2>&1 | grep ^go.uber.org/zap.Any
    go.uber.org/zap.Any STEXT size=5861 args=0x20 locals=0xc0 funcid=0x0 align=0x0

This is just 8 bytes more than #1305,
which is the smallest stack size of all other attempts.

**Allocations**:
Everything appears to get inlined with no heap escapes:

    % go build -gcflags -m 2>&1 |
      grep field.go |
      perl -n -e 'next unless m{^./field.go:(\d+)}; print if ($1 >= 413)' |
      grep 'escapes'
    [no output]

(Line 413 declares anyFieldC)

Besides that, the output of `-m` for the relevant section of code
consists of almost entirely:

    ./field.go:415:6: can inline anyFieldC[go.shape.bool].Any
    ./field.go:415:6: can inline anyFieldC[go.shape.[]bool].Any
    ./field.go:415:6: can inline anyFieldC[go.shape.complex128].Any
    [...]
    ./field.go:415:6: inlining call to anyFieldC[go.shape.complex128].Any
    ./field.go:415:6: inlining call to anyFieldC[go.shape.[]bool].Any
    ./field.go:415:6: inlining call to anyFieldC[go.shape.bool].Any

Followed by:

    ./field.go:428:10: leaking param: key
    ./field.go:428:22: leaking param: value

**Maintainability**:
Unlike some of the other approaches, this variant is more maintainable.
The `zap.Any` function looks roughly the same.
Adding new branches there is obvious, and requires no duplication.

**Performance**:
This is a net improvement against master on BenchmarkAny's
log-go checks that log inside a new goroutine.

```
name                           old time/op    new time/op    delta
Any/string/field-only/typed      25.2ns ± 1%    25.6ns ± 2%     ~     (p=0.460 n=5+5)
Any/string/field-only/any        56.9ns ± 3%    79.4ns ± 0%  +39.55%  (p=0.008 n=5+5)
Any/string/log/typed             1.47µs ± 0%    1.49µs ± 4%   +1.58%  (p=0.016 n=4+5)
Any/string/log/any               1.53µs ± 2%    1.55µs ± 1%   +1.37%  (p=0.016 n=5+5)
Any/string/log-go/typed          5.97µs ± 6%    5.99µs ± 1%     ~     (p=0.151 n=5+5)
Any/string/log-go/any            10.9µs ± 0%     6.2µs ± 0%  -43.32%  (p=0.008 n=5+5)
Any/stringer/field-only/typed    25.3ns ± 1%    25.5ns ± 1%   +1.09%  (p=0.008 n=5+5)
Any/stringer/field-only/any      85.5ns ± 1%   124.5ns ± 0%  +45.66%  (p=0.008 n=5+5)
Any/stringer/log/typed           1.43µs ± 1%    1.42µs ± 2%     ~     (p=0.175 n=4+5)
Any/stringer/log/any             1.50µs ± 1%    1.56µs ± 6%   +4.20%  (p=0.008 n=5+5)
Any/stringer/log-go/typed        5.94µs ± 0%    5.92µs ± 0%   -0.40%  (p=0.032 n=5+5)
Any/stringer/log-go/any          11.1µs ± 2%     6.3µs ± 0%  -42.93%  (p=0.008 n=5+5)

name                           old alloc/op   new alloc/op   delta
Any/string/field-only/typed       0.00B          0.00B          ~     (all equal)
Any/string/field-only/any         0.00B          0.00B          ~     (all equal)
Any/string/log/typed              64.0B ± 0%     64.0B ± 0%     ~     (all equal)
Any/string/log/any                64.0B ± 0%     64.0B ± 0%     ~     (all equal)
Any/string/log-go/typed            112B ± 0%      112B ± 0%     ~     (all equal)
Any/string/log-go/any              128B ± 0%      128B ± 0%     ~     (all equal)
Any/stringer/field-only/typed     0.00B          0.00B          ~     (all equal)
Any/stringer/field-only/any       0.00B          0.00B          ~     (all equal)
Any/stringer/log/typed            64.0B ± 0%     64.0B ± 0%     ~     (all equal)
Any/stringer/log/any              64.0B ± 0%     64.0B ± 0%     ~     (all equal)
Any/stringer/log-go/typed          112B ± 0%      112B ± 0%     ~     (all equal)
Any/stringer/log-go/any            128B ± 0%      128B ± 0%     ~     (all equal)

name                           old allocs/op  new allocs/op  delta
Any/string/field-only/typed        0.00           0.00          ~     (all equal)
Any/string/field-only/any          0.00           0.00          ~     (all equal)
Any/string/log/typed               1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Any/string/log/any                 1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Any/string/log-go/typed            3.00 ± 0%      3.00 ± 0%     ~     (all equal)
Any/string/log-go/any              3.00 ± 0%      3.00 ± 0%     ~     (all equal)
Any/stringer/field-only/typed      0.00           0.00          ~     (all equal)
Any/stringer/field-only/any        0.00           0.00          ~     (all equal)
Any/stringer/log/typed             1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Any/stringer/log/any               1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Any/stringer/log-go/typed          3.00 ± 0%      3.00 ± 0%     ~     (all equal)
Any/stringer/log-go/any            3.00 ± 0%      3.00 ± 0%     ~     (all equal)
```

It causes a regression in "field-only"
which calls the field constructor and discards the result
without using it in a logger.
I believe this is acceptable because that's not a real use case;
we expect the result to be used with a logger.
sywhang pushed a commit that referenced this pull request Aug 2, 2023
This is a prefactor for #1301, #1302, #1304, #1305, #1307, #1308 and #1310.

We're writing various approaches to reduce the stock size and it's
painful to keep copy-pasting the tests between PRs. This was suggested
in @prashantv in #1307.

The tests are mostly based on tests in #1303, but made "more generic",
as #1307 we might want to test across more than just a single type.
It does make the tests a bit harder to setup. Some of the setup is
inconvenient (we duplicate the value in both `typed` and `any` version
of the tests) but hopefully okay to understand. A fully non-duplicated
alternative would likely require something like #1310 itself.

For #1307 in particular a test against interface type would likely be
needed, so adding it here too.

The tests compare two code paths, with the same arguments, one using a
strongly typed method and second using `zap.Any`. We have:
- a simple "create field" case for a baseline
- a "create and log" case for a realistic case (we typically log the fields)
- a "create and log in a goroutine" case for the pathological case
  we're trying to solve for.
- a "create and long in goroutine in a pre-warmed system" that does the
  above, but before tries to affect the starting goroutine stack size
  to provide an realistic example.
  Without this, for any tests with 2+ goroutines, the cost of `zap.Any`
  is not visible, as we always end up expanding the stack even in the
  strongly typed methods.

The test results are:
```
goos: darwin
goarch: arm64
pkg: go.uber.org/zap
BenchmarkAny/string-typ-no-logger               166879518                6.988 ns/op           0 B/op          0 allocs/op
BenchmarkAny/string-typ-no-logger-12            167398297                6.973 ns/op           0 B/op          0 allocs/op
BenchmarkAny/string-any-no-logger               87669631                13.97 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string-any-no-logger-12            86760837                14.11 ns/op            0 B/op          0 allocs/op
BenchmarkAny/string-typ-logger                   3059485               395.5 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string-typ-logger-12                3141176               379.7 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string-any-logger                   2995699               401.3 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string-any-logger-12                3071046               391.1 ns/op            64 B/op          1 allocs/op
BenchmarkAny/string-typ-logger-go                 784323              1351 ns/op             146 B/op          2 allocs/op
BenchmarkAny/string-typ-logger-go-12             2000835               613.9 ns/op            96 B/op          2 allocs/op
BenchmarkAny/string-any-logger-go                 477486              2479 ns/op             117 B/op          2 allocs/op
BenchmarkAny/string-any-logger-go-12             1830955               680.0 ns/op           112 B/op          2 allocs/op
BenchmarkAny/string-typ-logger-go-stack           841566              1328 ns/op              96 B/op          2 allocs/op
BenchmarkAny/string-typ-logger-go-stack-12       2625226               479.6 ns/op            96 B/op          2 allocs/op
BenchmarkAny/string-any-logger-go-stack           486084              2493 ns/op             112 B/op          2 allocs/op
BenchmarkAny/string-any-logger-go-stack-12       2658640               667.9 ns/op           112 B/op          2 allocs/op
BenchmarkAny/stringer-typ-no-logger             147314238                8.034 ns/op           0 B/op          0 allocs/op
BenchmarkAny/stringer-typ-no-logger-12          157857937                7.436 ns/op           0 B/op          0 allocs/op
BenchmarkAny/stringer-any-no-logger             58872349                20.19 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer-any-no-logger-12          60532305                20.27 ns/op            0 B/op          0 allocs/op
BenchmarkAny/stringer-typ-logger                 3094204               411.7 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer-typ-logger-12              3163489               383.7 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer-any-logger                 2981940               427.2 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer-any-logger-12              2777792               394.0 ns/op            64 B/op          1 allocs/op
BenchmarkAny/stringer-typ-logger-go               911761              1335 ns/op              96 B/op          2 allocs/op
BenchmarkAny/stringer-typ-logger-go-12           2006440               605.2 ns/op            96 B/op          2 allocs/op
BenchmarkAny/stringer-any-logger-go               467934              2518 ns/op             112 B/op          2 allocs/op
BenchmarkAny/stringer-any-logger-go-12           1786076               683.1 ns/op           112 B/op          2 allocs/op
BenchmarkAny/stringer-typ-logger-go-stack         855794              1316 ns/op              96 B/op          2 allocs/op
BenchmarkAny/stringer-typ-logger-go-stack-12     2598783               434.5 ns/op            96 B/op          2 allocs/op
BenchmarkAny/stringer-any-logger-go-stack         473282              2474 ns/op             112 B/op          2 allocs/op
BenchmarkAny/stringer-any-logger-go-stack-12     2020183               651.9 ns/op           112 B/op          2 allocs/op
PASS
ok      go.uber.org/zap 53.516s
```

reviews

rename

review

farewell, my friend

dummy
sywhang pushed a commit that referenced this pull request Aug 2, 2023
Yet another attempt at reducing the stack size of zap.Any,
borrowing from #1301, #1303, #1304, #1305, #1307, and #1308.

This approach defines a generic data type for field constructors
of a specific type. This is similar to the lookup map in #1307,
minus the map lookup, the interface match, or reflection.

    type anyFieldC[T any] func(string, T) Field

The generic data type provides a non-generic method
matching the interface:

    interface{ Any(string, any) Field }

**Stack size**:
The stack size of zap.Any following this change is 0xc0 (192 bytes).

    % go build -gcflags -S 2>&1 | grep ^go.uber.org/zap.Any
    go.uber.org/zap.Any STEXT size=5861 args=0x20 locals=0xc0 funcid=0x0 align=0x0

This is just 8 bytes more than #1305,
which is the smallest stack size of all other attempts.

**Allocations**:
Everything appears to get inlined with no heap escapes:

    % go build -gcflags -m 2>&1 |
      grep field.go |
      perl -n -e 'next unless m{^./field.go:(\d+)}; print if ($1 >= 413)' |
      grep 'escapes'
    [no output]

(Line 413 declares anyFieldC)

Besides that, the output of `-m` for the relevant section of code
consists of almost entirely:

    ./field.go:415:6: can inline anyFieldC[go.shape.bool].Any
    ./field.go:415:6: can inline anyFieldC[go.shape.[]bool].Any
    ./field.go:415:6: can inline anyFieldC[go.shape.complex128].Any
    [...]
    ./field.go:415:6: inlining call to anyFieldC[go.shape.complex128].Any
    ./field.go:415:6: inlining call to anyFieldC[go.shape.[]bool].Any
    ./field.go:415:6: inlining call to anyFieldC[go.shape.bool].Any

Followed by:

    ./field.go:428:10: leaking param: key
    ./field.go:428:22: leaking param: value

**Maintainability**:
Unlike some of the other approaches, this variant is more maintainable.
The `zap.Any` function looks roughly the same.
Adding new branches there is obvious, and requires no duplication.

**Performance**:
This is a net improvement against master on BenchmarkAny's
log-go checks that log inside a new goroutine.

```
name                           old time/op    new time/op    delta
Any/string/field-only/typed      25.2ns ± 1%    25.6ns ± 2%     ~     (p=0.460 n=5+5)
Any/string/field-only/any        56.9ns ± 3%    79.4ns ± 0%  +39.55%  (p=0.008 n=5+5)
Any/string/log/typed             1.47µs ± 0%    1.49µs ± 4%   +1.58%  (p=0.016 n=4+5)
Any/string/log/any               1.53µs ± 2%    1.55µs ± 1%   +1.37%  (p=0.016 n=5+5)
Any/string/log-go/typed          5.97µs ± 6%    5.99µs ± 1%     ~     (p=0.151 n=5+5)
Any/string/log-go/any            10.9µs ± 0%     6.2µs ± 0%  -43.32%  (p=0.008 n=5+5)
Any/stringer/field-only/typed    25.3ns ± 1%    25.5ns ± 1%   +1.09%  (p=0.008 n=5+5)
Any/stringer/field-only/any      85.5ns ± 1%   124.5ns ± 0%  +45.66%  (p=0.008 n=5+5)
Any/stringer/log/typed           1.43µs ± 1%    1.42µs ± 2%     ~     (p=0.175 n=4+5)
Any/stringer/log/any             1.50µs ± 1%    1.56µs ± 6%   +4.20%  (p=0.008 n=5+5)
Any/stringer/log-go/typed        5.94µs ± 0%    5.92µs ± 0%   -0.40%  (p=0.032 n=5+5)
Any/stringer/log-go/any          11.1µs ± 2%     6.3µs ± 0%  -42.93%  (p=0.008 n=5+5)

name                           old alloc/op   new alloc/op   delta
Any/string/field-only/typed       0.00B          0.00B          ~     (all equal)
Any/string/field-only/any         0.00B          0.00B          ~     (all equal)
Any/string/log/typed              64.0B ± 0%     64.0B ± 0%     ~     (all equal)
Any/string/log/any                64.0B ± 0%     64.0B ± 0%     ~     (all equal)
Any/string/log-go/typed            112B ± 0%      112B ± 0%     ~     (all equal)
Any/string/log-go/any              128B ± 0%      128B ± 0%     ~     (all equal)
Any/stringer/field-only/typed     0.00B          0.00B          ~     (all equal)
Any/stringer/field-only/any       0.00B          0.00B          ~     (all equal)
Any/stringer/log/typed            64.0B ± 0%     64.0B ± 0%     ~     (all equal)
Any/stringer/log/any              64.0B ± 0%     64.0B ± 0%     ~     (all equal)
Any/stringer/log-go/typed          112B ± 0%      112B ± 0%     ~     (all equal)
Any/stringer/log-go/any            128B ± 0%      128B ± 0%     ~     (all equal)

name                           old allocs/op  new allocs/op  delta
Any/string/field-only/typed        0.00           0.00          ~     (all equal)
Any/string/field-only/any          0.00           0.00          ~     (all equal)
Any/string/log/typed               1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Any/string/log/any                 1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Any/string/log-go/typed            3.00 ± 0%      3.00 ± 0%     ~     (all equal)
Any/string/log-go/any              3.00 ± 0%      3.00 ± 0%     ~     (all equal)
Any/stringer/field-only/typed      0.00           0.00          ~     (all equal)
Any/stringer/field-only/any        0.00           0.00          ~     (all equal)
Any/stringer/log/typed             1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Any/stringer/log/any               1.00 ± 0%      1.00 ± 0%     ~     (all equal)
Any/stringer/log-go/typed          3.00 ± 0%      3.00 ± 0%     ~     (all equal)
Any/stringer/log-go/any            3.00 ± 0%      3.00 ± 0%     ~     (all equal)
```

It causes a regression in "field-only"
which calls the field constructor and discards the result
without using it in a logger.
I believe this is acceptable because that's not a real use case;
we expect the result to be used with a logger.
@rabbbit
Copy link
Contributor

rabbbit commented Aug 15, 2024

@abhinav do we want to revert this with

The compiler in Go 1.23 can now overlap the stack frame slots of local variables accessed in disjoint regions of a function, which reduces stack usage for Go applications.

I kinda like your code, though. After the initial 5 minutes of wth, I appreciate the elegance of it, every time.

Edit:

pre your change.

❯ git checkout 50b2db4fdecf067666874bd2e323ae601cf32559
Previous HEAD position was 249507a zap.Any: Reduce stack size with generics (#1310)
HEAD is now at 50b2db4 zap.Any add benchmarks
❯ gotip build -gcflags -S 2>&1 | grep ^go.uber.org/zap.Any
go.uber.org/zap.Any STEXT size=11808 args=0x20 locals=0x128 funcid=0x0 align=0x0

after the fix:

❯ git checkout 249507a8cf0294b844ad52d0707c83997d931205
Previous HEAD position was 50b2db4 zap.Any add benchmarks
HEAD is now at 249507a zap.Any: Reduce stack size with generics (#1310)
❯ gotip build -gcflags -S 2>&1 | grep ^go.uber.org/zap.Any
go.uber.org/zap.Any STEXT size=3712 args=0x20 locals=0xb8 funcid=0x0 align=0x0

❯ gotip version
go version devel go1.24-49adb8e4 2024-08-14 16:52:05 -0800 darwin/arm64

So your fix is still 110 bytes smaller than it would be if we reverted.

@abhinav
Copy link
Collaborator Author

abhinav commented Aug 15, 2024

I think it makes sense to keep it as-is until Go 1.24 is released, since Go 1.22 is still supported by Zap.
Once 1.23 is the minimum supported version, we can revert this if we want to.
(I don't have strong feelings on that. The code is not overly complex so the 110 bytes may be worth it.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants