Move benchmark-unrelated code out of the hot path #18

rockdaboot · 2023-12-21T09:30:02Z

As the title says. This should also make #16 obsolete.
Every string instantiation in key(i) caused a heap allocation, which skewed the benchmarks.

[UPDATE]
In the meantime I added more commits that move all code not directly related to benchmarking set/get out of the hot path. For example, the creation of keys and values, including random number generation, should not count toward the measurements of the cache API (the exception is ser/deser code, which some cache APIs require while others don't need it or do it internally).
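For illustration, here is a minimal sketch of the idea, not the code from this PR. The helpers `makeKeys` and `makeValues`, the `"key-"` prefix, and the plain `map[string]int` are assumptions made for the example. Keys and random values are built before `b.ResetTimer()`, so the timed loop contains only the set operation and the index bookkeeping.

```go
package main

import (
	"fmt"
	"math/rand"
	"strconv"
	"testing"
)

const maxEntryCount = 10000

// makeKeys (hypothetical helper) builds every key once, up front, so the
// timed loop performs no string construction or heap allocation for keys.
func makeKeys(n int) []string {
	keys := make([]string, n)
	for i := range keys {
		keys[i] = "key-" + strconv.Itoa(i) // assumed key format
	}
	return keys
}

// makeValues (hypothetical helper) draws all random values in advance, so
// the random number generator never runs inside the measurement.
func makeValues(n int, seed int64) []int {
	r := rand.New(rand.NewSource(seed))
	vals := make([]int, n)
	for i := range vals {
		vals[i] = r.Intn(1 << 20)
	}
	return vals
}

// benchmarkMapSet times only the map write; all setup happens before
// ResetTimer, which also resets the allocation counters.
func benchmarkMapSet(b *testing.B) {
	keys := makeKeys(maxEntryCount)
	vals := makeValues(maxEntryCount, 1)
	m := make(map[string]int, maxEntryCount)

	b.ReportAllocs()
	b.ResetTimer()

	id := 0
	for i := 0; i < b.N; i++ {
		m[keys[id]] = vals[id]
		id++
		if id >= maxEntryCount {
			id = 0
		}
	}
}

func main() {
	// testing.Init is needed when calling testing.Benchmark outside "go test".
	testing.Init()
	res := testing.Benchmark(benchmarkMapSet)
	fmt.Println(res.String(), res.MemString())
}
```

In a real `_test.go` file the same body would live in a `func BenchmarkXxx(b *testing.B)` and be run with `go test -bench=. -benchmem`; the `main` wrapper here only makes the sketch runnable on its own.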

Before

$ go version
go version go1.21.4 linux/amd64
goos: linux
goarch: amd64
pkg: github.com/allegro/bigcache-bench
cpu: 12th Gen Intel(R) Core(TM) i7-12800H
BenchmarkMapSetForStruct-20                         4574           1050507 ns/op          697285 B/op      19746 allocs/op
BenchmarkSyncMapSetForStruct-20                     1512           3062503 ns/op         1828855 B/op      69662 allocs/op
BenchmarkOracamanMapSetForStruct-20                 3039           1593436 ns/op         1216733 B/op      20144 allocs/op
BenchmarkFreeCacheSetForStruct-20                   1172           4117920 ns/op         6987192 B/op      40287 allocs/op
BenchmarkBigCacheSetForStruct-20                    1362           3640569 ns/op         3771568 B/op      42459 allocs/op
BenchmarkMapSetForBytes-20                          2886           1722816 ns/op         2097688 B/op      29749 allocs/op
BenchmarkSyncMapSetForBytes-20                      1399           3438202 ns/op         3112061 B/op      79919 allocs/op
BenchmarkOracamanMapSetForBytes-20                  2138           2121852 ns/op         2943254 B/op      30175 allocs/op
BenchmarkFreeCacheSetForBytes-20                    1424           3446344 ns/op         7867129 B/op      30537 allocs/op
BenchmarkBigCacheSetForBytes-20                     1602           3034032 ns/op         4651572 B/op      32713 allocs/op
BenchmarkMapGetForStruct-20                     38352931               111.5 ns/op            23 B/op          1 allocs/op
BenchmarkSyncMapGetForStruct-20                 35730880               133.3 ns/op            23 B/op          1 allocs/op
BenchmarkOracamanMapGetForStruct-20             36561675               130.1 ns/op            23 B/op          1 allocs/op
BenchmarkFreeCacheGetForStruct-20                9007900               525.5 ns/op           271 B/op          8 allocs/op
BenchmarkBigCacheGetForStruct-20                 9274410               515.5 ns/op           287 B/op          9 allocs/op
BenchmarkMapGetForBytes-20                      42340047               111.4 ns/op            23 B/op          1 allocs/op
BenchmarkSyncMapGetForBytes-20                  35446474               135.0 ns/op            23 B/op          1 allocs/op
BenchmarkOracamanMapGetForBytes-20              34858657               135.3 ns/op            23 B/op          1 allocs/op
BenchmarkFreeCacheGetForBytes-20                22743499               207.3 ns/op           135 B/op          2 allocs/op
BenchmarkBigCacheGetForBytes-20                 25927096               186.1 ns/op           151 B/op          3 allocs/op
BenchmarkSyncMapSetParallelForStruct-20          8432505               687.5 ns/op            72 B/op          5 allocs/op
BenchmarkOracamanMapSetParallelForStruct-20     113340970               42.98 ns/op           30 B/op          2 allocs/op
BenchmarkFreeCacheSetParallelForStruct-20       84426818                54.65 ns/op           53 B/op          4 allocs/op
BenchmarkBigCacheSetParallelForStruct-20        62022183                74.63 ns/op          222 B/op          4 allocs/op
BenchmarkSyncMapSetParallelForBytes-20           7928991               647.5 ns/op           200 B/op          6 allocs/op
BenchmarkOracamanMapSetParallelForBytes-20      88186278                55.89 ns/op          143 B/op          3 allocs/op
BenchmarkFreeCacheSetParallelForBytes-20        84665589                52.74 ns/op          141 B/op          3 allocs/op
BenchmarkBigCacheSetParallelForBytes-20         57134434                90.22 ns/op          506 B/op          3 allocs/op
BenchmarkSyncMapGetParallelForStruct-20         189450904               25.36 ns/op           23 B/op          1 allocs/op
BenchmarkOracamanMapGetParallelForStruct-20     186230124               26.05 ns/op           23 B/op          1 allocs/op
BenchmarkFreeCacheGetParallelForStruct-20       45472926                99.74 ns/op          271 B/op          8 allocs/op
BenchmarkBigCacheGetParallelForStruct-20        43793809               109.5 ns/op           288 B/op          9 allocs/op
BenchmarkSyncMapGetParallelForBytes-20          206990751               23.17 ns/op           23 B/op          1 allocs/op
BenchmarkOracamanMapGetParallelForBytes-20      190093576               25.74 ns/op           23 B/op          1 allocs/op
BenchmarkFreeCacheGetParallelForBytes-20        91603290                48.53 ns/op          135 B/op          2 allocs/op
BenchmarkBigCacheGetParallelForBytes-20         90818000                47.96 ns/op          152 B/op          3 allocs/op
PASS
ok      github.com/allegro/bigcache-bench       194.972s

After

$ go version
go version go1.21.4 linux/amd64
$ go test -bench=. -benchmem -benchtime=4s ./... -timeout 30m
goos: linux
goarch: amd64
pkg: github.com/allegro/bigcache-bench
cpu: 12th Gen Intel(R) Core(TM) i7-12800H
BenchmarkMapSetForStruct-20                     738278595                6.465 ns/op           0 B/op          0 allocs/op
BenchmarkSyncMapSetForStruct-20                 73809016                68.92 ns/op           40 B/op          3 allocs/op
BenchmarkOracamanMapSetForStruct-20             179029155               27.51 ns/op            0 B/op          0 allocs/op
BenchmarkFreeCacheSetForStruct-20               58267056                71.22 ns/op            8 B/op          1 allocs/op
BenchmarkBigCacheSetForStruct-20                43320409               100.3 ns/op           128 B/op          1 allocs/op
BenchmarkMapSetForBytes-20                      145805667               32.41 ns/op          112 B/op          1 allocs/op
BenchmarkSyncMapSetForBytes-20                  41185538               112.0 ns/op           168 B/op          4 allocs/op
BenchmarkOracamanMapSetForBytes-20              73706602                61.97 ns/op          112 B/op          1 allocs/op
BenchmarkFreeCacheSetForBytes-20                45121453               100.9 ns/op           112 B/op          1 allocs/op
BenchmarkBigCacheSetForBytes-20                 29798607               143.3 ns/op           463 B/op          1 allocs/op
BenchmarkMapGetForStruct-20                     917468594                5.063 ns/op           0 B/op          0 allocs/op
BenchmarkSyncMapGetForStruct-20                 350010081               12.55 ns/op            0 B/op          0 allocs/op
BenchmarkOracamanMapGetForStruct-20             229678330               20.87 ns/op            0 B/op          0 allocs/op
BenchmarkFreeCacheGetForStruct-20               51396169                81.19 ns/op           32 B/op          2 allocs/op
BenchmarkBigCacheGetForStruct-20                81617512                58.30 ns/op           32 B/op          2 allocs/op
BenchmarkMapGetForBytes-20                      816561459                5.388 ns/op           0 B/op          0 allocs/op
BenchmarkSyncMapGetForBytes-20                  387219021               14.09 ns/op            0 B/op          0 allocs/op
BenchmarkOracamanMapGetForBytes-20              239492982               20.39 ns/op            0 B/op          0 allocs/op
BenchmarkFreeCacheGetForBytes-20                46738532               111.6 ns/op           136 B/op          2 allocs/op
BenchmarkBigCacheGetForBytes-20                 63684832                81.63 ns/op          136 B/op          2 allocs/op
BenchmarkSyncMapSetParallelForStruct-20         18331375               274.9 ns/op            41 B/op          2 allocs/op
BenchmarkOracamanMapSetParallelForStruct-20     169913868               27.97 ns/op            0 B/op          0 allocs/op
BenchmarkFreeCacheSetParallelForStruct-20       144847314               31.37 ns/op            8 B/op          1 allocs/op
BenchmarkBigCacheSetParallelForStruct-20        96001162                56.98 ns/op          192 B/op          1 allocs/op
BenchmarkSyncMapSetParallelForBytes-20          13546615               351.2 ns/op           170 B/op          4 allocs/op
BenchmarkOracamanMapSetParallelForBytes-20      121414638               37.81 ns/op          112 B/op          1 allocs/op
BenchmarkFreeCacheSetParallelForBytes-20        131029023               36.91 ns/op          112 B/op          1 allocs/op
BenchmarkBigCacheSetParallelForBytes-20         68330320                75.51 ns/op          482 B/op          1 allocs/op
BenchmarkSyncMapGetParallelForStruct-20         1000000000               3.839 ns/op           0 B/op          0 allocs/op
BenchmarkOracamanMapGetParallelForStruct-20     506150209                9.661 ns/op           0 B/op          0 allocs/op
BenchmarkFreeCacheGetParallelForStruct-20       222971008               24.01 ns/op           32 B/op          2 allocs/op
BenchmarkBigCacheGetParallelForStruct-20        348548276               14.01 ns/op           32 B/op          2 allocs/op
BenchmarkSyncMapGetParallelForBytes-20          1000000000               3.824 ns/op           0 B/op          0 allocs/op
BenchmarkOracamanMapGetParallelForBytes-20      495279145                9.952 ns/op           0 B/op          0 allocs/op
BenchmarkFreeCacheGetParallelForBytes-20        172124143               28.57 ns/op          136 B/op          2 allocs/op
BenchmarkBigCacheGetParallelForBytes-20         203385400               24.08 ns/op          136 B/op          2 allocs/op
PASS
ok      github.com/allegro/bigcache-bench       216.286s

cristaloleg · 2024-01-19T08:31:40Z

caches_bench_test.go

@@ -17,13 +19,13 @@ const maxEntrySize = 256
 const maxEntryCount = 10000

 type myStruct struct {
-	Id int `json:"id"`
+	Id int


Can you change to ID please?

No problem. I'll do it when I'm back at my laptop.

cristaloleg · 2024-01-19T08:32:12Z

caches_bench_test.go

-		m := make(map[string]T, maxEntryCount)
-		for n := 0; n < maxEntryCount; n++ {
-			m[key(n)] = cs.Get(n)
+		if id >= maxEntryCount {


id = (id + 1) % maxEntryCount ?

You know that modulo is an IDIV operation which is one of the slowest CPU instructions on x86 (and other architectures as well)!? It can take up to 100 CPU cycles or so, while regular instructions can be down to 0.2 cycles. I try to avoid % and / in tight loops if possible.

I know, but I bet that this op will not dominate the benchmark. Also, the main idea of any benchmark is to be fair: if every benchmark func does the modulo, it will not change the benchmark result (the raw numbers might differ, but the ratio will remain the same 🤷).

Whatever, feel free to leave it as is. I just proposed a shorter version 😉
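To make the two variants from this thread concrete, here is a small sketch. The diff above only shows the `if id >= maxEntryCount {` line, so the body of the branch (resetting to 0) and the function names `nextCompare` and `nextModulo` are assumptions for illustration. Both functions advance an index and wrap it at `maxEntryCount`; whether they differ measurably in speed depends on the compiler and CPU, which this sketch does not try to settle.

```go
package main

import "fmt"

const maxEntryCount = 10000

// nextCompare advances the index and wraps with a compare-and-reset,
// avoiding a division in the loop (assumed shape of the PR's variant).
func nextCompare(id int) int {
	id++
	if id >= maxEntryCount {
		id = 0
	}
	return id
}

// nextModulo is the shorter variant proposed in the review.
func nextModulo(id int) int {
	return (id + 1) % maxEntryCount
}

func main() {
	// Both variants should agree for every valid index, including the wrap.
	for id := 0; id < maxEntryCount; id++ {
		if nextCompare(id) != nextModulo(id) {
			fmt.Println("mismatch at", id)
			return
		}
	}
	fmt.Println("both variants agree for all indices in [0, maxEntryCount)")
}
```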

rockdaboot added 4 commits December 21, 2023 10:17

Create/allocate keys outside the hot path

9493a69

Simplify and speed up ToBytes() and Parse()

b54f619

Create parallel keys in advance

f084a7d

Move rand() out of benchmark loop

85bcdfa

rockdaboot changed the title ~~Create/allocate keys outside the hot path~~ Move benchmark-unrelated code out of the hot path Dec 23, 2023

cristaloleg reviewed Jan 19, 2024
