Move benchmark-unrelated code out of the hot path #18
@@ -17,13 +19,13 @@ const maxEntrySize = 256
 const maxEntryCount = 10000

 type myStruct struct {
-	Id int `json:"id"`
+	Id int
Can you change it to ID, please?
NP. I'll do it when I'm back at my laptop.
m := make(map[string]T, maxEntryCount)
for n := 0; n < maxEntryCount; n++ {
	m[key(n)] = cs.Get(n)
if id >= maxEntryCount {
id = (id + 1) % maxEntryCount ?
You know that modulo is an IDIV operation, which is one of the slowest CPU instructions on x86 (and on other architectures as well)!? It can take up to 100 CPU cycles or so, while regular instructions can be down to 0.2 cycles. I try to avoid % and / in tight loops where possible.
I know, but I bet that this op will dominate in the benchmark. Plus, the main idea of any benchmark is to be fair: if every benchmark func does the modulo, it will not change the benchmark result (the raw numbers might be different, but the ratio will remain the same 🤷)
Whatever, feel free to leave it as it is, I just proposed a shorter version 😉
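For readers following the thread, here is a minimal sketch of the two wrap-around variants being discussed. It assumes the body of the `if id >= maxEntryCount` branch resets `id` to 0, which the diff fragment does not show; the function names `nextCompare` and `nextModulo` are hypothetical and only serve the illustration. The sketch checks that both variants produce the same sequence and makes no claim about their relative speed.

```go
package main

import "fmt"

const maxEntryCount = 10000

// nextCompare advances id and wraps around with a comparison.
// Assumption: the branch in the PR resets id to 0.
func nextCompare(id int) int {
	id++
	if id >= maxEntryCount {
		id = 0
	}
	return id
}

// nextModulo is the shorter variant proposed in the review.
func nextModulo(id int) int {
	return (id + 1) % maxEntryCount
}

func main() {
	a, b := 0, 0
	// Walk several full cycles and verify both variants stay in lockstep.
	for i := 0; i < 3*maxEntryCount; i++ {
		a, b = nextCompare(a), nextModulo(b)
		if a != b {
			panic(fmt.Sprintf("variants diverged at step %d: %d != %d", i, a, b))
		}
	}
	fmt.Println("both variants agree")
}
```

Since both forms are equivalent for non-negative `id`, the choice only affects readability and, possibly, the cost per iteration; whether that cost is visible would have to be measured.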
As the title says. It should also obsolete #16.
Every string instantiation in key(i) caused a heap allocation, which tainted the benchmarks.
[UPDATE]
Meanwhile, I added some more commits to move all code that is not directly related to benchmarking set/get out of the hot path. E.g., the creation of keys and values, including random number generation, should not add to the measurements of the cache API (except for the ser/deser code that some cache APIs require, while others don't require it or do it internally).
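The idea described above can be sketched roughly as follows. This is not the PR's actual code: `key` is a hypothetical stand-in for the repository's helper, `precomputeKeys` and `benchmarkMapSet` are names invented for the illustration, and a plain Go map stands in for the cache under test. Keys are built before `b.ResetTimer()`, so their allocations are excluded from the measurement.

```go
package main

import (
	"fmt"
	"strconv"
	"testing"
)

const maxEntryCount = 10000

// key is a hypothetical stand-in for the benchmark's key(i) helper.
// Building the string allocates, which is why it is kept out of the timed loop.
func key(i int) string {
	return "key-" + strconv.Itoa(i)
}

// precomputeKeys builds all keys up front.
func precomputeKeys(n int) []string {
	keys := make([]string, n)
	for i := range keys {
		keys[i] = key(i)
	}
	return keys
}

// benchmarkMapSet measures only the set operation on a plain map.
func benchmarkMapSet(b *testing.B) {
	keys := precomputeKeys(maxEntryCount) // setup, not measured
	m := make(map[string]int, maxEntryCount)
	b.ReportAllocs()
	b.ResetTimer() // discard the time and allocations spent on setup

	id := 0
	for n := 0; n < b.N; n++ {
		m[keys[id]] = n
		id++
		if id >= maxEntryCount {
			id = 0
		}
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	res := testing.Benchmark(benchmarkMapSet)
	fmt.Println(res.String(), res.MemString())
}
```

With `b.ReportAllocs()` enabled, the allocations reported per operation give a quick check on whether setup work has leaked back into the timed loop.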
Before
After