Performance benchmarks and alternatives to map lookups #19
Comments
Hey, thanks! I was able to reproduce the 5x improvement in encoding. However, I still don't get how you speed up decoding with arrays. Can you share the code (PR, Gist, or repo) that shows how you do it? So far I don't see a good way to switch to array-based encoding/decoding without 1) requiring users to specify values contiguously, starting from 0, and 2) reading the numeric value from the var/const declaration AST statement. So this looks like a no-go. Feel free to commandeer this or add your own PR like this, or leave comments in the PR by 2024-04-09.
I was thinking about how I use enums. I always use them to optimize two things: memory footprint and speed of comparisons. In my code I needed only uint8 and fast encoding/decoding to database, JSON, and text. So I am thinking about extending stringer into an enumer that adds that functionality and wraps in Source code for stringer is interesting. The author chose an arbitrary cutoff: with more than 10 contiguous intervals it uses a map; otherwise it compares the integer against each interval and indexes into that interval's slice of the string.
// pseudo code
import "errors"

type enum struct{ v uint8 } // assumed definition, matching the enum{1} literals below

var A, B, C = enum{}, enum{1}, enum{2}
var enumStrings = []string{"a", "b", "c"}
var enumAll = []enum{A, B, C}

// stringer uses a single string plus an index table; performance might be slightly slower due to the double lookup
var enumStringerString = "abc"
var enumStringerIndexes = []int{0, 1, 2, 3}

var ErrBadValue = errors.New("bad enum value") // assumed definition

func (e *enum) UnmarshalText(value []byte) error {
	// TODO optimize len(value) == 0 => set Undefined if enabled
	// TODO fast path len(value) > longest enum value => ErrBadValue
	// TODO optimize skip Undefined value and start from 1
	// NOTE s := string(value); and for ... { if enumStrings[i] == s ... } is slower than converting on every iteration
	for i := 0; i < len(enumStrings); i++ {
		// if enumStringerString[enumStringerIndexes[i]:enumStringerIndexes[i+1]] == string(value) {
		if enumStrings[i] == string(value) {
			*e = enumAll[i]
			return nil
		}
	}
	return ErrBadValue
}
EDIT: I am not sure if it could be made significantly faster than
I could not reproduce this. In my benchmarks, this solution (looping over an array) leads to slightly faster encoding but slower decoding.
From my previous experience with encoding/decoding enums, for large enum sets (256 values) a map is significantly better than a loop. It literally becomes O(1) vs O(N).
Let's keep the map for now. But of course feel free to open a PR that beats the benchmarks. Ideally, keep in mind that the code should be minimal.
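The two decoding strategies being compared can be sketched side by side. This is a minimal illustration, not the project's generated code: the `Color` type, its values, and the helper names `decodeMap` and `decodeLoop` are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Color is a hypothetical enum used only for illustration.
type Color struct{ v uint8 }

var (
	Undefined = Color{}
	Red       = Color{1}
	Green     = Color{2}
	Blue      = Color{3}
)

var ErrBadValue = errors.New("bad enum value")

// Parallel slices for the loop-based decoder.
var colorNames = []string{"", "red", "green", "blue"}
var colorValues = []Color{Undefined, Red, Green, Blue}

// Lookup table for the map-based decoder.
var colorByName = map[string]Color{"": Undefined, "red": Red, "green": Green, "blue": Blue}

// decodeMap does a single hash lookup, O(1) in the number of enum values.
func decodeMap(text []byte) (Color, error) {
	if c, ok := colorByName[string(text)]; ok {
		return c, nil
	}
	return Color{}, ErrBadValue
}

// decodeLoop scans the names linearly, O(N) in the number of enum values.
func decodeLoop(text []byte) (Color, error) {
	for i, name := range colorNames {
		if name == string(text) {
			return colorValues[i], nil
		}
	}
	return Color{}, ErrBadValue
}

func main() {
	// Both decoders should agree on every input, valid or not.
	for _, in := range []string{"red", "blue", "", "zabc"} {
		m, errM := decodeMap([]byte(in))
		l, errL := decodeLoop([]byte(in))
		fmt.Printf("%q same value: %v, same error: %v\n", in, m == l, errM == errL)
	}
}
```

Which one wins is expected to depend on the number of values, so both would need to be benchmarked at the enum sizes that matter.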
Just to clarify my point.
The fastest by an 8-9 ns margin for 5 valid values without other optimizations is Using
This is my follow-up from r/golang on Reddit: I did some benchmarking.
Each unmarshal benchmark iteration parses all 5 valid values (four that are 4-6 characters long and one <empty>) and one bad value (4 characters long), i.e. abcd, bcdef, cdefg, defg, <empty>, and zabc. Similarly, each marshal benchmark iteration marshals all 5 values.
A map lookup is significantly slower than a loop for 5 items. Converting the bytes to a string and comparing is faster than comparing bytes with bytes.Equal, because bytes.Equal converts both arguments to strings anyway.
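The two comparison styles mentioned above look roughly like this. It is a sketch with hypothetical helper names (`equalViaString`, `equalViaBytes`). As far as I know, the standard Go compiler does not allocate for a `string(value)` conversion that is used only in a comparison, which is why the conversion can be repeated on every iteration.

```go
package main

import (
	"bytes"
	"fmt"
)

// equalViaString converts the input once per comparison and compares
// against a name that is already stored as a string.
func equalViaString(value []byte, name string) bool {
	return string(value) == name
}

// equalViaBytes compares two byte slices; the names would have to be
// stored as []byte for this variant.
func equalViaBytes(value, name []byte) bool {
	return bytes.Equal(value, name)
}

func main() {
	value := []byte("bcdef")
	fmt.Println(equalViaString(value, "bcdef"), equalViaBytes(value, []byte("bcdef")))
	fmt.Println(equalViaString(value, "zabc"), equalViaBytes(value, []byte("zabc")))
}
```

Both variants return the same result for the same inputs, so any speed difference between them has to be measured with a benchmark.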
An array index is 30x faster than a map lookup. Map lookups in the MarshalText() and String() functions are even 40+% slower than in UnmarshalText.
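Array-indexed encoding could look like the sketch below. It assumes the enum values are contiguous and start from 0, which is the constraint raised earlier in this thread; the `Color` type and `colorNames` table are hypothetical and are not the project's generated code.

```go
package main

import (
	"errors"
	"fmt"
)

// Color is a hypothetical enum whose values are contiguous from 0.
type Color struct{ v uint8 }

var (
	Undefined = Color{}
	Red       = Color{1}
	Green     = Color{2}
	Blue      = Color{3}
)

var ErrBadValue = errors.New("bad enum value")

// colorNames is indexed directly by the numeric value of the enum.
var colorNames = [...]string{"", "red", "green", "blue"}

// String returns the name by array index, or "" for unknown values.
func (c Color) String() string {
	if int(c.v) < len(colorNames) {
		return colorNames[c.v]
	}
	return ""
}

// MarshalText encodes by array index and rejects out-of-range values.
func (c Color) MarshalText() ([]byte, error) {
	if int(c.v) >= len(colorNames) {
		return nil, ErrBadValue
	}
	return []byte(colorNames[c.v]), nil
}

func main() {
	text, err := Green.MarshalText()
	fmt.Printf("%q %v %q\n", text, err, Blue.String())
}
```

The bounds check replaces the "not found" branch of a map lookup. If values were sparse or did not start at 0, the table would need padding or an offset, which is where this approach stops being simple.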