-
Notifications
You must be signed in to change notification settings - Fork 138
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize encoding of composites #761
Optimize encoding of composites #761
Conversation
Switch from CBOR map to CBOR array to improve speed and preserve ordering. Remove unnecessary field sorting when decoding. Remove old backwards-compatibility decoding code (decoding type ID). Add tests, including round-trip for decoding old format and encoding new format. Closes onflow#744
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Just one comment regarding the encoding of fields
runtime/interpreter/encode.go
Outdated
fieldPairs := v.Fields.pairs | ||
fieldLen := len(fieldPairs) | ||
|
||
// Sort field names lexicographically. | ||
fieldNames := make([]string, fieldLen) | ||
|
||
index := 0 | ||
for name := range fieldPairs { | ||
fieldNames[index] = name | ||
index++ | ||
} | ||
|
||
for pair := v.Fields.Oldest(); pair != nil; pair = pair.Next() { | ||
fieldName := pair.Key | ||
value := pair.Value | ||
sort.Strings(fieldNames) | ||
|
||
// Create fields (key and value) array, sorted by field name. | ||
fields := make(cborArray, fieldLen*2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can just keep the existing iteration over the ordered map, because that already results in a deterministic encoding. Sorting the fields also works, but is not necessary I think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried the unsorted approach (it's faster and cleaner) but it causes this test to fail:
TestInterpretCompositeValueFieldEncodingOrder
in interpreter_test.go.
I strongly prefer your suggestion and am hoping the test might be obsolete or unneeded.
This is the code block incorporating the faster and cleaner code you suggested:
fields := make(cborArray, v.Fields.Len()*2)
i := 0
for pair := v.Fields.Oldest(); pair != nil; pair = pair.Next() {
fieldName := pair.Key
value := pair.Value
valuePath := append(path[:], fieldName)
prepared, err := e.prepare(value, valuePath, deferrals)
if err != nil {
return nil, err
}
fields[i], fields[i+1] = fieldName, prepared
i += 2
}
I suspect TestInterpretCompositeValueFieldEncodingOrder
tests if encoding composites with entries of different orderings is deterministic.
Test expects:
- CBOR data ["a", 1, "b", 2, "c", 3] when encoding composite with entries {"a": 1, "b": 2, "c": 3}.
- CBOR data ["a", 1, "b", 2, "c", 3] when encoding composite with entries {"b": 2, "c": 3, "a": 1}.
Without sorting, this implementation produces:
- CBOR data ["a", 1, "b", 2, "c", 3] when encoding composite with entries {"a": 1, "b": 2, "c": 3}.
- CBOR data ["b", 2, "c", 3, "a", 1] when encoding composite with entries {"b": 2, "c": 3, "a": 1}.
Any thoughts or suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test checks that no matter what the order of initialization is, the encoding is always the same. That's not required though, we only have to ensure the encoding order is always deterministic, and the ordered map for the fields ensures that.
I've adjusted the test case in f2a268b to prepare the expected encodings for all permutations of initializing the fields
runtime/interpreter/encode.go
Outdated
|
||
valuePath := append(path[:], fieldName) | ||
|
||
prepared, err := e.prepare(value, valuePath, deferrals) | ||
if err != nil { | ||
return nil, err | ||
} | ||
fields[fieldName] = prepared | ||
|
||
fields[i*2], fields[i*2+1] = fieldName, prepared |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👌
// Decode all fields in lexicographic order | ||
|
||
sort.Strings(fieldNames) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's great we don't need this anymore 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've switched sorting to iterating over the ordered map of fields in f2a268b.
I enabled CI for feature branches, including from forks, CI ran, and everything is green 👏
⛵
Closes #744
Updates #738
Description
Changes
Benchmark comparisons
benchstat
, so "50% faster" here means 2x speedup.Encoding speed improved by 44.35% and decoding speed improved by 34.56% for a 776,155 byte value from production data that contains both composites and dictionaries. Memory use also improved. Encoded/decoded data includes Cadence
Value
types that have not yet been optimized.Speed gains were only 15-27% for encoding and 16-24% for decoding for smaller production data that did not include dictionaries. However, these will gain more from optimizing remaining
Value
types (outside the scope of this PR).Click to expand:
Big Value with Composites and Dictionaries 🚀🚀
Encoding Comparisons
Decoding Comparisons
Values with Composites and No Dictionaries 🚀
Encoding Comparisons
Decoding Comparisons
Benchmarks used Go 1.15.10 on linux_amd64 using production data.
Special thanks to @turbolent for the state decoding tool in #759 and @ramtinms for production data in jsonl!
For contributor use:
feature/storage-optimizations
branchFiles changed
in the Github PR explorer