Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize encoding of dictionaries #743

Closed
7 tasks
turbolent opened this issue Mar 30, 2021 · 1 comment
Closed
7 tasks

Optimize encoding of dictionaries #743

turbolent opened this issue Mar 30, 2021 · 1 comment
Assignees

Comments

@turbolent
Copy link
Member

turbolent commented Mar 30, 2021

Issue To Be Solved

The encoding of dictionaries to CBOR values can be optimized.

Suggestion

  • Switch from CBOR map to CBOR array

Definition of Done

  • Optimize dictionary encoding
    • Handle deferred keys
  • Change decoding to only support this current format (i.e. remove old backwards-compatibility decoding code)
  • Tests
    • Encode new format
    • Decode new format
    • Decode old format, incl. round-trip, i.e. encoding should match new encoding
@fxamacker
Copy link
Member

fxamacker commented Mar 31, 2021

⚠️ NOTE: "faster" refers to time delta in benchstat, so "50% faster" here means 2x speedup.

Both encoding and decoding might be faster than we expected. With additional changes proposed here, speedups compared to master branch can be up to:

  • encoding 31-32%
  • decoding 23-25%

Memory use also improved. It wasn't my goal but stored data size might be reduced more than discussed.

During the phone call with @turbolent and @SupunS, changes I proposed were based largely on code I wrote on Monday morning (which assumed backward compatibility was required).

This task proposed by @turbolent during our call is a game-changer for optimizing the encoding of dictionaries:

  • Change decoding to only support this current format (i.e. remove old backwards-compatibility decoding code)

Proposed Changes

In addition to changing CBOR maps to CBOR arrays, DictionaryValue can be encoded as

[]interface{
encodedDictionaryValueKeysFieldKey:    keys,
encodedDictionaryValueEntriesFieldKey: values,
}
keys is []interface{}, comprised of DictionaryValue.Keys (same as before), and
values is []interface{}, comprised of DictionaryValue.Entries value.

Number of values is either the same as number of keys, or 0 if deferred.

This additional change is based on some assumptions I need to confirm with the team.

Assumptions to Confirm

  • DictionaryValue.Keys and DictionaryValue.Entries are always in sync.

  • Encoded entries are either one-to-one match with keys, or empty because values are deferred.

Speed and Memory Comparisons

New (still preliminary) benchmark comparisons on linux_amd64 using Go 1.15.10 look promising.

Encoding Speed and Memory Comparisons:

name                  old time/op    new time/op    delta
EncodingSmallValue-4     201µs ± 0%     139µs ± 0%  -31.15%  (p=0.000 n=10+10)
EncodingLargeValue-4    19.5ms ± 1%    13.3ms ± 0%  -32.16%  (p=0.000 n=9+10)

name                  old alloc/op   new alloc/op   delta
EncodingSmallValue-4    57.6kB ± 0%    36.2kB ± 0%  -37.13%  (p=0.000 n=10+9)
EncodingLargeValue-4    4.68MB ± 0%    3.55MB ± 0%  -24.04%  (p=0.000 n=8+10)

name                  old allocs/op  new allocs/op  delta
EncodingSmallValue-4     1.10k ± 0%     0.80k ± 0%  -27.46%  (p=0.000 n=10+10)
EncodingLargeValue-4     92.1k ± 0%     70.8k ± 0%  -23.12%  (p=0.000 n=9+10)

Decoding Speed and Memory Comparisons:

name                  old time/op    new time/op    delta
DecodingSmallValue-4     193µs ± 0%     147µs ± 0%  -23.88%  (p=0.000 n=10+9)
DecodingLargeValue-4    20.0ms ± 0%    14.8ms ± 1%  -25.68%  (p=0.000 n=10+10)

name                  old alloc/op   new alloc/op   delta
DecodingSmallValue-4    73.9kB ± 0%    60.7kB ± 0%  -17.84%  (p=0.000 n=10+10)
DecodingLargeValue-4    7.49MB ± 0%    5.75MB ± 0%  -23.23%  (p=0.000 n=10+10)

name                  old allocs/op  new allocs/op  delta
DecodingSmallValue-4     1.57k ± 0%     1.36k ± 0%  -13.36%  (p=0.000 n=10+10)
DecodingLargeValue-4      143k ± 0%      122k ± 0%  -14.63%  (p=0.000 n=10+10)

Caveats

The new speed includes changes I didn't propose during the call, so I need to confirm my assumptions with @turbolent before opening a PR that includes this change.

I used the same benchmark data as #738 so using production data could yield different results.

fxamacker added a commit to fxamacker/cadence that referenced this issue Apr 1, 2021
Switch from CBOR map to CBOR array to improve speed and preserve
ordering.

Remove key strings from encoding.

Remove old backwards-compatibility decoding code.

Add tests, including round-trip for decoding old format and encoding new
format.

Closes onflow#743
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants