Optimize encoding of dictionaries #743

turbolent · 2021-03-30T19:22:07Z

Issue To Be Solved

The encoding of dictionaries to CBOR values can be optimized.

Suggestion

Switch from CBOR map to CBOR array

Definition of Done

Optimize dictionary encoding
- Handle deferred keys
Change decoding to only support this current format (i.e. remove old backwards-compatibility decoding code)
Tests
- Encode new format
- Decode new format
- Decode old format, incl. round-trip, i.e. encoding should match new encoding

fxamacker · 2021-03-31T16:01:44Z

⚠️ NOTE: "faster" refers to time delta in `benchstat`, so "50% faster" here means 2x speedup.

Both encoding and decoding might be faster than we expected. With additional changes proposed here, speedups compared to master branch can be up to:

encoding 31-32%
decoding 23-25%

Memory use also improved. It wasn't my goal but stored data size might be reduced more than discussed.

During the phone call with @turbolent and @SupunS, changes I proposed were based largely on code I wrote on Monday morning (which assumed backward compatibility was required).

This task proposed by @turbolent during our call is a game-changer for optimizing the encoding of dictionaries:

Change decoding to only support this current format (i.e. remove old backwards-compatibility decoding code)

Proposed Changes

In addition to changing CBOR maps to CBOR arrays, DictionaryValue can be encoded as

[]interface{
encodedDictionaryValueKeysFieldKey:    keys,
encodedDictionaryValueEntriesFieldKey: values,
}
keys is []interface{}, comprised of DictionaryValue.Keys (same as before), and
values is []interface{}, comprised of DictionaryValue.Entries value.

Number of values is either the same as number of keys, or 0 if deferred.

This additional change is based on some assumptions I need to confirm with the team.

Assumptions to Confirm

DictionaryValue.Keys and DictionaryValue.Entries are always in sync.
Encoded entries are either one-to-one match with keys, or empty because values are deferred.

Speed and Memory Comparisons

New (still preliminary) benchmark comparisons on linux_amd64 using Go 1.15.10 look promising.

Encoding Speed and Memory Comparisons:

name                  old time/op    new time/op    delta
EncodingSmallValue-4     201µs ± 0%     139µs ± 0%  -31.15%  (p=0.000 n=10+10)
EncodingLargeValue-4    19.5ms ± 1%    13.3ms ± 0%  -32.16%  (p=0.000 n=9+10)

name                  old alloc/op   new alloc/op   delta
EncodingSmallValue-4    57.6kB ± 0%    36.2kB ± 0%  -37.13%  (p=0.000 n=10+9)
EncodingLargeValue-4    4.68MB ± 0%    3.55MB ± 0%  -24.04%  (p=0.000 n=8+10)

name                  old allocs/op  new allocs/op  delta
EncodingSmallValue-4     1.10k ± 0%     0.80k ± 0%  -27.46%  (p=0.000 n=10+10)
EncodingLargeValue-4     92.1k ± 0%     70.8k ± 0%  -23.12%  (p=0.000 n=9+10)

Decoding Speed and Memory Comparisons:

name                  old time/op    new time/op    delta
DecodingSmallValue-4     193µs ± 0%     147µs ± 0%  -23.88%  (p=0.000 n=10+9)
DecodingLargeValue-4    20.0ms ± 0%    14.8ms ± 1%  -25.68%  (p=0.000 n=10+10)

name                  old alloc/op   new alloc/op   delta
DecodingSmallValue-4    73.9kB ± 0%    60.7kB ± 0%  -17.84%  (p=0.000 n=10+10)
DecodingLargeValue-4    7.49MB ± 0%    5.75MB ± 0%  -23.23%  (p=0.000 n=10+10)

name                  old allocs/op  new allocs/op  delta
DecodingSmallValue-4     1.57k ± 0%     1.36k ± 0%  -13.36%  (p=0.000 n=10+10)
DecodingLargeValue-4      143k ± 0%      122k ± 0%  -14.63%  (p=0.000 n=10+10)

Caveats

The new speed includes changes I didn't propose during the call, so I need to confirm my assumptions with @turbolent before opening a PR that includes this change.

I used the same benchmark data as #738 so using production data could yield different results.

Switch from CBOR map to CBOR array to improve speed and preserve ordering. Remove key strings from encoding. Remove old backwards-compatibility decoding code. Add tests, including round-trip for decoding old format and encoding new format. Closes onflow#743

turbolent added Feature Feedback labels Mar 30, 2021

turbolent mentioned this issue Mar 30, 2021

Optimize CBOR representation of data in Cadence storage layer #738

Closed

turbolent added Improvement and removed Feature Feedback labels Mar 30, 2021

turbolent assigned fxamacker Mar 30, 2021

fxamacker mentioned this issue Apr 1, 2021

Optimize encoding of dictionaries #752

Merged

6 tasks

turbolent closed this as completed in 792541a Apr 13, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize encoding of dictionaries #743

Optimize encoding of dictionaries #743

turbolent commented Mar 30, 2021 •

edited

Loading

fxamacker commented Mar 31, 2021 •

edited

Loading

Optimize encoding of dictionaries #743

Optimize encoding of dictionaries #743

Comments

turbolent commented Mar 30, 2021 • edited Loading

Issue To Be Solved

Suggestion

Definition of Done

fxamacker commented Mar 31, 2021 • edited Loading

⚠️ NOTE: "faster" refers to time delta in benchstat, so "50% faster" here means 2x speedup.

Proposed Changes

Assumptions to Confirm

Speed and Memory Comparisons

Caveats

turbolent commented Mar 30, 2021 •

edited

Loading

fxamacker commented Mar 31, 2021 •

edited

Loading

⚠️ NOTE: "faster" refers to time delta in `benchstat`, so "50% faster" here means 2x speedup.