Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shrank EncodedLevel to speed up step_in/step_out. #113

Merged
merged 1 commit into from
Jan 8, 2021

Conversation

zslayton
Copy link
Contributor

Description of changes:

The EncodedLevel struct is just large enough that variations of it (like Option<EncodedLevel> and Result<EncodedLevel, _>) can cross LLVM's threshold for using memcpy to move values around. This shows up in profiling as substantial overhead in the step_in and step_out functions, which move instances of EncodedLevel to and from a Vec of container levels.

In a particularly deeply nested test file (a single struct with ~800 levels of nested child structs), this caused the reader to be painfully slow. A 15MB file took ~230ms to read through with next()/step_in()/step_out(), loading each scalar value encountered.

This PR shrinks the EncodedLevel struct by:

  • Replacing a number of usize offsets (8 bytes apiece on x86_64) with u8 lengths from which offsets can be calculated if necessary.
  • Replacing the Vec of annotations on each EncodedLevel with a common Vec that lives on CursorState. Each EncodedLevel now tracks the number of annotations it has pushed onto that communal Vec, allowing the reader to use a single Vec/allocation across the entire stream. This dropped the size of EncodedLevel by a further 23 bytes.

Performance test

15MB binary Ion test file containing a single struct with 773 levels of nested values.

Before: 230ms
After: 125ms (-45.65%)

Memory layout

Before

print-type-size type: `binary::cursor::EncodedValue`: 120 bytes, alignment: 8 bytes
print-type-size     field `.index_at_depth`: 8 bytes
print-type-size     field `.field_id`: 16 bytes
print-type-size     field `.annotations`: 24 bytes
print-type-size     field `.parent_index`: 16 bytes
print-type-size     field `.field_id_offset`: 8 bytes
print-type-size     field `.annotations_offset`: 8 bytes
print-type-size     field `.header_offset`: 8 bytes
print-type-size     field `.value_offset`: 8 bytes
print-type-size     field `.value_length`: 8 bytes
print-type-size     field `.value_end`: 8 bytes
print-type-size     field `.ion_type`: 1 bytes
print-type-size     field `.header`: 3 bytes
print-type-size     field `.is_null`: 1 bytes
print-type-size     end padding: 3 bytes
print-type-size type: 

Note that depending on layout/alignment a size of 120 bytes means that, Option<EncodedValue> and Result<EncodedValue, _> can take 128 bytes even though they only add a single discriminator byte.

After

print-type-size type: `binary::cursor::EncodedValue`: 56 bytes, alignment: 8 bytes
print-type-size     field `.index_at_depth`: 8 bytes
print-type-size     field `.field_id`: 16 bytes
print-type-size     field `.header_offset`: 8 bytes
print-type-size     field `.value_length`: 8 bytes
print-type-size     field `.ion_type`: 1 bytes
print-type-size     field `.header`: 3 bytes
print-type-size     field `.is_null`: 1 bytes
print-type-size     field `.number_of_annotations`: 1 bytes
print-type-size     field `.field_id_length`: 1 bytes
print-type-size     field `.annotations_length`: 1 bytes
print-type-size     field `.header_length`: 1 bytes
print-type-size     end padding: 7 bytes

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@zslayton zslayton merged commit 9c2c742 into master Jan 8, 2021
@zslayton zslayton deleted the optimize-step-in-out branch January 8, 2021 19:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants