
Reduce allocations, and other optimizations #121

Merged
merged 6 commits into main on Mar 25, 2024
Conversation

benbrandt
Owner

Addresses #115

Since binary search needs random access to the next sections, we have to allocate a Vec at some point.
Rather than doing this for every chunk, this now reuses the same Vec so the allocated memory is reused as often as possible.
Does three things:
1. Memoizes the output of `chunk_size`, since it can get called several times on the same chunk in the course of selecting a chunk.
2. Because of that memoization, we no longer need the `sizes` array, which tried to do the same thing.
3. Levels in the next semantic chunks now also reuse an allocation. This isn't ideal, because this path didn't allocate at all before, but it was necessary to allow a mutable reference. It does, however, set things up to also do binary search on the levels.
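The two allocation-related ideas above (memoizing `chunk_size` and reusing one buffer across chunks) can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code; the `ChunkSizer` struct, its fields, and the character-count size measure are all hypothetical stand-ins.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Hypothetical sketch: memoized size lookup plus a buffer that is
/// reused across chunks so its allocation survives between calls.
struct ChunkSizer {
    // Cache of byte-range -> measured size, so the size of a candidate
    // chunk is computed at most once even if queried repeatedly.
    size_cache: HashMap<Range<usize>, usize>,
    // Reused across chunks rather than reallocated for each one.
    sections: Vec<Range<usize>>,
}

impl ChunkSizer {
    fn new() -> Self {
        Self {
            size_cache: HashMap::new(),
            sections: Vec::new(),
        }
    }

    /// Returns the cached size for `range`, computing it only on the
    /// first call (here measured naively as the number of chars).
    fn chunk_size(&mut self, text: &str, range: Range<usize>) -> usize {
        *self
            .size_cache
            .entry(range.clone())
            .or_insert_with(|| text[range].chars().count())
    }
}
```

Clearing `sections` with `Vec::clear` between chunks keeps its capacity, so repeated binary searches over candidate sections stop paying for fresh allocations.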
@benbrandt benbrandt linked an issue Mar 24, 2024 that may be closed by this pull request

codecov bot commented Mar 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.29%. Comparing base (7d321a2) to head (21e0dd9).

Additional details and impacted files
@@           Coverage Diff            @@
##             main     #121    +/-   ##
========================================
  Coverage   99.28%   99.29%            
========================================
  Files           6        6            
  Lines        1404     1561   +157     
========================================
+ Hits         1394     1550   +156     
- Misses         10       11     +1     


This only affects markdown ranges: they weren't being filtered correctly because the code found the first item of a level across all ranges, rather than only among ranges after the current offset.
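The fix described in that commit can be sketched as a filter that discards ranges starting before the current offset before looking for the first item of a level. This is a hypothetical illustration; the function name and the `(level, range)` tuple representation are assumptions, not the crate's API.

```rust
use std::ops::Range;

/// Hypothetical sketch of the fix: find the first range of `level`
/// starting at or after `offset`. The bug was omitting the offset
/// filter and matching the first item of the level anywhere.
fn first_range_after(
    ranges: &[(usize, Range<usize>)],
    level: usize,
    offset: usize,
) -> Option<Range<usize>> {
    ranges
        .iter()
        // Skip ranges that begin before the current position.
        .filter(|(_, r)| r.start >= offset)
        .find(|(l, _)| *l == level)
        .map(|(_, r)| r.clone())
}
```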
Having inline elements at a higher level than text caused some strange breakpoints.

While an inline element can have a text element inside it, that inner text element should be skipped when necessary. This still allows inline elements to be kept together, but also allows smaller text elements to be pulled in.

This also makes sure that higher semantic levels are preferred when they are shorter than a lower level, so the algorithm doesn't stop sooner at a lower level when a higher level could fit.
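The level-preference rule can be sketched as: among candidate split points that fit within the size budget, pick the one with the highest semantic level. This is a simplified stand-in, assuming candidates are `(level, end_offset)` pairs; it is not the crate's actual selection logic.

```rust
/// Hypothetical sketch: choose among candidate split points
/// `(level, end_offset)` the fitting one with the highest level,
/// so a lower level never wins just because the search stopped
/// there earlier.
fn pick_split(candidates: &[(usize, usize)], max_size: usize) -> Option<(usize, usize)> {
    candidates
        .iter()
        // Keep only candidates that fit within the size budget.
        .filter(|(_, end)| *end <= max_size)
        // Prefer the highest semantic level among those that fit.
        .max_by_key(|(level, _)| *level)
        .copied()
}
```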

This also adds an optimization that drops ranges from the caches once we have moved past them, since these ranges are iterated over quite frequently.
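The cache-pruning idea amounts to retaining only ranges that could still be relevant once the splitter has advanced past an offset. A minimal sketch, with a hypothetical function name and a plain `Vec` standing in for the cache:

```rust
use std::ops::Range;

/// Hypothetical sketch of the pruning optimization: once the splitter
/// has moved past `offset`, ranges ending at or before it can never be
/// used again, so they are dropped to keep later iteration cheap.
fn prune_passed(cache: &mut Vec<Range<usize>>, offset: usize) {
    cache.retain(|r| r.end > offset);
}
```

Because the caches are iterated frequently, shrinking them as the cursor advances keeps each scan proportional to the remaining ranges rather than all ranges seen so far.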
@benbrandt benbrandt changed the title Allocation reduction Reduce allocations, and other optimizations Mar 25, 2024
@benbrandt benbrandt merged commit 35f30dc into main Mar 25, 2024
24 checks passed
@benbrandt benbrandt deleted the allocation-reduction branch March 25, 2024 19:23
Successfully merging this pull request may close these issues.

Reduce number of allocations in chunk generation