
Reduce allocations, and other optimizations #121

Merged
merged 6 commits into main on Mar 25, 2024
Conversation

benbrandt
Owner

Addresses #115

Since binary search needs random access to the next sections, we have to allocate a Vec at some point.
Rather than doing this for every chunk, this now reuses the same Vec so the allocated memory is reused as often as possible.
Does three things:
1. Memoizes the output of `chunk_size`, since it can get called several times on the same chunk in the course of selecting a chunk.
2. Because of that memoization, we no longer need the `sizes` array, which tried to do the same thing.
3. Levels in the next semantic chunks now also reuse an allocation. This isn't ideal, because this path didn't allocate at all before, but it was necessary to allow a mutable reference. It does, however, set things up to also do binary search on the levels.
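The two allocation-related ideas above (memoizing `chunk_size` and reusing one buffer across chunks) can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code; the `ChunkSizer` struct, its fields, and the character-count size measure are all hypothetical stand-ins.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Hypothetical sketch: memoized size lookup plus a buffer that is
/// reused across chunks so its allocation survives between calls.
struct ChunkSizer {
    // Cache of byte-range -> measured size, so the size of a candidate
    // chunk is computed at most once even if queried repeatedly.
    size_cache: HashMap<Range<usize>, usize>,
    // Reused across chunks rather than reallocated for each one.
    sections: Vec<Range<usize>>,
}

impl ChunkSizer {
    fn new() -> Self {
        Self {
            size_cache: HashMap::new(),
            sections: Vec::new(),
        }
    }

    /// Returns the cached size for `range`, computing it only on the
    /// first call (here measured naively as the number of chars).
    fn chunk_size(&mut self, text: &str, range: Range<usize>) -> usize {
        *self
            .size_cache
            .entry(range.clone())
            .or_insert_with(|| text[range].chars().count())
    }
}
```

Clearing `sections` with `Vec::clear` between chunks keeps its capacity, so repeated binary searches over candidate sections stop paying for fresh allocations.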
@benbrandt benbrandt linked an issue Mar 24, 2024 that may be closed by this pull request

codecov bot commented Mar 24, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.29%. Comparing base (7d321a2) to head (21e0dd9).

Additional details and impacted files
@@           Coverage Diff            @@
##             main     #121    +/-   ##
========================================
  Coverage   99.28%   99.29%            
========================================
  Files           6        6            
  Lines        1404     1561   +157     
========================================
+ Hits         1394     1550   +156     
- Misses         10       11     +1     


This only affects markdown ranges: they weren't being filtered correctly because the code found the first item of a level across all ranges, rather than only among ranges after the current offset.
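The fix described in that commit can be sketched as a filter that discards ranges starting before the current offset before looking for the first item of a level. This is a hypothetical illustration; the function name and the `(level, range)` tuple representation are assumptions, not the crate's API.

```rust
use std::ops::Range;

/// Hypothetical sketch of the fix: find the first range of `level`
/// starting at or after `offset`. The bug was omitting the offset
/// filter and matching the first item of the level anywhere.
fn first_range_after(
    ranges: &[(usize, Range<usize>)],
    level: usize,
    offset: usize,
) -> Option<Range<usize>> {
    ranges
        .iter()
        // Skip ranges that begin before the current position.
        .filter(|(_, r)| r.start >= offset)
        .find(|(l, _)| *l == level)
        .map(|(_, r)| r.clone())
}
```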
Having inline elements at a higher level than text caused some strange breakpoints.

While an inline element can have a text element inside it, that inner text element should be skipped when necessary. This still allows inline elements to be kept together, but also allows smaller text elements to be pulled in.

This also makes sure that higher semantic levels are preferred when they are shorter than a lower level, so the algorithm doesn't stop sooner at a lower level when a higher level could fit.
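The level-preference rule can be sketched as: among candidate split points that fit within the size budget, pick the one with the highest semantic level. This is a simplified stand-in, assuming candidates are `(level, end_offset)` pairs; it is not the crate's actual selection logic.

```rust
/// Hypothetical sketch: choose among candidate split points
/// `(level, end_offset)` the fitting one with the highest level,
/// so a lower level never wins just because the search stopped
/// there earlier.
fn pick_split(candidates: &[(usize, usize)], max_size: usize) -> Option<(usize, usize)> {
    candidates
        .iter()
        // Keep only candidates that fit within the size budget.
        .filter(|(_, end)| *end <= max_size)
        // Prefer the highest semantic level among those that fit.
        .max_by_key(|(level, _)| *level)
        .copied()
}
```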

This also adds an optimization that drops ranges from the caches once we have moved past them, since these ranges are iterated over quite frequently.
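The cache-pruning idea amounts to retaining only ranges that could still be relevant once the splitter has advanced past an offset. A minimal sketch, with a hypothetical function name and a plain `Vec` standing in for the cache:

```rust
use std::ops::Range;

/// Hypothetical sketch of the pruning optimization: once the splitter
/// has moved past `offset`, ranges ending at or before it can never be
/// used again, so they are dropped to keep later iteration cheap.
fn prune_passed(cache: &mut Vec<Range<usize>>, offset: usize) {
    cache.retain(|r| r.end > offset);
}
```

Because the caches are iterated frequently, shrinking them as the cursor advances keeps each scan proportional to the remaining ranges rather than all ranges seen so far.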
@benbrandt benbrandt changed the title Allocation reduction Reduce allocations, and other optimizations Mar 25, 2024
@benbrandt benbrandt merged commit 35f30dc into main Mar 25, 2024
24 checks passed
@benbrandt benbrandt deleted the allocation-reduction branch March 25, 2024 19:23
Successfully merging this pull request may close these issues.

Reduce number of allocations in chunk generation