Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Perf] Only use raw iterators with RocksDB and speed up ledger load #2518

Merged
merged 3 commits into from
Nov 4, 2024

Conversation

ljedrz
Copy link
Collaborator

@ljedrz ljedrz commented Jul 12, 2024

The signatures of iteration-related methods (with the rocksdb feature) have been haunting my heap profiles for a long time now, and while I initially believed this was something we could fine-tune with some configuration, all such attempts were unsatisfactory. It turns out that the "default" (higher-level) RocksDB iterators are just inherently inefficient, and only the DBRawIterator is able to avoid a massive number of allocations, some of which may remain lingering in the RSS and contribute to its inflation over time. The raw iterator is somewhat trickier to use correctly, but since it's what the regular iterator is built upon, there is nothing particularly novel about it, and we're already using it to count records.

With these changes I ran profiling with both an instrumented OS allocator and jemalloc and - contrary to the usual results - it was the latter that experienced larger relative differences when loading the ledger:

  • total allocs and temp allocs were both reduced by 98%
  • ledger load time was down 14%
  • max RSS was reduced by 13%

In addition, heaptrack ran 73% faster and produced a 99% smaller profile, which is very practical for future profiling needs.

Alongside these changes I realized that the current len_confirmed method - while working correctly and passing all tests - can be made more solid by including iterator validity checks; the only situation where this would truly be needed is if it was called on the very last map in the database and if the map was empty, but this commit guards against that edge case.

The final commit is a further improvement on one of the changes from #2515 (and made possible with the switch to raw iterators).

CI run link

Copy link
Collaborator

@vicsn vicsn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

98%?! Amazing

Copy link

@Meshiest Meshiest left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Impressive gains

@vicsn
Copy link
Collaborator

vicsn commented Oct 22, 2024

Superseded by #2561 to merge staging and fix cargo fmt

@alzger alzger merged commit 591f3bf into ProvableHQ:staging Nov 4, 2024
76 of 83 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants