Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use max-heap to collect shink candidates by byte savings #33731

Closed
wants to merge 12 commits into from

Conversation

HaoranYi
Copy link
Contributor

@HaoranYi HaoranYi commented Oct 17, 2023

Problem

  1. selecting storage to shink by alive ratio may not improve the alive ratio as much as by byte savings.

For example, given two storages below:
Storage A: alive = 1, total = 2, alive ratio = 0.5
Storage B: alive = 6, total = 10, alive ratio = 0.6

If we select by ratio, then A will be chosen to shrink, which gives result ratio of 7/11 (0.63).
However, if we select by bytes savings, B will be chosen. This gives the result ratio of 7/8 (0.875).

  1. When collecting candidate storages for shrinking, we don't need to sort all the
    storages. We could just partially sort the storages before the cutoff (i.e. use a heap).

Summary of Changes

Use max-heap to store and partially sort and select candidate storage for shrinking by bye savings.

Fixes #

@HaoranYi HaoranYi changed the title use min-heap to collect shink candidate by alive ratio use min-heap to collect shink candidates by alive ratio Oct 17, 2023
@codecov
Copy link

codecov bot commented Oct 17, 2023

Codecov Report

Merging #33731 (402d9c0) into master (673a38c) will increase coverage by 0.0%.
Report is 26 commits behind head on master.
The diff coverage is 96.4%.

@@           Coverage Diff           @@
##           master   #33731   +/-   ##
=======================================
  Coverage    81.8%    81.8%           
=======================================
  Files         806      806           
  Lines      217859   217923   +64     
=======================================
+ Hits       178239   178296   +57     
- Misses      39620    39627    +7     

@brooksprumo
Copy link
Contributor

How are the performance numbers, before and after?

@HaoranYi
Copy link
Contributor Author

How are the performance numbers, before and after?

image
looks like there is not much performance difference (red: master (#33739), blue: this pr).

@HaoranYi HaoranYi changed the title use min-heap to collect shink candidates by alive ratio use max-heap to collect shink candidates by byte savings Oct 18, 2023
Copy link
Contributor

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable to me, just one nit.

accounts-db/src/accounts_db.rs Outdated Show resolved Hide resolved
Copy link
Contributor

@brooksprumo brooksprumo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. One nit/question.

accounts-db/src/accounts_db.rs Outdated Show resolved Hide resolved
@brooksprumo
Copy link
Contributor

How do the byte savings look on mnb and/or pop-net?

@HaoranYi
Copy link
Contributor Author

How do the byte savings look on mnb and/or pop-net?

image

comparison of mnb run from last night shows not much difference.

blue: this pr
red: master

@github-actions github-actions bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Nov 6, 2023
@github-actions github-actions bot closed this Nov 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale [bot only] Added to stale content; results in auto-close after a week.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants