
wip: ancient packing algorithm tweaks #2849

Draft
jeffwashington wants to merge 19 commits into master from 4aug50_100k

Conversation

jeffwashington

Problem

Ancient packing, when skipping rewrites, has some non-ideal behavior.

Summary of Changes

prototype branch to try some things out

Fixes #

@@ -5082,6 +5104,41 @@ impl AccountsDb {
}
};

let mut limit = 10;
jeffwashington (Author):

@HaoranYi @brooksprumo this is the algorithm that I hacked up to have shrink pick up some shrinkable ancient storages if shrink isn't overloaded. I don't love this, but it appears to work.
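A minimal sketch of how I read this heuristic, with hypothetical names (the real code also matches storage capacities and seeds `limit` at 10; this only shows the top-up idea):

type Slot = u64;

// Sketch only: if the regular shrink queue is not already backed up, top it
// up with shrinkable ancient storages, bounded so that ancient cleanup makes
// progress without swamping normal shrink work.
fn add_ancient_shrink_candidates(shrink_slots: &mut Vec<Slot>, ancient_candidates: &[Slot]) {
    // simplified version of the `limit` logic from the diff: allow roughly
    // one more candidate than is already queued
    let limit = shrink_slots.len() + 1;
    for &slot in ancient_candidates {
        if shrink_slots.len() >= limit {
            break;
        }
        shrink_slots.push(slot);
    }
}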

jeffwashington (Author):

[image: metrics graph showing the count of shrinkable ancient storages over time]

You can see the cliff when I restarted with this algorithm in place. It took care of a backlog of 160k shrinkable storages over a very short amount of time. This may even be too aggressive.

db.shrink_candidate_slots.lock().unwrap().insert(slot);
}
candidate_for_shrink
|| if can_randomly_shrink && thread_rng().gen_range(0..80000) == 0 {
jeffwashington (Author):

This random constant needs to be proportional to the total number of ancient storages allowed.
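A sketch of what "proportional" could look like; the scaling factor of 8 is an assumption for illustration, not a measured value:

use rand::{thread_rng, Rng};

// Sketch: derive the 1-in-N random-shrink chance from the allowed number of
// ancient storages instead of hard-coding 80_000. The factor 8 is illustrative.
fn can_randomly_shrink_this_slot(max_ancient_slots: u64) -> bool {
    let one_in = max_ancient_slots.saturating_mul(8).max(1);
    thread_rng().gen_range(0..one_in) == 0
}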

@@ -577,6 +624,14 @@ impl AccountsDb {
})
.count()
.saturating_sub(randoms as usize);
// ideal storage size is total alive bytes of ancient storages / half of max ancient slots
tuning.ideal_storage_size = NonZeroU64::new(
(infos.total_alive_bytes.0 / tuning.max_ancient_slots.max(1) as u64 * 2).max(5_000_000),
jeffwashington (Author):

I think the ideal storage size needs to be dynamically calculated. Here, I picked it so that max ancient slots / 2 storages should be able to hold all of the alive bytes. We could be more aggressive and maybe say / 4. This constant holds no matter the width; 10k is the default width today. @HaoranYi this is a new PR.
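This is the calculation from the diff above, pulled out as a standalone function (the formula and the 5MB floor mirror the diff; the wrapper itself is just a sketch):

use std::num::NonZeroU64;

// ideal size = total alive ancient bytes / (max_ancient_slots / 2), floored
// at 5MB; i.e. half of the allowed ancient slots should be able to hold all
// of the alive bytes
fn ideal_storage_size(total_alive_bytes: u64, max_ancient_slots: u64) -> NonZeroU64 {
    let ideal = total_alive_bytes / max_ancient_slots.max(1) * 2;
    NonZeroU64::new(ideal.max(5_000_000)).unwrap()
}

Plugging in the 100k max_ancient_slots used below reproduces the ~5M ideal size mentioned for mainnet-beta.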

return;
};
*/
let max_ancient_slots = 100_000; // this gives us approx 5M ideal size for testing against mnb now
jeffwashington (Author):

@HaoranYi I think once the other changes are made, this number probably needs to be a CLI arg, and it should probably default to 100k or so. The catch is that 432k + this number - ancient offset becomes the expected number of storages we may have open at one time, so it affects the global mmap / file-handle limit. The more files we allow here, the smaller the size per file. Smaller files are great; larger files are intense in i/o and cpu to rewrite.
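The arithmetic from this comment as a worked sketch (432k is one epoch of slots; the offset argument and function name are hypothetical):

const SLOTS_PER_EPOCH: u64 = 432_000;

// expected concurrently open storages ≈ one epoch of slots (432k)
// + max_ancient_slots - ancient offset; this sum is what must fit under
// the global mmap / file-handle limits
fn expected_open_storages(max_ancient_slots: u64, ancient_offset_slots: u64) -> u64 {
    (SLOTS_PER_EPOCH + max_ancient_slots).saturating_sub(ancient_offset_slots)
}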

Reply:

yeah. I will add a cli arg for this one.

percent_of_alive_shrunk_data: 55,
max_ancient_slots,
// don't re-pack anything just to shrink. Shrink will handle these old storages.
percent_of_alive_shrunk_data: 0,
jeffwashington (Author):

I think we can change this tuning parameter to 0 to stop re-packing shrinkable slots. Re-packing them mixes warm and cold accounts, which is not good. Once shrink is taking care of ancient storages, the pack algorithm should not mix old and new; it should only mix new and small storages in order to hit the storage count limit.
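A minimal sketch of the tuning change (the field names come from the diff, but the struct definition here is a stand-in for the real tuning type):

use std::num::NonZeroU64;

// stand-in for the real tuning struct; only the fields discussed here
struct PackTuning {
    max_ancient_slots: u64,
    percent_of_alive_shrunk_data: u64,
    ideal_storage_size: NonZeroU64,
}

fn tuning_when_shrink_owns_ancients(max_ancient_slots: u64, ideal: NonZeroU64) -> PackTuning {
    PackTuning {
        max_ancient_slots,
        // 0 => never re-pack a storage merely because it is shrinkable; shrink
        // owns that work, so pack won't mix cold accounts into new storages
        percent_of_alive_shrunk_data: 0,
        ideal_storage_size: ideal,
    }
}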

jeffwashington force-pushed the 4aug50_100k branch 9 times, most recently from 487bc42 to 51c34c0 on September 6, 2024 at 20:47
limit = shrink_slots.len() + 1;
}
let mut num_found = 0;
if shrink_slots.len() < limit {

Review comment:

This condition seems to be always true: when shrink_slots.len() is 0 or 1, limit is 2; when shrink_slots.len() is 2, 3, ..., limit is 3, 4, ... — always shrink_slots.len() + 1.

I'm trying to understand the purpose of this entire block. We add an entry to shrink_slots only if its capacity is the same as the storage capacity for the corresponding slot, and only one such addition is allowed (or two if we start with an empty shrink_slots list). Could you, in a few words, help me figure out what we're accomplishing here in the context of the whole shrinking algorithm?

dmakarov self-assigned this on Sep 12, 2024
@@ -550,7 +596,7 @@ impl AccountsDb {
let max_slot = slots.iter().max().cloned().unwrap_or_default();
// heuristic to include some # of newly eligible ancient slots so that the pack algorithm always makes progress
let high_slot_boundary = max_slot.saturating_sub(HIGH_SLOT_OFFSET);
let is_high_slot = |slot| slot >= high_slot_boundary;
let is_high_slot = |slot| false; // nothing is a high slot (was: slot >= high_slot_boundary)
dmakarov (Sep 13, 2024):

Is this a temporary replacement for experimentation or a permanent change that you intended to make for this PR?
