-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce ~2 GBs mem by avoiding another overalloc. #14806
Conversation
Codecov Report
@@ Coverage Diff @@
## master #14806 +/- ##
========================================
Coverage 80.2% 80.3%
========================================
Files 403 403
Lines 102221 102476 +255
========================================
+ Hits 82072 82300 +228
- Misses 20149 20176 +27 |
I finally filled this pretty un-documented draft pr and promoted out of being draft. :) |
perf/src/packet.rs
Outdated
@@ -61,7 +65,7 @@ impl Packets { | |||
pub fn to_packets_chunked<T: Serialize>(xs: &[T], chunks: usize) -> Vec<Packets> { | |||
let mut out = vec![]; | |||
for x in xs.chunks(chunks) { | |||
let mut p = Packets::default(); | |||
let mut p = Packets::with_capacity(chunks); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it be Packets::with_capacity(x.len())
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, yeah, if the xs
doesn't divide by chunks
. The last element can still be overallocated. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, probably not a big deal either way, but might as well size to fit :)
@@ -61,7 +65,7 @@ impl Packets { | |||
pub fn to_packets_chunked<T: Serialize>(xs: &[T], chunks: usize) -> Vec<Packets> { | |||
let mut out = vec![]; | |||
for x in xs.chunks(chunks) { | |||
let mut p = Packets::default(); | |||
let mut p = Packets::with_capacity(chunks); | |||
p.packets.resize(x.len(), Packet::default()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah does the resize
call here not remove the excess bloat (based on the Vec
docs, looks like it might only truncate length, not the allocated capacity)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(based on the Vec docs, looks like it might only truncate length, not the allocated capacity)
That's correct. You need to call shrink_to_fit
if you wanted that behavior. But, better off avoiding the overallocation to begin with. :)
automerge label removed due to a CI failure |
* Reduce few GBs mem by avoiding another overalloc. * Use x.len() for the last item from chunks() (cherry picked from commit 015058e)
* Reduce few GBs mem by avoiding another overalloc. * Use x.len() for the last item from chunks() (cherry picked from commit 015058e)
* Reduce few GBs mem by avoiding another overalloc. * Use x.len() for the last item from chunks() (cherry picked from commit 015058e) Co-authored-by: Ryo Onodera <[email protected]>
* Reduce few GBs mem by avoiding another overalloc. * Use x.len() for the last item from chunks() (cherry picked from commit 015058e) Co-authored-by: Ryo Onodera <[email protected]>
Problem
Currently,
ClusterInfoVoteListener
over-allocates (unrecycled)Packets
by factor of 128. Ultimately, this results in approximately ~2G consistent unneeded memory use both on testnet and mainnet-beta, which is not easily swappable because of quick create and destroy.Details
Firstly, about the the overallocation:
ClusterInfoVoteListener
callsto_packets_chunked
and in turn it callsPackets::Default()
, which surprisingly creates PinnedVec with capacity of128
. (I have a draft cleaning pr for this)solana/core/src/cluster_info_vote_listener.rs
Line 344 in cbffab7
solana/perf/src/packet.rs
Lines 61 to 64 in cbffab7
solana/perf/src/packet.rs
Lines 21 to 22 in cbffab7
solana/perf/src/packet.rs
Line 11 in cbffab7
Then,
Packet
's memory size is a1304
:https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=adfaa40101d596d9e372f96b20f1a9df
And finally,
VerifiedVotePakcets
retains thosePackets
keyed byCrdsValueLabel::Vote(VoteIndex, Pubkey)
:solana/core/src/verified_vote_packets.rs
Line 9 in cbffab7
solana/core/src/verified_vote_packets.rs
Lines 28 to 32 in cbffab7
Also, combined with the fact that there are 300~400 validators both on testnet and mainnet-beta, and x32 votes per validator, we allocate 2G mem, only for 15M worth data (x 128):
And this is found by anallyzing the following
heaptrack
info:Changes
Just don't overallocate. In this case,
Packets
with capacity of1
seems to be correct.Also, as this was such an instance, I think
Packets::default
should be changed. I'll do that as a follow-up PR; I first want to land this pretty low-risk memory saving pr to ship it to both v1.4 & v1.5.Found via: #14366
related to in that this is overallocation bug: #14435
started to overallocate maybe at?: #9694