feat(tiering): Faster small bins serialization #3340
Conversation
549f0a0 to 150256b (Compare)
Signed-off-by: Vladislav Oleshko <[email protected]>
void FlushDelayed(bool force) {
  // Flush pages with most records accumulated first, or all, if forced.
  // It's enough just to issue reads, because they are collapsed by the tiered storage internally
  while ((force && !delayed_.empty()) || delayed_.size() > kMaxPageAccum) {
We need to choose some kind of limit here... by the number of pages? by the total sum of entries? 🤷🏻♂️ We have to take into account that once we flush all of those, we can get a memory spike.
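One concrete way to bound this, sketched below as a standalone illustration rather than the PR's actual implementation: cap the total number of buffered entries instead of the number of pages, and drain the largest pages first so the buffer shrinks fastest. Apart from the delayed_ name, all identifiers and the budget value are assumed for the sketch.

#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical sketch: limit FlushDelayed by the total number of buffered
// entries rather than by page count, draining the biggest pages first.
struct DelayedFlusher {
  // page index -> entries accumulated for that page (stand-in for delayed_)
  std::map<size_t, std::vector<std::string>> delayed_;
  size_t total_entries_ = 0;                          // maintained on insert
  static constexpr size_t kMaxDelayedEntries = 1024;  // assumed budget

  void FlushDelayed(bool force) {
    // Flush the largest pages until we are back under the entry budget,
    // or everything if forced. Flushing everything at once is what can
    // cause a memory spike, which is why the budget matters.
    while ((force && !delayed_.empty()) || total_entries_ > kMaxDelayedEntries) {
      auto largest = delayed_.begin();
      for (auto it = delayed_.begin(); it != delayed_.end(); ++it)
        if (it->second.size() > largest->second.size())
          largest = it;

      total_entries_ -= largest->second.size();
      for (auto& entry : largest->second)
        IssueRead(entry);  // reads are collapsed by the tiered storage internally
      delayed_.erase(largest);
    }
  }

  void IssueRead(const std::string&) {}  // placeholder for the real read path
};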
}

  // Flush tiered delayed entries to avoid reordering with journal
  tiered_serializer_->FlushDelayed(true);
}
Ideally we would pass the value with the ChangeReq& and find out whether it is tiered; only if it's tiered and part of a relevant page should we flush it.
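A hedged sketch of that check, with all type and member names assumed for illustration (the real ChangeReq and PrimeValue interfaces may differ): skip non-tiered values entirely and flush only when the touched page still has delayed entries.

#include <cstddef>
#include <set>

struct PrimeValue {                  // stand-in for the real value type
  bool IsExternal() const { return external; }
  size_t Page() const { return page; }  // Page() is assumed, not a real accessor
  bool external = false;
  size_t page = 0;
};

struct ChangeReq {                   // stand-in: carries the changed value
  const PrimeValue* value = nullptr;
};

struct TieredSerializerSketch {
  std::set<size_t> pages_with_delayed;  // pages that currently have buffered entries

  void OnChange(const ChangeReq& req) {
    // Non-tiered values never need a flush; tiered ones only if their page
    // actually has delayed entries waiting to be serialized.
    if (req.value == nullptr || !req.value->IsExternal())
      return;
    if (pages_with_delayed.count(req.value->Page()) > 0)
      FlushPage(req.value->Page());
  }

  void FlushPage(size_t /*page*/) { /* issue reads for that page only */ }
};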
int64_t memory_margin_ = 0;
std::vector<tiering::DiskSegment> delayed_deletes_;
todo: better naming
@romange Please look only at the idea; I left some open questions.
Nice that you suggest doing it without changing the RDB format.
I am not sure about the effectiveness of the flush, though: the probability of keys from the same page being in the same serialization batch is low.
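For a rough sense of why (numbers assumed for illustration, not taken from the PR): if a batch serializes B entries whose pages are roughly independent and uniform over P small-bin pages, the expected number of same-page pairs in the batch is about B*(B-1)/(2*P). With B = 64 and P = 10,000 that is roughly 2016/10000 ≈ 0.2, so most batches would not contain even two entries from the same page.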
: db_slice_(slice), dest_(dest), compression_mode_(compression_mode) {
: db_slice_(slice),
  dest_(dest),
  tiered_serializer_(new TieredSerializer{}),
do you want to initialize it unconditionally?
🤷🏻♂️ I don't think it's expensive
auto entries = delayed_.extract(page);
for (auto& [base, value] : entries.mapped()) {
  DCHECK(value.IsExternal());
  pending_.push_back(Read(std::move(base), value));
What guarantees that by the time you flush the delayed entries, their segments are still correct?
Maybe the pages were freed and repurposed for other values?
Inside tiered storage, I don't delete small bin pages while we're serializing
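A minimal sketch of how such a guarantee can look, assuming a hypothetical in-flight flag and delete queue (the real tiered-storage internals may differ): while a serialization pass is running, small-bin page deletes are queued instead of applied, so previously captured segments stay valid until the flush finishes.

#include <cstdint>
#include <vector>

struct DiskSegment {  // stand-in for tiering::DiskSegment
  uint64_t offset = 0;
  uint64_t length = 0;
};

class SmallBinsSketch {
 public:
  void StartSerialization() { serializing_ = true; }

  void Delete(DiskSegment segment) {
    // While serializing, keep the page alive so already-captured segments
    // still point at valid data; apply the delete later.
    if (serializing_)
      delayed_deletes_.push_back(segment);
    else
      FreeSegment(segment);
  }

  void FinishSerialization() {
    serializing_ = false;
    for (const auto& segment : delayed_deletes_)
      FreeSegment(segment);
    delayed_deletes_.clear();
  }

 private:
  void FreeSegment(const DiskSegment&) { /* return the page to the allocator */ }

  bool serializing_ = false;
  std::vector<DiskSegment> delayed_deletes_;
};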
    delayed_sizes_.insert({entries.size(), page});
  }
} else {
  pending_.push_back(Read(std::move(base), pv));
nit: pending_reads_
I thought about this more. Master is the more fragile piece compared to a slave or an instance loading the snapshot.
Closed in favour of #3396