k/p/gen: Use fragmented_vector for fetchable_partition_response #11181
Conversation
Force-pushed from 239a03f to 84a5002
It's possible that large_fragment_vector is too far in the wrong direction.
I think the general concern is that we shouldn't just blindly use fragmented_vector everywhere, because it will create a lot of memory wastage if we only put a couple of 8-byte elements into it (and this is even worse for large_fragment_vector, I guess).
fetch_response::partition_response isn't too large, but I guess we are expecting many of those for topics with higher partition counts?
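To make the wastage concern concrete, a back-of-the-envelope check; the 32KiB first-fragment size here is an assumed figure for illustration, not the actual constant in the tree:

#include <cstddef>
#include <cstdio>

int main() {
    constexpr std::size_t fragment_bytes = 32 * 1024; // assumed first-fragment size
    constexpr std::size_t elem_bytes = 8;             // e.g. a small POD element
    constexpr std::size_t elems = 2;                  // a near-empty response
    std::size_t used = elems * elem_bytes;
    std::printf("used %zu of %zu bytes (%.2f%% utilization)\n",
                used, fragment_bytes, 100.0 * used / fragment_bytes);
    // prints: used 16 of 32768 bytes (0.05% utilization)
}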
src/v/utils/fragmented_vector.h
Outdated
/**
 * Assign from a std::vector.
 */
fragmented_vector& operator=(const std::vector<T>& rhs) noexcept {
Removing this because it is not needed anymore, or for some other reason?
It was added to satisfy a use case that is no longer needed. If we want something like this I'd rather be explicit, as it can hide unexpected copies.
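For illustration, an explicit free function keeps the element-wise copy visible at call sites; copy_from is a hypothetical name, not an existing helper, and fragmented_vector is assumed from this header:

#include <vector>

// Hypothetical sketch: an explicit conversion keeps the O(n) copy visible
// at the call site, unlike an assignment operator that hides it.
template<typename T>
fragmented_vector<T> copy_from(const std::vector<T>& in) {
    fragmented_vector<T> out;
    for (const auto& e : in) {
        out.push_back(e); // each element is copied; the cost is explicit
    }
    return out;
}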
make_fragmented_vector(std::initializer_list<T> in) {
    fragmented_vector<T> ret;
    for (auto& e : in) {
        ret.push_back(e);
Can one move out of an initializer_list?
My hunch would be yes, but being a test helper, I'm not sure how much it matters?
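For what it's worth, a standalone sketch (not from this PR) suggests the answer is actually no, because std::initializer_list only exposes const iterators:

#include <initializer_list>
#include <string>
#include <utility>
#include <vector>

// std::initializer_list<T>::iterator is const T*, so the loop variable
// below is const; std::move(e) yields const std::string&&, which binds to
// the copy constructor. The "move" silently degrades to a copy.
std::vector<std::string> take(std::initializer_list<std::string> in) {
    std::vector<std::string> out;
    out.reserve(in.size());
    for (auto& e : in) {             // decltype(e) is const std::string&
        out.push_back(std::move(e)); // compiles, but copies
    }
    return out;
}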
See #13854
Yeah, the oversize allocation was in:
Right, makes sense. I guess the question is how much we are pessimizing responses with few partitions.
If there is no new data on a partition, do we return both a metadata response and a partition response?
I guess it's fine either way, as long as there is only a single one for each fetch request.
The cover page mentions the metadata response, but this change affects the produce path, not the metadata path, right? So it's also perf/memory sensitive.
The fetch response is the primary fix, but I reworked some stuff in the generator, so the metadata response now builds into a fragmented_vector instead of a std::vector, avoiding a large alloc and copy there, too.
Aye.
Honestly, @travisdowns, as the author of #8469, do you have any thoughts about the pessimisation here? Towards the end of this PR it should be possible to avoid
They're separate responses for separate requests. Both occur quite often.
I noticed this has been sitting stale for a little bit. I think @BenPope has an outstanding question for @travisdowns regarding whether
lgtm
{%- if field.nullable() %}
{%- if flex %}
{{ fname }} = reader.read_nullable_flex_array([version](protocol::decoder& reader) {
{%- else %}
{{ fname }} = reader.read_nullable_array([version](protocol::decoder& reader) {
{%- endif %}
{%- else %}
{%- if flex %}
{{ fname }} = reader.read_flex_array([version](protocol::decoder& reader) {
{%- else %}
{{ fname }} = reader.read_array([version](protocol::decoder& reader) {
{%- endif %}
{%- endif %}

{%- set nullable = "nullable_" if field.nullable() else "" %}
{%- set flex = "flex_" if flex else "" %}
{{ fname }} = reader.read_{{nullable}}{{flex}}array([version](protocol::decoder& reader) {
🔥
Not sure if there are more outstanding questions. Maybe ping Travis?
@BenPope - yeah, the part where fragmented_vector creates a large first segment as soon as the first entry is added is a problem. Arguably it's worse here for fetch, because this is a much hotter path in general, and it's also involved in the "poll" loop where we wake up periodically to do the whole fetch execution again, but then go to sleep if we don't get enough bytes.
That said, there are 8 commits here and I think the first 7 seem non-controversial, so we could always go with those now and continue the discussion on the fetch problem? The fetch is a problem and I guess we have to solve it.
Is this just a matter of using a more modest segment size in the fragmented vector? Maybe not: it sounds like, between the "poll" case and large fetches, the variability is large enough that we need something adaptive? Maybe we could ditch the vector entirely and use a boost intrusive list...
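A minimal sketch of that intrusive-list idea, with illustrative names; note that Boost.Intrusive containers do not own their elements, so lifetime management would shift to the caller:

#include <cstdint>
#include <boost/intrusive/list.hpp>

// Each element carries its own list hook, so appending never reallocates
// and there is no fragment-sizing question at all.
struct partition_response_node : boost::intrusive::list_base_hook<> {
    int32_t partition_index{};
    // ... payload ...
};

// The list links nodes in place; the nodes must outlive the list.
using partition_list = boost::intrusive::list<partition_response_node>;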
Force-pushed from 84a5002 to 6865d5b
I've rebased everything but the last commit into #13854 with some minor changes.
Force-pushed from 6865d5b to bc63e9f
The merge-base changed after approval.
Right, I think it's exactly a case of using a small segment size, like say 1K. 1K isn't going to bother anyone, and since these are small objects we can still fit plenty into a segment, so I don't see much of a downside here. Yes, there will be more segments, but none of the operations are really O(number_of_segments) anyway (arguably object destruction can be, for primitives, which otherwise avoid the O(n) deletion cost, but oh well).
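A minimal sketch of that idea, assuming the fragment size can be pinned via a template parameter; the name and the vector-of-vectors backing are illustrative, not the actual implementation:

#include <algorithm>
#include <cstddef>
#include <vector>

// Fixed-capacity fragments bound the size of any single allocation, so a
// vector holding a handful of small elements costs at most one ~1KiB
// fragment instead of one oversized first segment.
template<typename T, std::size_t MaxFragmentBytes = 1024>
class sketch_fragmented_vector {
    static constexpr std::size_t elems_per_frag
      = std::max<std::size_t>(1, MaxFragmentBytes / sizeof(T));
    std::vector<std::vector<T>> _frags;
    std::size_t _size = 0;

public:
    void push_back(T value) {
        if (_frags.empty() || _frags.back().size() == elems_per_frag) {
            _frags.emplace_back();
            _frags.back().reserve(elems_per_frag); // one bounded allocation
        }
        _frags.back().push_back(std::move(value));
        ++_size;
    }
    // All fragments except the last are full, so indexing stays O(1).
    T& operator[](std::size_t i) {
        return _frags[i / elems_per_frag][i % elems_per_frag];
    }
    std::size_t size() const { return _size; }
};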
Signed-off-by: Ben Pope <[email protected]>
* Introduce `small_fragment_vector`
* Switch `fetchable_partition_response` to it

Signed-off-by: Ben Pope <[email protected]>
Force-pushed from de9e26f to f33973c
Done
lgtm
Known failure: #14053
/backport v23.2.x
/backport v23.1.x
Failed to create a backport PR to v23.1.x branch. I tried:
/backport v23.1.x
Avoid oversize allocations by converting fetchable_partition_response to small_fragment_vector
Fixes #11017
Backports Required
Release Notes
Improvements