-
Notifications
You must be signed in to change notification settings - Fork 589
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
partition_allocator: exclude internal topics from per node partition limit checks #11249
partition_allocator: exclude internal topics from per node partition limit checks #11249
Conversation
4d43e9f
to
1c0f077
Compare
ss::sharded<members_table>& /*members_table*/, | ||
config::binding<std::optional<size_t>> /*memory_per_partition*/, | ||
config::binding<std::optional<int32_t>> /*fds_per_partition*/, | ||
config::binding<uint32_t> /*partitions_per_shard*/, | ||
config::binding<uint32_t> /*partitions_reserve_shard0*/, | ||
config::binding< | ||
std::vector<ss::sstring>> /*kafka_topics_skipping_allocation*/, | ||
config::binding<bool> /*enable_rack_awareness*/); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: comments are unnecessary, these can just be parameter names
return [](const allocation_node& node) { return !node.is_full(); }; | ||
make_evaluator(const model::ntp& ntp, const replicas_t&) const final { | ||
return [&ntp](const allocation_node& node) { | ||
return !node.is_full(ntp); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: not necessary to fix right now, but I have a feeling that allocation_state
should be just "dumb state" (e.g. for full node check it should just provide the number of currently allocated partitions) and all logic should be in evaluators. The reason is that 1) evaluators already contain some of the logic, why not all? and 2) they have more context available (existing replicas, previous node for that replica (not available right now, but we'll have to add it soon)) so we'll either have to pass it all to allocation_state
or just decide everything in the evaluators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
right.. this thought did cross my mind, I agree with you. Makes sense to move to constraint once the context becomes more complicated.
internal_topics.cbegin(), | ||
internal_topics.cend(), | ||
[&ntp](const ss::sstring& topic) { | ||
return ntp.ns() == model::kafka_namespace && topic == ntp.tp.topic(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I guess namespace check can be hoisted out of the any_of
loop
Now that the constraint check depends on the ntp being allocated, this propagates ntp into hard_constraints evaluator.
These topics are meant to be excluded from the reservation checks. To be used in the next set of commits.
1c0f077
to
5ca4dda
Compare
Rebased to fix conflicts and addressed comments. |
Internal topics are now excluded from partition allocator limit checks. This means that a given allocation node can host any number of such internal topic partitions. A topic is considered internal in the following cases:
redpanda
orkafka_internal
namespaces orkafka
namespace but included inkafka_nodelete_topics
The rationale for do so is that these topics can potentially be created lazily, for example transactions related topics are created when a user uses transactions for the first time. This could be problematic if the user has already exhausted
topic_partitions_per_shard
limits and the creation of these internal topics just hangs. To avoid such scenarios we exclude these internal topics from limit checks. This should not be problematic in practice because total number of such internal partitions in a typical deployment is fairly small compared to total number of non internal partitions.Fixes: #10995
Backports Required
Release Notes