[fix][broker] partitioned __change_events topic is policy topic #20392
LGTM
I still think this is the right change, but it is odd that the net effect is to change when we delete the topic policy topic. See: pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/admin/impl/NamespacesBase.java Lines 257 to 294 in 908d0b3
Codecov Report
@@ Coverage Diff @@
## master #20392 +/- ##
=============================================
+ Coverage 36.82% 72.94% +36.11%
- Complexity 12064 31878 +19814
=============================================
Files 1687 1864 +177
Lines 128856 138416 +9560
Branches 14018 15188 +1170
=============================================
+ Hits 47456 100968 +53512
+ Misses 75154 29440 -45714
- Partials 6246 8008 +1762
It's possible that an important part of the fix is here: pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java Lines 1345 to 1350 in ad236ca
With this change, we won't send a notification that the |
It looks like the order is likely the key to this bug fix. When we mis-classify the topic policies topic (as we do before this fix), we likely do the following:
Then, when we go to delete the tenant (in the flaky test), we get a failure because there is a managed ledger, and that will happen to be the one for the |
Note that #17609 attempted to prevent the conditions I describe above. However, I think the key mistake may be in the logic made here: pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BrokerService.java Lines 3291 to 3301 in e008de9
That pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/BrokerService.java Lines 3037 to 3039 in e008de9
It would seem that getting the policies only when cached presents a problem because the notification that the namespace is being deleted might not actually be present on the broker that receives the lookup request. Perhaps we should update that logic to get the latest version of the namespace policies object.
Now, I am wondering if #18220 was necessary because of this bug.
LGTM. Good work @michaeljmarshall
Great catch
(cherry picked from commit 9918bce)
(cherry picked from commit 9918bce) backported to branch-2.10 by renaming SystemTopicsNames to EventsTopicNames
Fixes #20376
Motivation
The `__change_events` system topic is created in each namespace. It is used for topic policies. `SystemTopicNames#isSystemTopic` classifies partitioned and non-partitioned `__change_events` topics as system topics. `SystemTopicNames#isTopicPoliciesSystemTopic` only classifies non-partitioned `__change_events` topics as topic policies system topics. I think this is a bug, and we need to make sure that partitioned `__change_events` topics are topic policies system topics.

Note: I am not sure that it is recommended to use a partitioned topic for the topic policies, but for backwards compatibility, I think we have to classify a partitioned `__change_events` topic as a topic policies topic. Since #10850, we test partitioned `__change_events` topics.

The `AdminApi2Test` class fails frequently with an error associated with managed ledgers not being deleted yet. That error may be because of an incomplete classification of the change events system topic name. I say this because the test fails at tenant deletion, and tenant deletion appears to rely on correct classification of the topic policies topic: pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/admin/impl/NamespacesBase.java Lines 262 to 266 in 908d0b3
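To make the misclassification concrete, here is a minimal, self-contained sketch of the difference between matching only the bare `__change_events` name and matching it after removing a partition suffix. It is not the code in `SystemTopicNames`. The class and helper names are hypothetical, and it assumes that an individual partition of a partitioned topic is named with a `-partition-<n>` suffix.

```java
// Illustrative sketch only: these helpers are hypothetical and are NOT Pulsar's implementation.
class ChangeEventsClassification {
    // Local name of the topic policies system topic, as named in the description above.
    static final String CHANGE_EVENTS = "__change_events";
    // Assumption: partitions of a partitioned topic end in "-partition-<n>".
    static final String PARTITION_SUFFIX = "-partition-";

    // Returns the part of the topic name after the last '/'.
    static String localName(String topic) {
        return topic.substring(topic.lastIndexOf('/') + 1);
    }

    // Drops a trailing "-partition-<n>" when <n> is a non-empty run of digits.
    static String stripPartitionSuffix(String localName) {
        int idx = localName.lastIndexOf(PARTITION_SUFFIX);
        if (idx < 0) {
            return localName;
        }
        String index = localName.substring(idx + PARTITION_SUFFIX.length());
        boolean numeric = !index.isEmpty() && index.chars().allMatch(Character::isDigit);
        return numeric ? localName.substring(0, idx) : localName;
    }

    // Behavior described as the bug: only the non-partitioned name matches.
    static boolean isTopicPoliciesTopicBefore(String topic) {
        return CHANGE_EVENTS.equals(localName(topic));
    }

    // Intended behavior: partitions of __change_events match as well.
    static boolean isTopicPoliciesTopicAfter(String topic) {
        return CHANGE_EVENTS.equals(stripPartitionSuffix(localName(topic)));
    }

    public static void main(String[] args) {
        String plain = "persistent://tenant/ns/__change_events";
        String partition = plain + "-partition-0";
        System.out.println(isTopicPoliciesTopicBefore(plain));
        System.out.println(isTopicPoliciesTopicBefore(partition));
        System.out.println(isTopicPoliciesTopicAfter(plain));
        System.out.println(isTopicPoliciesTopicAfter(partition));
    }
}
```

In this sketch, only the second call gives a different answer from the rest: the "before" check does not recognise a partition of `__change_events` as the topic policies topic. That is the kind of gap that could leave a managed ledger behind when a namespace or tenant is deleted.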
Modifications
- Update `SystemTopicNames#isTopicPoliciesSystemTopic` so that it considers a partitioned `__change_events` topic to be a topic policies system topic.
- Update `PersistentTopic` so that it does not do its own classification and instead relies on the `SystemTopicNames` class.
- `SystemTopicNames#isSystemTopic`.

Verifying this change
A new test is added.
Documentation
doc-not-needed
Matching PR in forked repository
PR in forked repository: Skipping for this minor change