[fix][test] Fix flaky test deleteNamespaceGracefully #18220
@@ -164,7 +164,7 @@ public void resetClusters() throws Exception {
     pulsar.getConfiguration().setForceDeleteNamespaceAllowed(true);
     for (String tenant : admin.tenants().getTenants()) {
         for (String namespace : admin.namespaces().getNamespaces(tenant)) {
-            deleteNamespaceGraceFullyByMultiPulsars(namespace, true, admin, pulsar,
+            deleteNamespaceGraceFully(namespace, true, admin, pulsar,
I have a question: can we use admin.namespaces().deleteNamespace() instead of this? deleteNamespaceGraceFully looks like a hack.
When admin.namespaces().deleteNamespace() and the auto-creation of the topic __change_events execute concurrently, there is a problem (#17070). deleteNamespaceGraceFully is used to work around it.
I tried to fix the flaky test this way: delete the namespace only after __change_events has been created successfully. But there is still another race condition: the asynchronous creation of __change_events/__compaction races with the deletion of __change_events by the namespace deletion. Therefore, I will disable systemTopic in the method testDeleteTenant to solve this flaky test.
Is this the problem you mean?
If so, I suggest we fix this first.
In real-world use, the probability of this occurring is almost 0. Even if it does occur, the user can run the delete-namespace operation again. Solving this problem fundamentally would require a big change.
}

/**
 * Wait until system topic "__change_event" and subscription "__compaction" are created, and then delete the namespace.
Does the description need to be modified?
Already fixed.
canPausedNamespaceService.pause();
}

Awaitility.await().until(() -> {
Maybe we can add a maximum wait time.
Already fixed.
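For readers following this thread: the suggestion is to bound the wait so a condition that never holds fails the test instead of hanging it. With Awaitility this is usually done through atMost(...). The dependency-free sketch below shows the same idea. The class name BoundedWait, the method awaitUntil, and the durations are hypothetical and are not taken from this PR.

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of the PR: poll a condition with an upper bound on the wait.
final class BoundedWait {
    private BoundedWait() {}

    /**
     * Polls the condition until it returns true or the timeout elapses.
     * Returns true if the condition was met, false on timeout or interruption.
     */
    static boolean awaitUntil(BooleanSupplier condition, Duration timeout, Duration pollInterval) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false; // give up instead of blocking the test forever
            }
            try {
                Thread.sleep(pollInterval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag for the caller
                return false;
            }
        }
    }
}
```

A test would typically assert on the returned boolean, so that a timeout surfaces as a clear failure.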
@poorbarcode Could you fix these conflicts?
Codecov Report
@@ Coverage Diff @@
## master #18220 +/- ##
============================================
- Coverage 47.39% 47.35% -0.05%
- Complexity 10479 10483 +4
============================================
Files 698 698
Lines 68070 68070
Branches 7279 7279
============================================
- Hits 32264 32235 -29
- Misses 32228 32255 +27
- Partials 3578 3580 +2
/pulsarbot rerun-failure-checks
Already fixed.
(cherry picked from commit c544ea3)
This reverts commit a3e593a.
Fixes #18232
Motivation
https://github.com/poorbarcode/pulsar/actions/runs/3267034877/jobs/5371836130
https://github.com/apache/pulsar/actions/runs/3242148385/jobs/5326408439
The original implementation determines whether to wait for the creation of __change_events based on whether a broker has taken ownership of the bundle. However, the test cases often run 'set namespace policy', 'unload namespace', 'delete namespace', 'split bundle', and other operations. When the bundle unload and the bundle check execute concurrently, the method deleteNamespaceGraceFully runs unstably.
Modifications
Rename deleteNamespaceGraceFully -> deleteNamespaceWithRetry.
deleteNamespaceGraceFully was defined in class BrokerTestBase, and MockedPulsarServiceBaseTest called BrokerTestBase.deleteNamespaceGraceFully(). But MockedPulsarServiceBaseTest is the superclass of BrokerTestBase, so the method deleteNamespaceGraceFully is moved to MockedPulsarServiceBaseTest.
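As a rough illustration of the retry-based approach named above, here is a minimal sketch of a bounded retry loop. It is not the PR's implementation of deleteNamespaceWithRetry: the class RetryingDelete, the method runWithRetry, and its parameters are hypothetical, and in a real test the action would wrap the admin delete-namespace call.

```java
// Hypothetical sketch, not the PR's code: retry an action a bounded number of times.
final class RetryingDelete {
    private RetryingDelete() {}

    /**
     * Runs the action, retrying when it throws a RuntimeException.
     * Returns the number of attempts used; rethrows the last failure if all attempts fail.
     */
    static int runWithRetry(Runnable action, int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                action.run();
                return attempt; // success, e.g. the namespace was deleted
            } catch (RuntimeException e) {
                last = e; // e.g. a conflict with a concurrently created system topic
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(backoffMillis); // brief pause before the next attempt
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break; // stop retrying once interrupted
                }
            }
        }
        throw last;
    }
}
```

Retrying fits the situation described in the review thread: the conflict between namespace deletion and system-topic creation is transient, so repeating the delete is expected to succeed without depending on bundle ownership checks.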
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: