CI Failure (Timeout: Leadership did not stablize) in MultiTopicAutomaticLeadershipBalancingTest.test_topic_aware_rebalance
#11454
Comments
I've looked at a couple of the log files: The first one in the issue: The test expects
For the most recent test failure: The test expects
In both cases the last two lines appear to show it getting worse.
Side note: possibly we should capture the initial conditions/input to the balancer (including the random seed) and encode them in a unit test. It seems plausible that the balancer is getting stuck in a local minimum, or that there is another problem with the balancer or with the assumptions in the test, and we don't really need a full ducktape test to iterate on that. cc @ballard26
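To make the suggestion above concrete, here is a minimal sketch of replaying a captured input with a fixed seed in a plain unit test. Everything in it is an assumption for illustration: `toy_balance` is a hypothetical greedy stand-in, not Redpanda's leader balancer, and the JSON snapshot layout (`seed`, `nodes`, `leaders`) is invented.

```python
import json
import random
from collections import Counter
from typing import Dict, List


def imbalance(leaders: Dict[str, int], nodes: List[int]) -> int:
    """Difference between the most and least loaded node (leader count)."""
    counts = Counter({n: 0 for n in nodes})
    counts.update(leaders.values())
    return max(counts.values()) - min(counts.values())


def toy_balance(leaders: Dict[str, int], nodes: List[int], seed: int,
                max_steps: int = 1000) -> Dict[str, int]:
    """Hypothetical stand-in balancer: move one randomly chosen leader
    from the busiest node to the idlest node until counts differ by <= 1."""
    rng = random.Random(seed)  # all randomness comes from the captured seed
    leaders = dict(leaders)
    for _ in range(max_steps):
        counts = Counter({n: 0 for n in nodes})
        counts.update(leaders.values())
        busiest = max(nodes, key=lambda n: counts[n])
        idlest = min(nodes, key=lambda n: counts[n])
        if counts[busiest] - counts[idlest] <= 1:
            break
        candidates = sorted(p for p, n in leaders.items() if n == busiest)
        leaders[rng.choice(candidates)] = idlest
    return leaders


def replay(snapshot_json: str) -> Dict[str, int]:
    """Re-run the balancer on a captured snapshot (assumed JSON layout)."""
    snap = json.loads(snapshot_json)
    return toy_balance(snap["leaders"], snap["nodes"], snap["seed"])


def test_replay_captured_case() -> None:
    # In practice this string would be dumped by the failing ducktape run.
    snapshot = json.dumps({
        "seed": 42,
        "nodes": [0, 1, 2],
        "leaders": {f"topic-a/{i}": 0 for i in range(6)},
    })
    first = replay(snapshot)
    second = replay(snapshot)
    assert first == second                    # same seed, same outcome
    assert imbalance(first, [0, 1, 2]) <= 1   # balancer converged
```

With the seed and input pinned, a case that gets stuck would fail the same way on every run, which is what makes it practical to iterate on without a cluster.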
5 new occurrences: FAIL test: MultiTopicAutomaticLeadershipBalancingTest.test_topic_aware_rebalance (5/31 runs)
Yeah there are a lot of these occurrences. I think now there's Michal, Ben and Brandon possibly involved here -- please align so there's no duplicated effort -- really appreciate folks jumping on the most-failing test.
Using progress tracking and an unbounded timeout to track topic-aware rebalancing. Instead of using a particular timeout value, we verify whether topic-aware rebalancing is making progress. If so, we give the test time to pass. Fixes: redpanda-data#11454 Signed-off-by: Michal Maslanka <[email protected]>
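The idea in this commit message can be sketched roughly as follows. This is not the code from the actual change: `wait_while_progressing`, `get_score`, `is_done`, and `stall_timeout_s` are hypothetical names, and the "score" is assumed to be any number that shrinks as leadership gets closer to balanced.

```python
import time
from typing import Callable


def wait_while_progressing(
    get_score: Callable[[], float],
    is_done: Callable[[], bool],
    stall_timeout_s: float = 30.0,
    poll_interval_s: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Wait until is_done() is true, with no overall deadline.

    The wait only fails if the score (lower is better) has not improved
    for more than stall_timeout_s seconds.
    """
    best = get_score()
    last_progress = clock()
    while not is_done():
        sleep(poll_interval_s)
        score = get_score()
        if score < best:
            # Progress was made: remember it and reset the stall timer.
            best = score
            last_progress = clock()
        elif clock() - last_progress > stall_timeout_s:
            raise TimeoutError(
                f"no progress for {stall_timeout_s}s (best score {best})"
            )
```

The trade-off is that a slow but steadily improving run is allowed to finish, while a run that stops improving still fails after a bounded stall period.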
FAIL test: MultiTopicAutomaticLeadershipBalancingTest.test_topic_aware_rebalance (4/13 runs)
I think this should be fixed by #17145. Let's reopen if the issue persists after the fix.
https://buildkite.com/redpanda/redpanda/builds/31317#0188bb7e-46df-41db-a768-aebe518f818b/363-5766
Failure mode is different from #11044: