HBASE-28413 Fix race condition in TestCleanerChore.retriesIOExceptionInStatus #5735
base: master
…InStatus

We occasionally get a test failure in TestCleanerChore.retriesIOExceptionInStatus. For example, from a recent PR build [0] on branch-2.6,

```
java.util.concurrent.ExecutionException: java.io.IOException: whomp whomp.
	at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:395)
	at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1999)
	at org.apache.hadoop.hbase.master.cleaner.TestCleanerChore.retriesIOExceptionInStatus(TestCleanerChore.java:163)
	...
Caused by: java.io.IOException: whomp whomp.
	at org.apache.hadoop.hbase.master.cleaner.TestCleanerChore$1.listStatus(TestCleanerChore.java:134)
	at org.apache.hadoop.hbase.master.cleaner.CleanerChore.traverseAndDelete(CleanerChore.java:475)
	at org.apache.hadoop.hbase.master.cleaner.CleanerChore.lambda$chore$0(CleanerChore.java:258)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	... 1 more
```

This looks like a race condition where the chore manages an entire execution between when the flag is flipped and when the test thread gets back around to continuing execution. Make the test a little more pessimistic about its view of the world.

[0]: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5725/1/testReport/org.apache.hadoop.hbase.master.cleaner/TestCleanerChore/precommit_checks___yetus_jdk11_hadoop3_checks___retriesIOExceptionInStatus/
Relates to test change introduced in #4730.
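For readers who want a concrete picture of what "more pessimistic" can mean here, the following is a minimal, self-contained sketch. It is not the change made in this PR and it does not use any HBase classes. `PessimisticWaitSketch`, `listStatus`, `triggerRun`, and `awaitSuccessfulRun` are hypothetical stand-ins for the test's failing filesystem call and for triggering a chore run. The assumption is that a run observed around the moment the flag flips may still report the old `IOException`, so the waiting side treats such a failure as "not yet" and retries until a deadline instead of asserting on the first result.

```java
import java.io.IOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicBoolean;

public class PessimisticWaitSketch {

  // Hypothetical stand-in for the test's filesystem wrapper: fails while the flag is set.
  static boolean listStatus(AtomicBoolean fails) throws IOException {
    if (fails.get()) {
      throw new IOException("whomp whomp.");
    }
    return true;
  }

  // Hypothetical stand-in for triggering one chore run on a worker pool.
  static CompletableFuture<Boolean> triggerRun(ExecutorService pool, AtomicBoolean fails) {
    CompletableFuture<Boolean> future = new CompletableFuture<>();
    pool.execute(() -> {
      try {
        future.complete(listStatus(fails));
      } catch (IOException e) {
        future.completeExceptionally(e);
      }
    });
    return future;
  }

  // Keep triggering runs until one succeeds or the deadline passes.
  // A failed or slow run is treated as "not yet" rather than as a test failure.
  static boolean awaitSuccessfulRun(ExecutorService pool, AtomicBoolean fails, long timeoutMs)
      throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (System.nanoTime() < deadline) {
      try {
        if (triggerRun(pool, fails).get(1, TimeUnit.SECONDS)) {
          return true;
        }
      } catch (ExecutionException | TimeoutException e) {
        // The run may have observed the flag before it was flipped; retry.
      }
      Thread.sleep(10);
    }
    return false;
  }

  // Flips the flag from another thread while the waiter is already polling.
  public static boolean demo() throws Exception {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      AtomicBoolean fails = new AtomicBoolean(true);
      Thread flipper = new Thread(() -> {
        try {
          Thread.sleep(50);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
        fails.set(false); // "filesystem is back"
      });
      flipper.start();
      boolean ok = awaitSuccessfulRun(pool, fails, 5_000);
      flipper.join();
      return ok;
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("eventually succeeded: " + demo());
  }
}
```

The design choice in this sketch is that the waiter never assumes any single run reflects the flipped flag; it only asserts that some run eventually succeeds within the timeout.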
hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestCleanerChore.java
This comment was marked as outdated.
🎊 +1 overall
This message was automatically generated.
💔 -1 overall
This message was automatically generated.
🎊 +1 overall
This message was automatically generated.
🎊 +1 overall
This message was automatically generated.
Is the test failing in the jdk17 build?
I think that my fix is incomplete. Or there's something else disrupting the test harness. The test was terminated, interrupting the chore thread, while the waiter loop had only sat for 5 seconds.
In fact, Jenkins doesn't tell us which of these failed runs produced the output shown in the unit test summary.
There is still a race.