-
Notifications
You must be signed in to change notification settings - Fork 7.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ZOOKEEPER-3037 - Add JVMPauseMonitor #904
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this code taken from HBase or another ASF project?
Yes, JVMPauseMonitor is taken out of Hadoop Common |
ok, I think that we should cite it in the header, like: even if it is an ASF project |
anyway this is a great feature, thanks ! |
@nkalmar Please add some description to the pull request or link the Jira ticket. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the code looks good to me, some comments on the testing side:
- Currently the tests are not unit tests: they're testing multiple things in a single method. We need to split them.
- Please don't use
Thread.sleep()
in unit tests, - The test file is called
JvmPauseMonitorTest
, but it's not testing the class that is named of. It tests the configuration parsing logic. - Please categorise your unit tests into separate classes based on the single piece of logic that you're testing, e.g. split them into QuorumPeerMainTest, QuorumPeerConfigTest, ServerConfigTest, etc.
- Tests in
JvmPauseMonitorTest
should validate and only validate code inJvmPauseMonitor
and its dependencies nothing else. - Similarly when you test QuorumPeerMain whether it uses JvmPauseMonitor correctly, you should mock JvmPauseMonitor and verify the interaction with mocked methods.
I know that this code has been borrowed from the HBase project, but I'd like to improve the testing side as much as we can without too many modification in the original logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome.
I left some comments about the test
zookeeper-server/src/test/java/org/apache/zookeeper/server/util/JvmPauseMonitorTest.java
Show resolved
Hide resolved
Assert.assertEquals(sleepTime, Long.valueOf(pauseMonitor.sleepTimeMs)); | ||
Assert.assertEquals(infoTH, Long.valueOf(pauseMonitor.infoThresholdMs)); | ||
|
||
Thread.sleep(1000); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is better to have a loop and wait at most X seconds for the condition to be true.
This way we will reduce flakyness
Thanks guys, refactored the unit tests. While I'm still using Thread.sleep(), the intention has changed. I think this is OK, but let me know if you have another solution to wait until a condition becomes true. (I know there are 3rd party libraries but I didn't want to introduce a dependency) edit: I could just have an empty while loop or pull the value from JvmPauseMonitor... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking a lot better now. Just a few more nitpicks please.
} | ||
} | ||
|
||
String ret = "Detected pause in JVM or host machine (eg GC): pause of approximately " + extraSleepTime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use String.format
instead?
final Long warnTH = 5555L; | ||
final Long infoTH = 555L; | ||
|
||
QuorumPeerConfig quorumPeerConfig = new MockQuorumPeerConfig(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't use the "mocked" version here, it's not needed.
Assert.assertEquals(warnTH, Long.valueOf(pauseMonitor.warnThresholdMs)); | ||
Assert.assertEquals(infoTH, Long.valueOf(pauseMonitor.infoThresholdMs)); | ||
|
||
while(pauseMonitor.getNumGcInfoThresholdExceeded() == 0 && pauseMonitor.getNumGcWarnThresholdExceeded() == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might want to ||
here instead of &&
.
monitorThread.start(); | ||
} | ||
|
||
public void serviceStop() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method is only used in the test currently. Please add it to shutdown()
methods of ZooKeeperServer
and QuorumPeer
classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 lgtm
retest this please |
retest maven build |
Merged to master. Thanks @nkalmar ! |
https://issues.apache.org/jira/browse/ZOOKEEPER-3037 Author: Norbert Kalmar <[email protected]> Reviewers: [email protected] Closes apache#904 from nkalmar/ZOOKEEPER-3037 and squashes the following commits: a610532 [Norbert Kalmar] ZOOKEEPER-3037 - add serviceStop() and improve unit tests 7d0baaa [Norbert Kalmar] ZOOKEEPER-3037 - refactor unit tests 97d2c61 [Norbert Kalmar] ZOOKEEPER-3037 - cite hadoop-common as source 3661389 [Norbert Kalmar] ZOOKEEPER-3037 - Add unit test and various improvements f309757 [Norbert Kalmar] ZOOKEEPER-3037 - Add JvmPauseMonitor (cherry picked from commit e9adf6e)
Backporting https://issues.apache.org/jira/browse/ZOOKEEPER-3037 from branch-3.6 to branch-3.5. Author: Norbert Kalmar <nkalmaryahoo.com> Reviewers: andorapache.org Closes #904 from nkalmar/ZOOKEEPER-3037 and squashes the following commits: a610532 [Norbert Kalmar] ZOOKEEPER-3037 - add serviceStop() and improve unit tests 7d0baaa [Norbert Kalmar] ZOOKEEPER-3037 - refactor unit tests 97d2c61 [Norbert Kalmar] ZOOKEEPER-3037 - cite hadoop-common as source 3661389 [Norbert Kalmar] ZOOKEEPER-3037 - Add unit test and various improvements f309757 [Norbert Kalmar] ZOOKEEPER-3037 - Add JvmPauseMonitor (cherry picked from commit e9adf6e) Author: Norbert Kalmar <[email protected]> Reviewers: Enrico Olivelli <[email protected]>, Mate Szalay-Beko <[email protected]>, Andor Molnar <[email protected]> Closes #1594 from symat/ZOOKEEPER-3037-branch-3.5
https://issues.apache.org/jira/browse/ZOOKEEPER-3037 Author: Norbert Kalmar <[email protected]> Reviewers: [email protected] Closes apache#904 from nkalmar/ZOOKEEPER-3037 and squashes the following commits: a610532 [Norbert Kalmar] ZOOKEEPER-3037 - add serviceStop() and improve unit tests 7d0baaa [Norbert Kalmar] ZOOKEEPER-3037 - refactor unit tests 97d2c61 [Norbert Kalmar] ZOOKEEPER-3037 - cite hadoop-common as source 3661389 [Norbert Kalmar] ZOOKEEPER-3037 - Add unit test and various improvements f309757 [Norbert Kalmar] ZOOKEEPER-3037 - Add JvmPauseMonitor
https://issues.apache.org/jira/browse/ZOOKEEPER-3037 Author: Norbert Kalmar <[email protected]> Reviewers: [email protected] Closes apache#904 from nkalmar/ZOOKEEPER-3037 and squashes the following commits: a610532 [Norbert Kalmar] ZOOKEEPER-3037 - add serviceStop() and improve unit tests 7d0baaa [Norbert Kalmar] ZOOKEEPER-3037 - refactor unit tests 97d2c61 [Norbert Kalmar] ZOOKEEPER-3037 - cite hadoop-common as source 3661389 [Norbert Kalmar] ZOOKEEPER-3037 - Add unit test and various improvements f309757 [Norbert Kalmar] ZOOKEEPER-3037 - Add JvmPauseMonitor (cherry picked from commit e9adf6e) Change-Id: I7654fc85168bf163bdbbe96ebf73cb9ccf265e81
https://issues.apache.org/jira/browse/ZOOKEEPER-3037