HBASE-24265 Remove hedged rpc call support, implement the logic in MaterRegistry … #1593

Merged
merged 4 commits into from
May 6, 2020

Conversation

Apache9
Contributor

@Apache9 Apache9 commented Apr 27, 2020

…directly

@Apache9 Apache9 requested review from saintstack and bharathv April 27, 2020 10:10
@Apache9 Apache9 self-assigned this Apr 27, 2020
@Apache9
Contributor Author

Apache9 commented Apr 27, 2020

The function here is like region replica, so I do not think we should implement it in the rpc framework; the preferred way in HBase is to implement a special RpcRetryingCaller.

Anyway, MasterRegistry is a bit strange: it is the root of the client implementation, so we cannot make use of the existing rpc retrying caller. Still, implementing the hedge logic in MasterRegistry is easier than implementing it in the rpc framework.

And I dropped the fan out parameter. Usually we will only have 2 masters, and even if we have more, I do not think it makes much difference. So I only support two options: send the requests one by one, or send them all at the same time.

And since the logic is now in MasterRegistry, both rpc client implementations can work with MasterRegistry.

PTAL. @saintstack @bharathv
Thanks.
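
For readers following along, the two options described in this comment (one by one, or all at the same time) can be pictured with a minimal sketch. This is not the code from the patch: the class name `TwoModeHedgeSketch`, the method names, and the use of `Supplier<CompletableFuture<T>>` to stand for "send this request to one master" are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch of the two modes; not the MasterRegistry code from the patch. */
final class TwoModeHedgeSketch {

  /** Sends every request at once; the first success wins, and it fails only if all fail. */
  static <T> CompletableFuture<T> allAtOnce(List<Supplier<CompletableFuture<T>>> calls) {
    CompletableFuture<T> result = new CompletableFuture<>();
    if (calls.isEmpty()) {
      result.completeExceptionally(new IllegalArgumentException("no endpoints"));
      return result;
    }
    AtomicInteger failuresLeft = new AtomicInteger(calls.size());
    for (Supplier<CompletableFuture<T>> call : calls) {
      call.get().whenComplete((value, error) -> {
        if (error == null) {
          result.complete(value); // no-op if another endpoint already answered
        } else if (failuresLeft.decrementAndGet() == 0) {
          result.completeExceptionally(error); // every endpoint failed
        }
      });
    }
    return result;
  }

  /** Sends the requests one by one; moves to the next endpoint only after a failure. */
  static <T> CompletableFuture<T> oneByOne(List<Supplier<CompletableFuture<T>>> calls) {
    CompletableFuture<T> result = new CompletableFuture<>();
    tryNext(calls, 0, new IllegalArgumentException("no endpoints"), result);
    return result;
  }

  private static <T> void tryNext(List<Supplier<CompletableFuture<T>>> calls, int index,
      Throwable lastError, CompletableFuture<T> result) {
    if (index == calls.size()) {
      result.completeExceptionally(lastError);
      return;
    }
    calls.get(index).get().whenComplete((value, error) -> {
      if (error == null) {
        result.complete(value);
      } else {
        tryNext(calls, index + 1, error, result); // chained from the callback, no blocking wait
      }
    });
  }

  public static void main(String[] args) {
    CompletableFuture<String> down = new CompletableFuture<>();
    down.completeExceptionally(new RuntimeException("master1 unreachable"));
    List<Supplier<CompletableFuture<String>>> calls = Arrays.asList(
        () -> down,
        () -> CompletableFuture.completedFuture("reply from master2"));
    System.out.println(allAtOnce(calls).join());
    System.out.println(oneByOne(calls).join());
  }
}
```

In this toy run both modes print "reply from master2"; the difference is only in how many endpoints get contacted before an answer arrives.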

@Apache9 Apache9 changed the title Remove hedged rpc call support, implement the logic in MaterRegistry … HBASE-24265 Remove hedged rpc call support, implement the logic in MaterRegistry … Apr 27, 2020
@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 28s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
_ master Compile Tests _
+0 🆗 mvndep 0m 39s Maven dependency ordering for branch
+1 💚 mvninstall 5m 11s master passed
+1 💚 checkstyle 2m 34s master passed
+1 💚 spotbugs 4m 52s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 14s Maven dependency ordering for patch
+1 💚 mvninstall 3m 55s the patch passed
+1 💚 checkstyle 2m 12s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 hadoopcheck 13m 30s Patch does not cause any errors with Hadoop 3.1.2 3.2.1.
+1 💚 spotbugs 5m 50s the patch passed
_ Other Tests _
+1 💚 asflicense 0m 36s The patch does not generate ASF License warnings.
49m 58s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #1593
Optional Tests dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle
uname Linux 46fe4994150a 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 84f2e95
Max. process+thread count 84 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 31s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 32s Maven dependency ordering for branch
+1 💚 mvninstall 4m 7s master passed
+1 💚 compile 1m 58s master passed
+1 💚 shadedjars 5m 9s branch has no errors when building our shaded downstream artifacts.
-0 ⚠️ javadoc 0m 19s hbase-common in master failed.
-0 ⚠️ javadoc 0m 28s hbase-client in master failed.
-0 ⚠️ javadoc 0m 40s hbase-server in master failed.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 16s Maven dependency ordering for patch
+1 💚 mvninstall 3m 48s the patch passed
+1 💚 compile 1m 57s the patch passed
+1 💚 javac 1m 57s the patch passed
+1 💚 shadedjars 5m 12s patch has no errors when building our shaded downstream artifacts.
-0 ⚠️ javadoc 0m 18s hbase-common in the patch failed.
-0 ⚠️ javadoc 0m 26s hbase-client in the patch failed.
-0 ⚠️ javadoc 0m 38s hbase-server in the patch failed.
_ Other Tests _
+1 💚 unit 1m 35s hbase-common in the patch passed.
+1 💚 unit 1m 5s hbase-client in the patch passed.
-1 ❌ unit 130m 16s hbase-server in the patch failed.
162m 3s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #1593
Optional Tests javac javadoc unit shadedjars compile
uname Linux c28062a52037 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 84f2e95
Default Java 2020-01-14
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-common.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-common.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt
unit https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-server.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/testReport/
Max. process+thread count 3970 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f)
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 21s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 30s Maven dependency ordering for branch
+1 💚 mvninstall 3m 50s master passed
+1 💚 compile 1m 42s master passed
+1 💚 shadedjars 5m 26s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 18s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 13s Maven dependency ordering for patch
+1 💚 mvninstall 3m 44s the patch passed
+1 💚 compile 1m 42s the patch passed
+1 💚 javac 1m 42s the patch passed
+1 💚 shadedjars 5m 27s patch has no errors when building our shaded downstream artifacts.
-0 ⚠️ javadoc 0m 21s hbase-client generated 2 new + 2 unchanged - 0 fixed = 4 total (was 2)
_ Other Tests _
+1 💚 unit 1m 37s hbase-common in the patch passed.
+1 💚 unit 1m 7s hbase-client in the patch passed.
-1 ❌ unit 198m 13s hbase-server in the patch failed.
229m 42s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile
GITHUB PR #1593
Optional Tests javac javadoc unit shadedjars compile
uname Linux 0e2e489f7973 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 84f2e95
Default Java 1.8.0_232
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/artifact/yetus-jdk8-hadoop3-check/output/diff-javadoc-javadoc-hbase-client.txt
unit https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-server.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/testReport/
Max. process+thread count 3680 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/2/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f)
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@bharathv
Contributor

Trying to understand the motivation for this refactor.

The function here is like region replica, so I do not think we should implement it in the rpc framework; the preferred way in HBase is to implement a special RpcRetryingCaller.

I think hedging RPC support is a standard RPC layer thing (for ex grpc: [1]). We can probably make a case to port region replica to hedging framework in the RPC layer. Fun fact, when I first implemented this as a prototype, I had all the logic in the master registry itself and then the review comments from @apurtell and @saintstack were to push it into a layer below, which is RPC and it made sense to me.

Also, anyone who wants to implement hedging has to write a lot of boilerplate code around synchronizing responses from multiple RPC threads, issuing cancellations, etc. It is not trivial, and why would anyone want to do that all over again?

There are some issues we can fix, like the one you noted in the jira about the usage of the common fork join pool (I'm trying to think why I did that; I initially implemented it like regular async code, but reverted it later). Let me think a little bit about why I did this and get back to you.

And I dropped the fan out parameter. Usually we will only have 2 masters, and even if we have more, I do not think it makes much difference. So I only support two options: send the requests one by one, or send them all at the same time.

We use > 5 masters for redundancy in our critical production deployments. If we were to use this, we definitely don't want to spam every master for each of the request. -1 on removing it unless you have a strong counter argument.

[1] https://github.com/grpc/proposal/blob/master/A6-client-retries.md#hedging-policy

Contributor

@bharathv bharathv left a comment

Just realized this work was triggered by HBASE-24264. How did you narrow it down to testHedgedAsyncEcho? There is not much detail in the jira/commit. Any jstacks? Curious what was hanging in the test and if we can de-flake it.

@ndimiduk
Member

I think >2 masters is a common deployment pattern. Typical ZooKeeper and HDFS HA deploy requires three coordinator hosts minimum, so it makes sense to deploy 3 masters across these same hosts as well.

@saintstack
Contributor

To be clear, I'm good with disabling a flakey test till it gets love. That doesn't mean I'm in favor of a purge of this hedging facility.

@Apache9
Contributor Author

Apache9 commented Apr 27, 2020

Things are different in HBase. HBase is a rich client; you can see the implementation of the rpc retrying caller. Typically, in HBase, an rpc retry will lead to clearing the meta location cache and then relocating, or fetching the new masters.

For the case here, you have a test function to verify whether the returned protobuf message is valid, right? In your old code, if one endpoint returns an invalid protobuf message, the rpc framework does not consider it incorrect; it returns the message to the upper layer, which leads to a failure even if other endpoints may give you the correct result. If you implement the logic in MasterRegistry directly, you are free to wait for the responses from the other masters as well.
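
A minimal sketch of that point, assuming the requests have already been sent and each reply is represented by a `CompletableFuture`: an invalid reply is treated like a failure, so the caller keeps waiting for the other masters. The class `ValidatingHedgeSketch` and the `isValid` predicate are hypothetical names, not identifiers from the patch, and the predicate is assumed not to throw.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

/** Hypothetical sketch: the caller, not the rpc layer, decides which reply is acceptable. */
final class ValidatingHedgeSketch {

  /** Completes with the first reply that passes isValid; fails once every reply is unusable. */
  static <T> CompletableFuture<T> firstValid(List<CompletableFuture<T>> replies,
      Predicate<T> isValid) {
    CompletableFuture<T> result = new CompletableFuture<>();
    if (replies.isEmpty()) {
      result.completeExceptionally(new IllegalArgumentException("no endpoints"));
      return result;
    }
    AtomicInteger pending = new AtomicInteger(replies.size());
    for (CompletableFuture<T> reply : replies) {
      reply.whenComplete((value, error) -> {
        if (error == null && isValid.test(value)) {
          result.complete(value);
        } else if (pending.decrementAndGet() == 0) {
          // Either an rpc error or an invalid message; both count as "this endpoint failed".
          result.completeExceptionally(error != null ? error
              : new IllegalStateException("no endpoint returned a valid reply"));
        }
      });
    }
    return result;
  }

  public static void main(String[] args) {
    CompletableFuture<String> garbled = CompletableFuture.completedFuture("");
    CompletableFuture<String> healthy = CompletableFuture.completedFuture("meta at rs1:16020");
    String reply = firstValid(Arrays.asList(garbled, healthy), s -> !s.isEmpty()).join();
    System.out.println(reply); // prints "meta at rs1:16020"
  }
}
```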

And implementing hedge read is not easy work. As I mentioned on the JIRA issue, the implementation is not fully 'async': it just executes a blocking operation in a thread pool, and it uses a Java anti-pattern, executing a critical task in the common pool, which may lead to unpredictable deadlocks.
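
To illustrate the common-pool concern in general terms (this is not a reconstruction of HedgedRpcChannel): `CompletableFuture.supplyAsync` without an executor argument normally runs the task on `ForkJoinPool.commonPool()`, which is shared by the whole JVM, so a blocking task there competes with unrelated work. The sketch below contrasts that with a dedicated pool; `DedicatedPoolSketch`, `newHedgePool`, and the thread name prefix are hypothetical. A fully async design would avoid parking a thread on the call at all, which is the stronger point made above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch of where a blocking task ends up running. */
final class DedicatedPoolSketch {

  /** Risky: with no executor argument, supplyAsync normally runs on ForkJoinPool.commonPool(). */
  static <T> CompletableFuture<T> onCommonPool(Supplier<T> blockingCall) {
    return CompletableFuture.supplyAsync(blockingCall);
  }

  /** Isolated: the blocking work runs on threads owned by this component only. */
  static <T> CompletableFuture<T> onDedicatedPool(Supplier<T> blockingCall, ExecutorService pool) {
    return CompletableFuture.supplyAsync(blockingCall, pool);
  }

  /** Fixed pool of named daemon threads so the JVM can still exit. */
  static ExecutorService newHedgePool(int threads) {
    AtomicInteger id = new AtomicInteger();
    return Executors.newFixedThreadPool(threads, runnable -> {
      Thread thread = new Thread(runnable, "hedge-worker-" + id.incrementAndGet());
      thread.setDaemon(true);
      return thread;
    });
  }

  public static void main(String[] args) {
    ExecutorService pool = newHedgePool(2);
    Supplier<String> whoRanMe = () -> Thread.currentThread().getName();
    System.out.println(onCommonPool(whoRanMe).join());
    System.out.println(onDedicatedPool(whoRanMe, pool).join());
    pool.shutdown();
  }
}
```

The second printed name belongs to a `hedge-worker-` thread, while the first depends on the JVM's shared pool.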

That GRPC has hedge read support does not mean we have to support the feature. Even if we made use of GRPC directly, as said above, I do not think we would use its hedge read feature directly. Different scenarios.

@Apache9
Contributor Author

Apache9 commented Apr 27, 2020

And if you guys really want to implement the hedge read support, please open a new issue for it, and implement the feature in both rpc client implementations, and make it suitable for other operations as well, not only for MasterRegistry. For example, support the read replica feature.

Right now the implementation was pulled in as a side effect of MasterRegistry, and it lacks lots of features. For example, it only supports NettyRpcClient, does not have built-in retry support, etc.

To be clear, it is not a good idea to expose a feature which may have a long term impact while implementing another feature. Developers may accidentally use it, and then it becomes a cancer in the code since it was not in good shape at the beginning. I spent a lot of time purging the old stale code in the sync client, both in the rpc implementation and the old retrying caller, especially the big, big AsyncProcess class. I do not want to do it again in the future.

Thanks.

@Apache9
Contributor Author

Apache9 commented Apr 28, 2020

I think >2 masters is a common deployment pattern. Typical ZooKeeper and HDFS HA deploy requires three coordinator hosts minimum, so it makes sense to deploy 3 masters across these same hosts as well.

It does not make much difference with 2 or 3 masters. If we have 10 masters then maybe it is worth implementing the 'fan out limit' feature, but the code will also be harder to understand: think of a mix of the current 'sending all requests concurrently' and 'sending requests one by one'. Notice that a blocking wait is not allowed here, so...
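
One way to picture such a mix without a blocking wait is a sketch along these lines, where up to `fanOut` requests go out together and the callback of the last failure in a batch launches the next batch. This is an illustration under assumptions, not the code that was later added to the patch: `FanOutHedgeSketch`, `call`, and `sendBatch` are hypothetical names, and each `Supplier` stands for "send the request to one master".

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch of a fan-out limit: batches of requests, chained by callbacks. */
final class FanOutHedgeSketch {

  /** fanOut = 1 degenerates to "one by one"; fanOut >= endpoints.size() to "all at once". */
  static <T> CompletableFuture<T> call(List<Supplier<CompletableFuture<T>>> endpoints,
      int fanOut) {
    if (fanOut < 1) {
      throw new IllegalArgumentException("fanOut must be at least 1");
    }
    CompletableFuture<T> result = new CompletableFuture<>();
    sendBatch(endpoints, 0, fanOut, new IllegalArgumentException("no endpoints"), result);
    return result;
  }

  private static <T> void sendBatch(List<Supplier<CompletableFuture<T>>> endpoints, int start,
      int fanOut, Throwable lastError, CompletableFuture<T> result) {
    if (start >= endpoints.size()) {
      result.completeExceptionally(lastError); // every endpoint has been tried
      return;
    }
    int end = Math.min(start + fanOut, endpoints.size());
    AtomicInteger pending = new AtomicInteger(end - start);
    for (int i = start; i < end; i++) {
      endpoints.get(i).get().whenComplete((value, error) -> {
        if (error == null) {
          result.complete(value); // first success wins; later batches are never sent
        } else if (pending.decrementAndGet() == 0) {
          // The last failure of this batch starts the next batch; no thread waits.
          sendBatch(endpoints, end, fanOut, error, result);
        }
      });
    }
  }

  public static void main(String[] args) {
    CompletableFuture<String> down = new CompletableFuture<>();
    down.completeExceptionally(new RuntimeException("unreachable"));
    List<Supplier<CompletableFuture<String>>> endpoints = List.of(
        () -> down, () -> down, () -> CompletableFuture.completedFuture("reply from master3"));
    System.out.println(call(endpoints, 2).join()); // prints "reply from master3"
  }
}
```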

@Apache9
Contributor Author

Apache9 commented Apr 28, 2020

We use > 5 masters for redundancy in our critical production deployments. If we were to use this, we definitely don't want to spam every master for each of the request. -1 on removing it unless you have a strong counter argument.

OK, for me it is fine to add this back. I have the confidence to implement this feature correctly.

@Apache9
Contributor Author

Apache9 commented Apr 28, 2020

I think hedging RPC support is a standard RPC layer thing (for ex grpc: [1]). We can probably make a case to port region replica to hedging framework in the RPC layer. Fun fact, when I first implemented this as a prototype, I had all the logic in the master registry itself and then the review comments from @apurtell and @saintstack were to push it into a layer below, which is RPC and it made sense to me.

Most of the code in the current rpc-client and hbase-client was written by me, so I think I'm more familiar with these things in HBase. We have retries everywhere in hbase-client, but we just use the same RpcRetryingCaller wherever we have the same retrying logic, so the argument that everyone would need to implement the logic on their own if we do not implement hedge read in the rpc layer does not make sense.

The problem here is the design of the client library. The ConnectionRegistry is the root of everything; we must make it available before we actually create the RpcClient which is used across the whole client, so we cannot make use of any of the RpcRetryingCallers. But this is only a problem for the Registry implementation, not a common problem for the whole hbase client.

"hedgingPolicy": {
  "maxAttempts": 4,
  "hedgingDelay": "0.5s",
  "nonFatalStatusCodes": [
    "UNAVAILABLE",
    "INTERNAL",
    "ABORTED"
  ]
}

Back to the feature in grpc: these are the configs for the feature. Looking at the code in HBase, what do we do when there is an error? Do we just send the request again to the same endpoint? No! Definitely not. We need to check whether the endpoint is the correct place to go, and if not, we change to another endpoint.

@Apache9
Contributor Author

Apache9 commented Apr 28, 2020

I think what we want in 2.3.0 is MasterRegistry, not hedge read rpc, right? So my goal is to make MasterRegistry available first, and then we can start to think about the hedge read feature.

If you guys think we must have the hedge read feature in the rpc framework in 2.3.0, then I will veto releasing 2.3.0 with the current code base. Let's at least move the read replicas logic down to the rpc framework layer if we really want to do this (although I do not think this is necessary in HBase). Maybe this will take several months...

@Apache9
Contributor Author

Apache9 commented Apr 28, 2020

Just realized this work was triggered by HBASE-24264. How did you narrow it down to testHedgedAsyncEcho? There is not much detail in the jira/commit. Any jstacks? Curious what was hanging in the test and if we can de-flake it.

To be honest, when I looked at the implementation of HedgedRpcChannel I just gave up on finding out the root cause as the code is not real 'async'...

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 9m 28s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
_ master Compile Tests _
+0 🆗 mvndep 0m 34s Maven dependency ordering for branch
+1 💚 mvninstall 4m 8s master passed
+1 💚 checkstyle 2m 13s master passed
+1 💚 spotbugs 4m 0s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 11s Maven dependency ordering for patch
+1 💚 mvninstall 3m 54s the patch passed
+1 💚 checkstyle 2m 6s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 hadoopcheck 12m 47s Patch does not cause any errors with Hadoop 3.1.2 3.2.1.
+1 💚 spotbugs 5m 33s the patch passed
_ Other Tests _
+1 💚 asflicense 0m 40s The patch does not generate ASF License warnings.
54m 52s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #1593
Optional Tests dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle
uname Linux eca944a6f809 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 6eb5caf
Max. process+thread count 84 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 29s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 33s Maven dependency ordering for branch
+1 💚 mvninstall 4m 2s master passed
+1 💚 compile 1m 57s master passed
+1 💚 shadedjars 5m 14s branch has no errors when building our shaded downstream artifacts.
-0 ⚠️ javadoc 0m 19s hbase-common in master failed.
-0 ⚠️ javadoc 0m 24s hbase-client in master failed.
-0 ⚠️ javadoc 0m 39s hbase-server in master failed.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 15s Maven dependency ordering for patch
+1 💚 mvninstall 3m 50s the patch passed
+1 💚 compile 1m 56s the patch passed
+1 💚 javac 1m 56s the patch passed
+1 💚 shadedjars 5m 9s patch has no errors when building our shaded downstream artifacts.
-0 ⚠️ javadoc 0m 18s hbase-common in the patch failed.
-0 ⚠️ javadoc 0m 25s hbase-client in the patch failed.
-0 ⚠️ javadoc 0m 39s hbase-server in the patch failed.
_ Other Tests _
+1 💚 unit 1m 22s hbase-common in the patch passed.
+1 💚 unit 1m 6s hbase-client in the patch passed.
+1 💚 unit 117m 52s hbase-server in the patch passed.
149m 7s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #1593
Optional Tests javac javadoc unit shadedjars compile
uname Linux ecc243600dbf 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 6eb5caf
Default Java 2020-01-14
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-common.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-common.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/testReport/
Max. process+thread count 4724 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f)
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 9m 34s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 34s Maven dependency ordering for branch
+1 💚 mvninstall 4m 6s master passed
+1 💚 compile 1m 54s master passed
+1 💚 shadedjars 5m 45s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 23s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 13s Maven dependency ordering for patch
+1 💚 mvninstall 3m 48s the patch passed
+1 💚 compile 1m 50s the patch passed
+1 💚 javac 1m 50s the patch passed
+1 💚 shadedjars 5m 34s patch has no errors when building our shaded downstream artifacts.
-0 ⚠️ javadoc 0m 23s hbase-client generated 2 new + 2 unchanged - 0 fixed = 4 total (was 2)
_ Other Tests _
+1 💚 unit 1m 29s hbase-common in the patch passed.
+1 💚 unit 1m 12s hbase-client in the patch passed.
+1 💚 unit 189m 6s hbase-server in the patch passed.
230m 0s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile
GITHUB PR #1593
Optional Tests javac javadoc unit shadedjars compile
uname Linux a1e0bef62533 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 6eb5caf
Default Java 1.8.0_232
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/artifact/yetus-jdk8-hadoop3-check/output/diff-javadoc-javadoc-hbase-client.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/testReport/
Max. process+thread count 3491 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/3/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f)
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@bharathv
Contributor

bharathv commented Apr 29, 2020

Okay, let me first summarize your beef with the current approach, for my own understanding and for those following so that they can weigh in too. Your main concerns are as follows. I'll try to answer each one separately.

  1. Implementation is not real async, uses a background driver thread
  2. Lack of retries support built into the channel.
  3. Implementation doesn't support both blocking and non-blocking RPC implementations
  4. Read replicas was not ported to this feature, if we want to go with this approach, port read replicas too (or else you'll veto).
  5. HBase client implementation is not like GRPC, so it may not make sense to have hedging support in HBase just because GRPC did it.

=====

  1. This is true. Maybe I was a little lazy and missed it when I implemented this. I quickly put together what I think is the async version you are looking for. It's here, [DRAFT] Make HedgedRpcChannel real async #1601, if you want to take a look.
    It doesn't use a dedicated thread anymore and relies on the last failed RPC thread of a previous batch to invoke the next batch in the same thread. Just FYI, the patch is raw and could be cleaned up; I quickly put it together so that you have something to look at. Also, the issue with implementing it as a channel is that we don't have the flexibility of returning CompletableFutures (due to the protobuf interfaces). That makes the code look a little unstructured, but I think we can back it up with comments. Let me know what you think.

  2. I already clarified it in the design and class comments. Hedging is orthogonal to retries. You probably don't want to mix them. Also, the implementation adds a "Channel", and it doesn't make sense to have retries at the Channel level. That said, nothing stops us from having retries on the client side, just like your code in AsyncRpcRetryingCaller...

  3. I actually disagree with you on this. I don't see why we should support BlockingRpcClient anymore. Async client is the current Java way of doing things and it has feature parity with Blocking client implementation and is the default. Now imagine implementing hedging on a blocking channel, we will have to do all sorts of gymnastics with thread pools (which you probably don't want, given your concern 1). If we end up doing this for a blocking client, it is going to be ton of code (and effort) that no one uses. Do we really want to do that?

  4. Well, I don't know how to answer this. I didn't do it because of lack of time. This set of patches was committed probably a month or two back, and since then I have been busy. I can attempt it, but I cannot guarantee that I'll finish it in the 2.3.0 time frame given my other commitments. Also, I don't fully understand the concern here. Just because one feature was not ported to a new framework doesn't mean we should totally get rid of it.

  5. Of course HBase is not like GRPC. I was just trying to give an example. The analogy there was that HBase also has an abstraction of rpc stack and client implementation relying on rpc stack. So I think it makes sense to have hedging as a feature in the rpc stack.

@Apache9
Contributor Author

Apache9 commented Apr 29, 2020

Well, all rpc frameworks have a dream of implementing everything. But unfortunately, this is an rpc framework only for HBase, not for everyone. So the final judgement on whether or not to implement a feature in the rpc framework is whether it introduces more complexity in the code.

The motivation here is to support hedge read in MasterRegistry, and you can see the patch, "+186 -629", I just added 186 lines but removed 629 lines to just support the same function. OK there are about 100 lines of test code removed, any way, it is still about 200 vs. 500.

So I think for supporting MasterRegistry, implementing it in MasterRegistry directly is much better than implementing it in rpc. And the reason is very easy to understand: we do not need the extra abstraction...

On other usage of hedged reads, the first thing is the read replica feature. If you can prove that, you can use less code to support same feature in rpc framework than RpcRetryingCaller, then I'm OK with implementing the hedged reads feature in rpc.

So my suggestion here is still, remove the hedged reads feature to not block the 2.3.0 release, and open another issue for implementing it, as it has much more long term impact than the MasterRegistry feature.

Thanks.

@Apache9
Contributor Author

Apache9 commented Apr 29, 2020

For your statement on BlockingRpcClient, well, I'm fine with removing it as it is outdated, which still uses the tech from the 1990s(or even 1980s?)

But I still need to say: as netty is widely used, many developers are no longer familiar with the terms 'NIO' and 'BIO'. The 'Blocking' in its name does not mean the rpc implementation is blocking; it just means the socket is in blocking mode. In netty there is a rarely used Channel type called 'OIO', and it is the same thing here. Obviously, you can use netty OIO to write async code, so in HBase you can also use the BlockingRpcClient to implement async code. As you can see, our async hbase client can run on both NettyRpcClient and BlockingRpcClient.

Thanks.

@Apache9
Contributor Author

Apache9 commented Apr 29, 2020

Oh, forgot to say, in the newest patch I implemented the fan out limit feature.

@bharathv
Contributor

bharathv commented May 1, 2020

On other usage of hedged reads, the first thing is the read replica feature. If you can prove that, you can use less code to support same feature in rpc framework than RpcRetryingCaller, then I'm OK with implementing the hedged reads feature in rpc.

Ok, let me read the code for read replicas. I had a brief look at it just now and I see that the timeline reads are very much intertwined with AsyncSingleRequestRpcRetryingCaller (which is common for all gets/puts/...etc). We need to separate it out for hedging gets and I think we can port it to use hedging rpc framework but I don't know about the number of lines, it may be more or less. Anyway, I don't think number of lines is always the right metric to go by.

So my suggestion here is still, remove the hedged reads feature to not block the 2.3.0 release, and open another issue for implementing it, as it has much more long term impact than the MasterRegistry feature.

Well if you already made your decision, I don't think I can do much to change it. The least we could do is to fix it to make it actually async (which is #1601). If you still have other concerns, I think we'd end up blocking 2.3.0 (which is very close AFAICT, based on jiras from @ndimiduk). Let's see what other reviewers think? Thanks.

@Apache9
Contributor Author

Apache9 commented May 2, 2020

Ping @ndimiduk and @saintstack, what do you guys think? I've replied on #1601; at least the current implementation does not make sense to me, as there is no reason why we can only support NettyRpcClient.

FWIW, I do not see any advantage in moving the logic from AsyncRpcRetryingCaller down to the rpc framework, except that the MasterRegistry can make use of it. But implementing the logic in MasterRegistry only requires a very small amount of code, and can do better error handling than implementing it in the rpc framework, so I do not think it is worth it.

Thanks.

@bharathv
Contributor

bharathv commented May 3, 2020

Ping @ndimiduk and @saintstack, what do you guys think? I've replied on #1601; at least the current implementation does not make sense to me, as there is no reason why we can only support NettyRpcClient.

Your comments on #1601 make sense to me. Thanks for taking a look. It was a misunderstanding on my part that the BlockingRpcConnection implementation always defaulted to a blocking channel implementation. I took a closer look at the code after your comments and that cleared it up. This means it is much easier to support all RpcClients if we wrap the channels in HedgedRpcChannel. So, I think this can be fixed.

I think the whole discussion now boils down to whether we want to have the abstraction of hedging in RPC layer or not.

  • Looks like your opinion is no.
  • My take, I definitely understand your argument and see where you are coming from but I think the abstraction doesn't hurt (as long as it is implemented correctly). So I have a slight preference to include it in the RPC layer and I don't mind if the majority vote is the other way.
  • Do other reviewers/followers have any strong opinion either way?

@saintstack
Contributor

After reading the above:

It was a mistake suggesting you generalize hedged reads Bharath -- at least for branch-2. I apologize if it made you do more work.

On read-replicas in RPC tier, it strikes me that this is an hbase-special in need of support from tiers above RPC? It won't fit?

On the notion that the hedged reads addition to RPC implies our building a generic, feature-ful, RPC lib, let us take pause here (sorry if my suggestion that hedged reads be made generally available in RPC helped imply this). Let's not do this. Let's see if we can use an existing RPC lib instead (it may not be possible given the reminder above that retries, read-replicas, and hedging factor in info from higher tiers). The Duo async refactor helped undo our RPC tangle. If we could plug in a GRPC, perf allowing, it would be fun being able to just enable ‘pushback’, controlled delay, tracing, etc., because we are up on an RPC lib that just supports these abilities and others (may not be possible in the hbase context but worth investigation).

Duo's reminder that we have to be careful when it comes to almost-there optional code is timely because too often, even with the best of intentions, the stuff is not enabled and rots.

I'm game to help out here however I can.

@ndimiduk
Member

ndimiduk commented May 4, 2020

Seems we've led @bharathv down a rabbit hole. These changes are part of a feature new for 2.3.0, so I prefer that they are settled before the release. I'm still in the process of setting up my infrastructure for running ITBLL tests in the peculiarities of my environment (see my recent patches around the chaos monkey tools). Thus there's still time. Please take the time to see this resolved before the first RC, whichever direction you choose to take it. Go ahead and mark appropriate JIRAs as blockers for 2.3.0.

My one request is that we update the class names, class-level javadoc, and package level javadoc to more clearly describe what's going on here and why. It seems that what structures we have in place are not enough for multiple developers familiar with the codebase to understand the subtleties of this subsystem. No offense @Apache9, but more people than just you need to be able to provide meaningful code review on our RPC implementation.

Contributor

@bharathv bharathv left a comment

@saintstack / @ndimiduk Thanks for chiming in. It looks like we have a consensus to move the logic into the master registry itself.

I apologize if it made you do more work.

No worries.

On read-replicas in RPC tier, it strikes me that this is an hbase-special in need of support from tiers above RPC? It won't fit?

I think it would look out of place given the way the current client code is structured.

I prefer that they are settled before the release.

The refactor is pretty simple actually; I think it should be doable before the release. I can review the patch.

MASTER_REGISTRY_HEDGED_REQS_FANOUT_DEFAULT);
int rpcTimeoutMs = (int) Math.min(Integer.MAX_VALUE,
conf.getLong(HConstants.HBASE_RPC_TIMEOUT_KEY, HConstants.DEFAULT_HBASE_RPC_TIMEOUT));
// XXX: we pass cluster id as null here since we do not have a cluster id yet, we have to fetch
Contributor

Starting to wonder, why this didn't get flagged in tests. I guess there is some test hole with token based auth..

Contributor Author

Just because we do not have a test where MasterRegistry is enabled and we use cluster id to select authentication. Most tests in HBase do not need authentication.

Contributor

Ya, that's what I meant.

future.complete(transformResult.apply(rpcResult));
};
// send requests concurrently to hedgedReadsFanout masters
private <T extends Message> void groupCall(CompletableFuture<T> future, int startIndexInclusive,
Contributor

I think this could use a comment about the logic; without that, it would be difficult to follow.

Contributor Author

I added more comments for this method.
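
To make the batching logic discussed here easier to follow, below is a minimal sketch, not the code from this patch. It sends a request to up to `fanOut` targets at once, completes the shared future on the first success, and moves on to the next group only after every call in the current group has failed. The class `HedgedCallSketch`, its methods, and the use of `IllegalStateException` with suppressed causes are assumptions made for illustration; the patch itself uses `RetriesExhaustedException` wrapped in a `MasterRegistryFetchException`.

```java
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch of hedged calls sent in groups of at most fanOut targets. */
final class HedgedCallSketch {

  /** Helper for demos: a future that has already failed with the given message. */
  static <T> CompletableFuture<T> failed(String message) {
    CompletableFuture<T> f = new CompletableFuture<>();
    f.completeExceptionally(new RuntimeException(message));
    return f;
  }

  static <T> CompletableFuture<T> call(List<Supplier<CompletableFuture<T>>> targets, int fanOut) {
    if (fanOut < 1) {
      throw new IllegalArgumentException("fanOut must be at least 1");
    }
    CompletableFuture<T> result = new CompletableFuture<>();
    if (targets.isEmpty()) {
      result.completeExceptionally(new IllegalStateException("no targets"));
      return result;
    }
    groupCall(result, targets, fanOut, 0, new ConcurrentLinkedQueue<>());
    return result;
  }

  private static <T> void groupCall(CompletableFuture<T> result,
      List<Supplier<CompletableFuture<T>>> targets, int fanOut, int startInclusive,
      Queue<Throwable> errors) {
    int endExclusive = Math.min(targets.size(), startInclusive + fanOut);
    // counts the calls of this group that have not failed yet
    AtomicInteger remaining = new AtomicInteger(endExclusive - startInclusive);
    for (int i = startInclusive; i < endExclusive; i++) {
      targets.get(i).get().whenComplete((value, error) -> {
        if (error == null) {
          // first success wins; completing an already completed future is a no-op
          result.complete(value);
          return;
        }
        errors.add(error);
        if (remaining.decrementAndGet() == 0) {
          if (endExclusive == targets.size()) {
            // every target has failed, so report all collected errors together
            IllegalStateException all =
              new IllegalStateException("all " + targets.size() + " targets failed");
            errors.forEach(all::addSuppressed);
            result.completeExceptionally(all);
          } else {
            // the whole group failed, so try the next group
            groupCall(result, targets, fanOut, endExclusive, errors);
          }
        }
      });
    }
  }
}
```

With `fanOut` set to 1 this degrades to trying the targets one by one, and with `fanOut` equal to the number of targets all requests go out at the same time.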

if (remaining.decrementAndGet() == 0) {
if (endIndexExclusive == masterStubs.size()) {
// we are done, complete the future with exception
RetriesExhaustedException ex = new RetriesExhaustedException("masters",
Contributor

nit: technically these are not retries, right? Instead, wrap all the errors into MasterRegistryFetch..?

Contributor Author

The RetriesExhaustedException is used to wrap all the exceptions and then it will be wrapped by the MasterRegistryFetchException to include all the master addresses.

Contributor

Ya, my question was about the word "retries". That will show up in the exception message (IIUC). It gives a false impression that things are being retried, right (when in reality they are hedged, in some sense)?

}
} else {
// do not need to decrement the counter any more as we have already finished the future.
future.complete(r);
Contributor

What happens to the hedged calls? Shouldn't they be canceled?

Contributor Author

There is no way to cancel an already sent rpc call; our rpc framework does not support this feature. For the read replica feature there is a delay before the secondary replica calls, so we have a chance to cancel the local delayed task, but here we have already sent the request out.

Contributor

I was thinking about Call#setException(), which cleans up the caller state and propagates the exception to the future callback? For example, if a master is hung and the RPC is hung, that state would be cleaned up quicker.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 47s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
_ master Compile Tests _
+0 🆗 mvndep 0m 34s Maven dependency ordering for branch
+1 💚 mvninstall 4m 52s master passed
+1 💚 checkstyle 2m 22s master passed
+1 💚 spotbugs 4m 44s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 15s Maven dependency ordering for patch
+1 💚 mvninstall 4m 41s the patch passed
-0 ⚠️ checkstyle 0m 37s hbase-client: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
-0 ⚠️ whitespace 0m 0s The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
+1 💚 hadoopcheck 14m 55s Patch does not cause any errors with Hadoop 3.1.2 3.2.1.
+1 💚 spotbugs 5m 31s the patch passed
_ Other Tests _
+1 💚 asflicense 0m 38s The patch does not generate ASF License warnings.
52m 27s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #1593
Optional Tests dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle
uname Linux 6f74baa32cc6 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 0632e98
checkstyle https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/artifact/yetus-general-check/output/diff-checkstyle-hbase-client.txt
whitespace https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/artifact/yetus-general-check/output/whitespace-eol.txt
Max. process+thread count 84 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 32s Docker mode activated.
-0 ⚠️ yetus 0m 4s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 28s Maven dependency ordering for branch
+1 💚 mvninstall 3m 23s master passed
+1 💚 compile 1m 40s master passed
+1 💚 shadedjars 4m 58s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 18s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 15s Maven dependency ordering for patch
+1 💚 mvninstall 3m 23s the patch passed
+1 💚 compile 1m 42s the patch passed
+1 💚 javac 1m 42s the patch passed
+1 💚 shadedjars 4m 58s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 20s the patch passed
_ Other Tests _
+1 💚 unit 1m 12s hbase-common in the patch passed.
-1 ❌ unit 0m 44s hbase-client in the patch failed.
+1 💚 unit 133m 40s hbase-server in the patch passed.
162m 15s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile
GITHUB PR #1593
Optional Tests javac javadoc unit shadedjars compile
uname Linux 78e6f2b62b30 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 0632e98
Default Java 1.8.0_232
unit https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-client.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/testReport/
Max. process+thread count 4636 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f)
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 13s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 29s Maven dependency ordering for branch
+1 💚 mvninstall 4m 51s master passed
+1 💚 compile 2m 21s master passed
+1 💚 shadedjars 6m 13s branch has no errors when building our shaded downstream artifacts.
-0 ⚠️ javadoc 0m 19s hbase-common in master failed.
-0 ⚠️ javadoc 0m 27s hbase-client in master failed.
-0 ⚠️ javadoc 0m 46s hbase-server in master failed.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 14s Maven dependency ordering for patch
+1 💚 mvninstall 4m 48s the patch passed
+1 💚 compile 2m 19s the patch passed
+1 💚 javac 2m 19s the patch passed
+1 💚 shadedjars 6m 15s patch has no errors when building our shaded downstream artifacts.
-0 ⚠️ javadoc 0m 18s hbase-common in the patch failed.
-0 ⚠️ javadoc 0m 29s hbase-client in the patch failed.
-0 ⚠️ javadoc 0m 44s hbase-server in the patch failed.
_ Other Tests _
+1 💚 unit 1m 56s hbase-common in the patch passed.
-1 ❌ unit 1m 7s hbase-client in the patch failed.
+1 💚 unit 207m 58s hbase-server in the patch passed.
245m 2s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #1593
Optional Tests javac javadoc unit shadedjars compile
uname Linux 7aa9e4c30a2c 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 0632e98
Default Java 2020-01-14
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-common.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-common.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt
unit https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-client.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/testReport/
Max. process+thread count 2992 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/4/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f)
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 46s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
_ master Compile Tests _
+0 🆗 mvndep 0m 25s Maven dependency ordering for branch
+1 💚 mvninstall 5m 21s master passed
+1 💚 checkstyle 2m 28s master passed
+1 💚 spotbugs 5m 21s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 15s Maven dependency ordering for patch
+1 💚 mvninstall 4m 17s the patch passed
+1 💚 checkstyle 2m 36s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 hadoopcheck 13m 14s Patch does not cause any errors with Hadoop 3.1.2 3.2.1.
+1 💚 spotbugs 6m 17s the patch passed
_ Other Tests _
+1 💚 asflicense 0m 38s The patch does not generate ASF License warnings.
51m 59s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #1593
Optional Tests dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle
uname Linux 76fc8d22749f 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / fdbf458
Max. process+thread count 84 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 42s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 34s Maven dependency ordering for branch
+1 💚 mvninstall 4m 39s master passed
+1 💚 compile 2m 8s master passed
+1 💚 shadedjars 6m 21s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 23s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 14s Maven dependency ordering for patch
+1 💚 mvninstall 4m 26s the patch passed
+1 💚 compile 2m 3s the patch passed
+1 💚 javac 2m 3s the patch passed
+1 💚 shadedjars 6m 30s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 34s the patch passed
_ Other Tests _
+1 💚 unit 1m 37s hbase-common in the patch passed.
-1 ❌ unit 0m 51s hbase-client in the patch failed.
+1 💚 unit 139m 40s hbase-server in the patch passed.
175m 2s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile
GITHUB PR #1593
Optional Tests javac javadoc unit shadedjars compile
uname Linux cabba8730e9e 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / fdbf458
Default Java 1.8.0_232
unit https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/artifact/yetus-jdk8-hadoop3-check/output/patch-unit-hbase-client.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/testReport/
Max. process+thread count 3937 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f)
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 23s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 36s Maven dependency ordering for branch
+1 💚 mvninstall 4m 46s master passed
+1 💚 compile 2m 15s master passed
+1 💚 shadedjars 5m 44s branch has no errors when building our shaded downstream artifacts.
-0 ⚠️ javadoc 0m 17s hbase-common in master failed.
-0 ⚠️ javadoc 0m 25s hbase-client in master failed.
-0 ⚠️ javadoc 0m 40s hbase-server in master failed.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 14s Maven dependency ordering for patch
+1 💚 mvninstall 4m 24s the patch passed
+1 💚 compile 2m 10s the patch passed
+1 💚 javac 2m 10s the patch passed
+1 💚 shadedjars 5m 48s patch has no errors when building our shaded downstream artifacts.
-0 ⚠️ javadoc 0m 16s hbase-common in the patch failed.
-0 ⚠️ javadoc 0m 25s hbase-client in the patch failed.
-0 ⚠️ javadoc 0m 43s hbase-server in the patch failed.
_ Other Tests _
+1 💚 unit 1m 49s hbase-common in the patch passed.
-1 ❌ unit 0m 58s hbase-client in the patch failed.
+1 💚 unit 189m 1s hbase-server in the patch passed.
223m 58s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #1593
Optional Tests javac javadoc unit shadedjars compile
uname Linux 38d5ad210d75 4.15.0-74-generic #84-Ubuntu SMP Thu Dec 19 08:06:28 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / fdbf458
Default Java 2020-01-14
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-common.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-common.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt
unit https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/artifact/yetus-jdk11-hadoop3-check/output/patch-unit-hbase-client.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/testReport/
Max. process+thread count 3420 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/5/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f)
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

Contributor

@bharathv bharathv left a comment

Patch looks fine to me, just a follow up on hedged call cancellations, otherwise +1 from me. Nice tests.

}
future.complete(transformResult.apply(rpcResult));
};
// send requests concurrently to hedgedReadsFanout masters. If any of the request is succeeded, we
Contributor

nit: javadoc for method formatting /**

@bharathv
Contributor

bharathv commented May 6, 2020

Looks like the test class is missing a class rule (precommit failure).

@Apache9
Contributor Author

Apache9 commented May 6, 2020

Ya, my question was about the word "retries". That will show up in the exception message (IIUC). It gives a false impression that things are being retried, right (when in reality they are hedged, in some sense)?

I think querying multiple masters could also be seen as a kind of retry? If one master is not available, we just try another, and if all of them fail we return the exception to you.

I was thinking about Call#setException(), which cleans up the caller state and propagates the exception to the future callback? For example, if a master is hung and the RPC is hung, that state would be cleaned up quicker.

Not a big deal? It is only a small amount of memory. Implementing this would make the logic more complicated: we would have to keep references to the CompletableFutures of the other requests and call cancel on them, and we would also need to handle the cancel event and do the clean up work. Compared to the benefits, I do not think it is worth it...
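
For reference, the extra bookkeeping described above could look roughly like the sketch below. It is only an illustration under assumptions: `CancelSiblingsSketch` and `firstSuccess` are hypothetical names, not part of this patch. Note that `CompletableFuture.cancel` only completes the local future with a `CancellationException`; it does not recall a request that has already been sent.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: cancel the other in-flight futures once one call succeeds. */
final class CancelSiblingsSketch {

  static <T> CompletableFuture<T> firstSuccess(List<CompletableFuture<T>> inFlight) {
    CompletableFuture<T> result = new CompletableFuture<>();
    if (inFlight.isEmpty()) {
      result.completeExceptionally(new IllegalStateException("no calls"));
      return result;
    }
    // counts the calls that have not failed (or been cancelled) yet
    AtomicInteger remaining = new AtomicInteger(inFlight.size());
    for (CompletableFuture<T> call : inFlight) {
      call.whenComplete((value, error) -> {
        if (error == null) {
          if (result.complete(value)) {
            // this call won the race, so release the local state of the others
            for (CompletableFuture<T> other : inFlight) {
              if (other != call) {
                other.cancel(false); // local only; the request is already on the wire
              }
            }
          }
        } else if (remaining.decrementAndGet() == 0) {
          // no call succeeded; a no-op if result has already been completed
          result.completeExceptionally(new IllegalStateException("all calls failed", error));
        }
      });
    }
    return result;
  }
}
```

The caller has to keep the whole list of futures alive until one of them wins, and the callbacks of the cancelled futures still fire and must be ignored, which is the additional complexity being weighed here.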

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 37s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
_ master Compile Tests _
+0 🆗 mvndep 0m 34s Maven dependency ordering for branch
+1 💚 mvninstall 3m 37s master passed
+1 💚 checkstyle 1m 55s master passed
+1 💚 spotbugs 3m 45s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 13s Maven dependency ordering for patch
+1 💚 mvninstall 3m 27s the patch passed
+1 💚 checkstyle 1m 50s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 hadoopcheck 11m 47s Patch does not cause any errors with Hadoop 3.1.2 3.2.1.
+1 💚 spotbugs 5m 9s the patch passed
_ Other Tests _
+1 💚 asflicense 0m 37s The patch does not generate ASF License warnings.
43m 8s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/6/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #1593
Optional Tests dupname asflicense spotbugs hadoopcheck hbaseanti checkstyle
uname Linux b456a0c97f8b 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 3d96007
Max. process+thread count 95 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/6/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f) spotbugs=3.1.12
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 30s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 27s Maven dependency ordering for branch
+1 💚 mvninstall 3m 53s master passed
+1 💚 compile 1m 56s master passed
+1 💚 shadedjars 5m 12s branch has no errors when building our shaded downstream artifacts.
-0 ⚠️ javadoc 0m 17s hbase-common in master failed.
-0 ⚠️ javadoc 0m 26s hbase-client in master failed.
-0 ⚠️ javadoc 0m 39s hbase-server in master failed.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 15s Maven dependency ordering for patch
+1 💚 mvninstall 3m 55s the patch passed
+1 💚 compile 1m 55s the patch passed
+1 💚 javac 1m 55s the patch passed
+1 💚 shadedjars 5m 6s patch has no errors when building our shaded downstream artifacts.
-0 ⚠️ javadoc 0m 17s hbase-common in the patch failed.
-0 ⚠️ javadoc 0m 24s hbase-client in the patch failed.
-0 ⚠️ javadoc 0m 39s hbase-server in the patch failed.
_ Other Tests _
+1 💚 unit 1m 33s hbase-common in the patch passed.
+1 💚 unit 1m 2s hbase-client in the patch passed.
+1 💚 unit 123m 54s hbase-server in the patch passed.
155m 7s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/6/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #1593
Optional Tests javac javadoc unit shadedjars compile
uname Linux 3705b7c31489 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 3d96007
Default Java 2020-01-14
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/6/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-common.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/6/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-client.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/6/artifact/yetus-jdk11-hadoop3-check/output/branch-javadoc-hbase-server.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/6/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-common.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/6/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-client.txt
javadoc https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/6/artifact/yetus-jdk11-hadoop3-check/output/patch-javadoc-hbase-server.txt
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/6/testReport/
Max. process+thread count 4196 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/6/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f)
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 37s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+0 🆗 mvndep 0m 33s Maven dependency ordering for branch
+1 💚 mvninstall 3m 36s master passed
+1 💚 compile 1m 47s master passed
+1 💚 shadedjars 5m 11s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 21s master passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 15s Maven dependency ordering for patch
+1 💚 mvninstall 3m 26s the patch passed
+1 💚 compile 1m 42s the patch passed
+1 💚 javac 1m 42s the patch passed
+1 💚 shadedjars 4m 53s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 1m 20s the patch passed
_ Other Tests _
+1 💚 unit 1m 22s hbase-common in the patch passed.
+1 💚 unit 1m 2s hbase-client in the patch passed.
+1 💚 unit 134m 43s hbase-server in the patch passed.
164m 31s
Subsystem Report/Notes
Docker Client=19.03.8 Server=19.03.8 base: https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/6/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile
GITHUB PR #1593
Optional Tests javac javadoc unit shadedjars compile
uname Linux d6b79fa1ed48 4.15.0-60-generic #67-Ubuntu SMP Thu Aug 22 16:55:30 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 3d96007
Default Java 1.8.0_232
Test Results https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/6/testReport/
Max. process+thread count 4390 (vs. ulimit of 12500)
modules C: hbase-common hbase-client hbase-server U: .
Console output https://builds.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-1593/6/console
versions git=2.17.1 maven=(cecedd343002696d0abb50b32b541b8a6ba2883f)
Powered by Apache Yetus 0.11.1 https://yetus.apache.org

This message was automatically generated.


@bharathv bharathv left a comment

Compared to the benefits, I do not think it is worth it...

Okay.

@Apache9 Apache9 merged commit c1cb22f into apache:master May 6, 2020
asfgit pushed a commit that referenced this pull request May 6, 2020
…terRegistry … (#1593)

Signed-off-by: Bharath Vissapragada <[email protected]>
asfgit pushed a commit that referenced this pull request May 6, 2020
…terRegistry … (#1593)

Signed-off-by: Bharath Vissapragada <[email protected]>
5 participants