HBASE-27531 AsyncRequestFutureImpl unnecessarily clears meta cache for full server #4930

Merged
merged 1 commit into from
Jan 10, 2023

bbeaudreault
Contributor

This reverts to the original behavior prior to https://issues.apache.org/jira/browse/HBASE-21775, which never cleared the full server cache for normal multigets. The null tableName scenario is only in the case of HTableMultiplexer, otherwise tableName is always available. It only makes sense to clear the full cache for requests where we can't determine the correct region to clear, like in HTableMultiplexer.

Skipping the full server clear here is good because:

  1. We already clear the meta cache for the individual regions with failures. Each case where cleanServerCache is called has a subsequent call to updateCachedLocations for the individual regions.
  2. Clearing the cache for a full server is very expensive. The more regions a server hosts, the more expensive it is. An active client would expect to need to re-fill any cleared locations in short order. Clearing 100 regions results in 100x more work than just clearing one, increasing latency on the client and load on the meta region.
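The cost argument in point 2 can be illustrated with a toy model. This is only a sketch: `LocationCacheSketch` and its methods are hypothetical names and not HBase's real meta cache, and the count returned stands in for the meta lookups an active client would need to refill the cache.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of a client-side region location cache; not HBase's real MetaCache. */
final class LocationCacheSketch {
  // region name -> server name currently believed to host it
  private final Map<String, String> locations = new HashMap<>();

  void put(String region, String server) {
    locations.put(region, server);
  }

  /** Drops one region's location; returns how many entries must be re-fetched from meta. */
  int clearRegion(String region) {
    return locations.remove(region) != null ? 1 : 0;
  }

  /** Drops every region cached for the server; returns how many entries must be re-fetched. */
  int clearServer(String server) {
    int before = locations.size();
    locations.values().removeIf(server::equals);
    return before - locations.size();
  }

  /** Builds a cache in which one server hosts {@code count} regions. */
  static LocationCacheSketch withRegions(int count, String server) {
    LocationCacheSketch cache = new LocationCacheSketch();
    for (int i = 0; i < count; i++) {
      cache.put("region-" + i, server);
    }
    return cache;
  }

  public static void main(String[] args) {
    // Clearing only the failed region leaves a single location to refill.
    System.out.println(withRegions(100, "rs1").clearRegion("region-7"));
    // Clearing the whole server leaves every hosted region to refill.
    System.out.println(withRegions(100, "rs1").clearServer("rs1"));
  }
}
```

With 100 regions on one server, the per-region clear invalidates one entry while the full-server clear invalidates all 100, which is the difference the description refers to.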

The test provided in HBASE-21775 didn't actually test the problem; it only verified that the newly ungated call to cleanServerCache was made. I updated the tests here to help verify that the existing call to updateCachedLocations is enough to recover from any failure.

There are two ways errors are handled:

  1. Failures to submit the multi requests themselves (timeouts or other exceptions thrown by the server prior to handling the multi actions). These go through receiveGlobalFailure.
  2. Failures of one or more actions within a batch (thrown by the server after it has started handling the request but fails partway through). These go through receiveMultiAction.

I added two test cases, one for each case.
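The two paths can be sketched roughly as follows. This is a simplified model under stated assumptions, not the real AsyncRequestFutureImpl: the class name, the string-based signatures, and the recorded lists are hypothetical, and the isMetaClearingException check is omitted for brevity. The method names mirror the ones mentioned above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

/** Simplified model of the two failure paths; the real class differs in detail. */
final class FailurePathsSketch {
  final List<String> clearedRegions = new ArrayList<>();
  final List<String> clearedServers = new ArrayList<>();
  private final String tableName; // null models the HTableMultiplexer case

  FailurePathsSketch(String tableName) {
    this.tableName = tableName;
  }

  /** Path 1: the whole multi request failed, so every region in it is treated as failed. */
  void receiveGlobalFailure(String server, List<String> regionsInRequest) {
    cleanServerCache(server);
    for (String region : regionsInRequest) {
      updateCachedLocations(region);
    }
  }

  /** Path 2: the request was handled, but some regions reported an exception. */
  void receiveMultiAction(String server, Map<String, Boolean> failedByRegion) {
    for (Map.Entry<String, Boolean> entry : failedByRegion.entrySet()) {
      if (entry.getValue()) {
        cleanServerCache(server);
        updateCachedLocations(entry.getKey());
      }
    }
  }

  /** Full-server clear, gated on a missing table name as in this PR. */
  private void cleanServerCache(String server) {
    if (tableName == null && !clearedServers.contains(server)) {
      clearedServers.add(server);
    }
  }

  /** Per-region clear; runs on both paths regardless of the table name. */
  private void updateCachedLocations(String region) {
    clearedRegions.add(region);
  }

  public static void main(String[] args) {
    FailurePathsSketch normal = new FailurePathsSketch("t1");
    normal.receiveGlobalFailure("rs1", Arrays.asList("r1", "r2"));
    System.out.println(normal.clearedRegions + " " + normal.clearedServers);
  }
}
```

In this model a client with a known table name never records a full-server clear on either path, yet every failed region is still cleared individually, which is the property the two new test cases are meant to check.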

@bbeaudreault
Contributor Author

Current spotless failure is due to https://issues.apache.org/jira/browse/HBASE-27474. The net-new code here is clean for spotless.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 48s Docker mode activated.
-0 ⚠️ yetus 0m 5s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ branch-2 Compile Tests _
+1 💚 mvninstall 2m 4s branch-2 passed
+1 💚 compile 0m 16s branch-2 passed
+1 💚 shadedjars 3m 54s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 13s branch-2 passed
_ Patch Compile Tests _
+1 💚 mvninstall 2m 4s the patch passed
+1 💚 compile 0m 17s the patch passed
+1 💚 javac 0m 17s the patch passed
+1 💚 shadedjars 3m 55s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 14s the patch passed
_ Other Tests _
+1 💚 unit 2m 31s hbase-client in the patch passed.
17m 46s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4930/4/artifact/yetus-jdk8-hadoop2-check/output/Dockerfile
GITHUB PR #4930
Optional Tests javac javadoc unit shadedjars compile
uname Linux 26cdaef40649 5.4.0-1092-aws #100~18.04.2-Ubuntu SMP Tue Nov 29 08:39:52 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision branch-2 / edc2cbd
Default Java Temurin-1.8.0_352-b08
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4930/4/testReport/
Max. process+thread count 164 (vs. ulimit of 30000)
modules C: hbase-client U: hbase-client
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4930/4/console
versions git=2.34.1 maven=3.8.6
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 52s Docker mode activated.
-0 ⚠️ yetus 0m 5s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ branch-2 Compile Tests _
+1 💚 mvninstall 3m 8s branch-2 passed
+1 💚 compile 0m 20s branch-2 passed
+1 💚 shadedjars 4m 15s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 18s branch-2 passed
_ Patch Compile Tests _
+1 💚 mvninstall 2m 43s the patch passed
+1 💚 compile 0m 20s the patch passed
+1 💚 javac 0m 20s the patch passed
+1 💚 shadedjars 4m 12s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 17s the patch passed
_ Other Tests _
+1 💚 unit 2m 59s hbase-client in the patch passed.
20m 48s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4930/4/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #4930
Optional Tests javac javadoc unit shadedjars compile
uname Linux f28e23418a85 5.4.0-1085-aws #92~18.04.1-Ubuntu SMP Wed Aug 31 17:21:08 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision branch-2 / edc2cbd
Default Java Eclipse Adoptium-11.0.17+8
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4930/4/testReport/
Max. process+thread count 192 (vs. ulimit of 30000)
modules C: hbase-client U: hbase-client
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4930/4/console
versions git=2.34.1 maven=3.8.6
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 51s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
_ branch-2 Compile Tests _
+1 💚 mvninstall 2m 26s branch-2 passed
+1 💚 compile 0m 43s branch-2 passed
+1 💚 checkstyle 0m 15s branch-2 passed
+1 💚 spotless 0m 41s branch has no errors when running spotless:check.
+1 💚 spotbugs 0m 51s branch-2 passed
_ Patch Compile Tests _
+1 💚 mvninstall 2m 30s the patch passed
+1 💚 compile 0m 43s the patch passed
+1 💚 javac 0m 43s hbase-client generated 0 new + 108 unchanged - 1 fixed = 108 total (was 109)
+1 💚 checkstyle 0m 15s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 hadoopcheck 13m 2s Patch does not cause any errors with Hadoop 2.10.2 or 3.2.4 3.3.4.
+1 💚 spotless 0m 39s patch has no errors when running spotless:check.
+1 💚 spotbugs 0m 52s the patch passed
_ Other Tests _
+1 💚 asflicense 0m 8s The patch does not generate ASF License warnings.
25m 38s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4930/4/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #4930
Optional Tests dupname asflicense javac spotbugs hadoopcheck hbaseanti spotless checkstyle compile
uname Linux e74c6b34d4f3 5.4.0-1092-aws #100~18.04.2-Ubuntu SMP Tue Nov 29 08:39:52 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision branch-2 / edc2cbd
Default Java Eclipse Adoptium-11.0.17+8
Max. process+thread count 78 (vs. ulimit of 30000)
modules C: hbase-client U: hbase-client
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4930/4/console
versions git=2.34.1 maven=3.8.6 spotbugs=4.7.3
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@@ -989,7 +989,7 @@ private void invokeCallBack(byte[] regionName, byte[] row, CResult result) {
   }

   private void cleanServerCache(ServerName server, Throwable regionException) {
-    if (ClientExceptionsUtil.isMetaClearingException(regionException)) {
+    if (tableName == null && ClientExceptionsUtil.isMetaClearingException(regionException)) {
Contributor

In which case tableName could be null?

Contributor Author

It's only null for HTableMultiplexer

Contributor Author

This reverts the logic back to pre-21775: 5f25985#diff-7c58ffd83c150488599591ed5a3a068599646ebdbbbfdcd2233386e5472cca35L921

The tableName == null check had been in place for a long time before that. I don't think removing it was correct: in the tableName != null case (most usages) we fall through to updateCachedLocations, which clears the cache for just the individual regions.
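A before/after comparison of the guard, as a minimal sketch: `GuardSketch` and its method names are hypothetical, and the boolean parameter stands in for the result of ClientExceptionsUtil.isMetaClearingException(regionException).

```java
/** Before/after comparison of the guard; a boolean stands in for the real exception check. */
final class GuardSketch {
  /** Guard after HBASE-21775: any meta-clearing exception clears the whole server. */
  static boolean clearsServerBefore(String tableName, boolean metaClearingException) {
    return metaClearingException;
  }

  /** Guard restored by this PR: whole-server clear only when no table name is known. */
  static boolean clearsServerAfter(String tableName, boolean metaClearingException) {
    return tableName == null && metaClearingException;
  }

  public static void main(String[] args) {
    String[] tables = { "t1", null };
    boolean[] flags = { true, false };
    // Print the full truth table for both guards.
    for (String table : tables) {
      for (boolean flag : flags) {
        System.out.printf("tableName=%s metaClearing=%b before=%b after=%b%n",
          table, flag, clearsServerBefore(table, flag), clearsServerAfter(table, flag));
      }
    }
  }
}
```

The only row where the two guards disagree is a known table name with a meta-clearing exception, which is the normal multiget case this change targets.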

Contributor Author

@Apache9 can you give another look?

Contributor

@saintstack Do you still remember anything about this, sir? Having skimmed the patch, I'm not sure why removing the tableName == null check would fix the problem described in HBASE-21775. As Tommy Li said:

from what I can see, tableName shouldn't be null unless you manually create a BufferedMutatorImpl instead of using ConnectionFactory.createConnection().getBufferedMutator(). I'm not sure if the bufferedmutator would work at all without a table name.

@bbeaudreault bbeaudreault requested a review from Apache9 January 3, 2023 12:57
@bbeaudreault bbeaudreault merged commit 06c7548 into apache:branch-2 Jan 10, 2023
@bbeaudreault bbeaudreault deleted the HBASE-27531 branch January 10, 2023 12:54
bbeaudreault added a commit that referenced this pull request Jan 10, 2023
stoty pushed a commit to stoty/hbase that referenced this pull request Nov 26, 2024
… clears meta cache for full server (apache#4930) (apache#158)

* CDPD-73638: Backport HBASE-27531 AsyncRequestFutureImpl unnecessarily clears meta cache for full server (apache#4930)

Signed-off-by: Duo Zhang <[email protected]>
(cherry picked from commit 1c98931)

Change-Id: If18b2add03e0509b7b778fcb14c2448353adf442

* CDPD-73638 - TestAsyncProcess.java fix

Change-Id: I1d18f40b052ca2b62bd5428678cbd6a90a859977

---------

Co-authored-by: Bryan Beaudreault <[email protected]>