HBASE-27650 Merging empty regions corrupts meta cache #5037

Merged
merged 7 commits into from
Feb 26, 2023

bbeaudreault
Contributor

No description provided.

@bbeaudreault bbeaudreault requested a review from Apache9 February 17, 2023 16:20
boolean isLast = Bytes.equals(region.getEndKey(), HConstants.EMPTY_END_ROW);

while (true) {
Map.Entry<byte[], RegionLocations> overlap =
Contributor Author

Another option here would have been to iterate a subMap. I felt this approach was better because merges are rare and in almost all cases we'll exit here after just 1 floorEntry/lastEntry call and 1 reference equality check. Using a subMap requires at least 2 comparator comparisons, to get the head and tail of the subMap.
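To make the trade-off concrete, here is a minimal sketch of both options, not the PR's actual code. It assumes a simplified cache keyed by `String` start keys in a `TreeMap`; the class `OverlapCleanupSketch`, the `Region` type, and both method names are hypothetical. The loop uses `lowerEntry` because the end key is treated as exclusive here; the real call may differ.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical, simplified model of a per-table location cache keyed by start key.
final class OverlapCleanupSketch {

  /** A region covering [start, end); an empty end key marks the last region. */
  static final class Region {
    final String start;
    final String end;

    Region(String start, String end) {
      this.start = start;
      this.end = end;
    }

    boolean isLast() {
      return end.isEmpty();
    }
  }

  /** Loop approach: walk down from the new region's end key, removing stale entries. */
  static int cleanWithLoop(TreeMap<String, Region> cache, Region added) {
    int removed = 0;
    while (true) {
      Map.Entry<String, Region> overlap =
        added.isLast() ? cache.lastEntry() : cache.lowerEntry(added.end);
      // Common case: the first entry found is the new region itself, so one
      // lookup and one reference-equality check end the loop.
      if (overlap == null || overlap.getValue() == added
        || overlap.getKey().compareTo(added.start) <= 0) {
        return removed;
      }
      cache.remove(overlap.getKey());
      removed++;
    }
  }

  /** Alternative: clear a sub-map view spanning the keys strictly inside (start, end). */
  static int cleanWithSubMap(TreeMap<String, Region> cache, Region added) {
    NavigableMap<String, Region> stale = added.isLast()
      ? cache.tailMap(added.start, false)
      : cache.subMap(added.start, false, added.end, false);
    int removed = stale.size();
    stale.clear();
    return removed;
  }
}
```

Both methods remove the same entries in this model; the difference is only how much work the no-overlap case costs.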

@@ -442,6 +485,10 @@ private RegionLocations locateRowInCache(TableCache tableCache, TableName tableN
recordCacheHit();
return locs;
} else {
if (LOG.isTraceEnabled()) {
Contributor Author

This log was helpful for diagnosing this bug, so I decided to keep it.

@bbeaudreault
Contributor Author

Will look into the test failures shortly. They look related.

@bbeaudreault bbeaudreault requested a review from Apache9 February 18, 2023 04:35

* possible, calls beforeUpdate callback prior to making a change. Calls afterUpdate callback
* after making a change.
*/
public synchronized void remove(HRegionLocation loc, Runnable beforeUpdate,
Contributor

We need to be careful that the beforeUpdate and afterUpdate callbacks do not hold other locks; otherwise this may introduce a deadlock.

Contributor Author

Let me see if I can remove the callbacks. I was trying to keep the metaLocation.onError stuff out of here. I wasn't sure whether the onError call needed to happen at that exact point.

Contributor

If it is not easy, just add more comments to warn others.

Contributor Author

I removed the callbacks. I think remove can just return a boolean, and both actions can happen afterwards if a change was actually made.
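A rough sketch of that shape, under assumptions: the classes `BooleanRemoveCache` and `BooleanRemoveCaller`, their fields, and the counter standing in for the follow-up actions are all hypothetical, not taken from the PR. The point is only that the synchronized method runs no caller-supplied code, so it cannot take part in a lock-ordering deadlock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a location cache; names are illustrative only.
final class BooleanRemoveCache {
  private final Map<String, String> serverByStartKey = new HashMap<>();

  synchronized void put(String startKey, String server) {
    serverByStartKey.put(startKey, server);
  }

  /**
   * Removes the entry only if it still maps to the given server.
   * No callbacks run while the monitor is held; the caller learns
   * from the return value whether anything changed.
   */
  synchronized boolean remove(String startKey, String server) {
    return serverByStartKey.remove(startKey, server);
  }
}

// Hypothetical caller: follow-up work happens after remove() has released the lock.
final class BooleanRemoveCaller {
  final AtomicInteger followUps = new AtomicInteger();

  void onError(BooleanRemoveCache cache, String startKey, String server) {
    if (cache.remove(startKey, server)) {
      // Placeholder for whatever must happen after a real change.
      followUps.incrementAndGet();
    }
  }
}
```

If the entry was already replaced by another thread, `remove` returns false and the follow-up is skipped.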


boolean isLast = isEmptyStopRow(region.getEndKey());

while (true) {
Contributor

I think the implementation can still be improved. Now it will stop when we hit the location itself, but there could still be a region whose startKey is less than this location's startKey but whose endKey is greater than this location's startKey.

Contributor Author

Once a region is merged, meta scan will only return the new child region whose start and end should fully encompass all the merged regions. So that is the only case we need to solve, which is handled here.

What you describe would only be possible if we tried to cache one of the merged parent regions. That should not happen.

Does that make sense?

Contributor

I think that, theoretically, regions can be merged and split again and again. For example, all regions could be merged into one, and then that region could be split into multiple regions again. In this way, the boundaries can be anything...

Contributor Author

@bbeaudreault bbeaudreault Feb 18, 2023

Ok that makes sense, but not sure it's an issue. Just to be clear, I'm open to making a change here. I'm just trying to think through this, so please bear with me.

The problem we are trying to solve is due to how we use floorEntry to query the cache. Using floorEntry opens us to a problem where a stale cache entry with startKey greater than the correct one can cause the correct one to never be returned. The current solution solves that.

Since we use floorEntry, I don't think stale entries with startKey less than the correct location are really a problem. They would exist in cache but not cause any issues. If a request went to them they would be cleared. If they got overlapped by another region they'd be cleaned up at that point. Assuming a relatively active table, it would all clean up over time as different regions get requested.


That said, I did think about how to do it. All entries are indexed by startKey, but we're concerned about endKey. We could pretty easily check the entry just prior to the cached location. But that doesn't cover us. Theoretically even the first entry in the cache could have an endKey that overlaps.

So the only way to be fully sure there are no overlaps, given the endless possibilities, is to check every entry back to the head of the cache. I don't think this is worth it, given that a table could have many thousands of regions and there can sometimes be bursts of regions being cached, each of which would have to scan to the head. We could also keep a secondary index by endKey, but again I don't think it's worth the complexity, given that these entries don't cause issues.

Thoughts?
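The floorEntry behavior described above can be reproduced with a small model. This is only a sketch under assumptions: `String` keys in a `TreeMap` stand in for the real byte-array keys, and `FloorEntryShadowSketch`, `Region`, and `locate` are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of a cache lookup that uses floorEntry on the start key.
final class FloorEntryShadowSketch {

  /** A region covering [start, end); an empty end key marks the last region. */
  static final class Region {
    final String start;
    final String end;

    Region(String start, String end) {
      this.start = start;
      this.end = end;
    }

    boolean contains(String row) {
      return row.compareTo(start) >= 0 && (end.isEmpty() || row.compareTo(end) < 0);
    }
  }

  /** Returns the cached region for the row, or null on a cache miss. */
  static Region locate(TreeMap<String, Region> cache, String row) {
    Map.Entry<String, Region> entry = cache.floorEntry(row);
    if (entry != null && entry.getValue().contains(row)) {
      return entry.getValue();
    }
    return null;
  }
}
```

With a stale entry [c, e) and a correct merged entry [a, g) both cached, `locate(cache, "f")` misses, because floorEntry picks the stale entry with the greater start key. With a stale entry [a, c) and a correct entry [b, g), the same lookup finds the correct region, which is why stale entries with a smaller start key are harmless to lookups in this model.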

@bbeaudreault
Contributor Author

@Apache9 any chance you have time for another look here? Hopefully my reasoning above makes sense and we can keep the current implementation?

@Apache9
Contributor

Apache9 commented Feb 23, 2023

I'm on a business trip until next Wednesday, so I do not have much time to access Gmail and GitHub (and, you know, in China you need to use something like a proxy to access Gmail...).

I think your argument is reasonable, but I need more time to think about whether there are any problematic cases we do not cover.

Please give me some time...

Thanks.

@bbeaudreault
Contributor Author

Thanks for the update @Apache9, no worries. I will await your reply next week.

@Apache9
Contributor

Apache9 commented Feb 25, 2023

After consideration, I think we can make this assumption.

The problem can only occur when the new region fully covers an old region. For example, suppose we have Start_New < Start_Old < End_Old < End_New. If we only access rows within the range [End_Old, End_New), the cache will always return the old region, then find that the row is not in its range and try to fetch the new region. We then get [Start_New, End_New), but still fall into the same situation.

If Start_Old is less than Start_New, an overlap is not a problem. When the row is greater than or equal to Start_New, we will locate the new region. If the row is less than Start_New, it will fall into the old region's range; we will try to access that region, get a NotServing exception, and then clean the cache.

So I think the implementation here is OK. But let's add more comments here so that later developers can understand it better. It would also be better to rename the method to something like 'cleanProblematicOverlappedRegions', so developers know that the design is not to clean all overlapped regions, just the ones that could cause trouble.

Thanks.
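The two cases above can be simulated with a bounded retry loop. This is a sketch under assumptions, not the client's real lookup path: `RelocateLoopSketch`, `Region`, and `resolve` are hypothetical, a "meta scan" is modeled as always returning the `fresh` region, and the NotServing path for rows below Start_New is not modeled.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical simulation of "look up in cache, on miss re-fetch and cache".
final class RelocateLoopSketch {

  /** A region covering [start, end); an empty end key marks the last region. */
  static final class Region {
    final String start;
    final String end;

    Region(String start, String end) {
      this.start = start;
      this.end = end;
    }

    boolean contains(String row) {
      return row.compareTo(start) >= 0 && (end.isEmpty() || row.compareTo(end) < 0);
    }
  }

  /**
   * Returns how many simulated meta lookups were needed before the cache
   * could serve the row, or -1 if maxAttempts was exhausted.
   */
  static int resolve(TreeMap<String, Region> cache, Region fresh, String row,
      boolean cleanOverlaps, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      Map.Entry<String, Region> hit = cache.floorEntry(row);
      if (hit != null && hit.getValue().contains(row)) {
        return attempt;
      }
      // Cache miss: the simulated meta scan returns the fresh region, which is cached.
      cache.put(fresh.start, fresh);
      if (cleanOverlaps) {
        // Drop cached entries whose start key lies strictly inside the fresh region.
        NavigableMap<String, Region> stale = fresh.end.isEmpty()
          ? cache.tailMap(fresh.start, false)
          : cache.subMap(fresh.start, false, fresh.end, false);
        stale.clear();
      }
    }
    return -1;
  }
}
```

In this model, a stale [c, e) entry under a fresh [a, g) region never resolves row "f" without cleanup, and resolves after one lookup with cleanup. A stale [a, e) entry under a fresh [c, g) region resolves after one lookup even without cleanup.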

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 🆗 | reexec | 0m 35s | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | No case conflicting files found. |
| +1 💚 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 💚 | @author | 0m 0s | The patch does not contain any @author tags. |
| | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 11s | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 4m 48s | master passed |
| +1 💚 | compile | 3m 34s | master passed |
| +1 💚 | checkstyle | 0m 52s | master passed |
| +1 💚 | spotless | 0m 46s | branch has no errors when running spotless:check. |
| +1 💚 | spotbugs | 2m 37s | master passed |
| | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 11s | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 4m 37s | the patch passed |
| +1 💚 | compile | 3m 20s | the patch passed |
| +1 💚 | javac | 3m 20s | the patch passed |
| +1 💚 | checkstyle | 0m 51s | the patch passed |
| +1 💚 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 💚 | hadoopcheck | 20m 6s | Patch does not cause any errors with Hadoop 3.2.4 3.3.4. |
| +1 💚 | spotless | 0m 49s | patch has no errors when running spotless:check. |
| +1 💚 | spotbugs | 3m 25s | the patch passed |
| | | | _ Other Tests _ |
| +1 💚 | asflicense | 0m 21s | The patch does not generate ASF License warnings. |
| | | 56m 49s | |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5037/10/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | #5037 |
| Optional Tests | dupname asflicense javac spotbugs hadoopcheck hbaseanti spotless checkstyle compile |
| uname | Linux 3acd03ccc1e1 5.4.0-1094-aws #102~18.04.1-Ubuntu SMP Tue Jan 10 21:07:03 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 4a9cf99 |
| Default Java | Eclipse Adoptium-11.0.17+8 |
| Max. process+thread count | 84 (vs. ulimit of 30000) |
| modules | C: hbase-client hbase-server U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5037/10/console |
| versions | git=2.34.1 maven=3.8.6 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 🆗 | reexec | 1m 23s | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 4s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck |
| | | | _ Prechecks _ |
| | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 4m 24s | master passed |
| +1 💚 | compile | 1m 10s | master passed |
| +1 💚 | shadedjars | 4m 26s | branch has no errors when building our shaded downstream artifacts. |
| +1 💚 | javadoc | 0m 42s | master passed |
| | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 3m 20s | the patch passed |
| +1 💚 | compile | 1m 8s | the patch passed |
| +1 💚 | javac | 1m 8s | the patch passed |
| +1 💚 | shadedjars | 4m 24s | patch has no errors when building our shaded downstream artifacts. |
| +1 💚 | javadoc | 0m 42s | the patch passed |
| | | | _ Other Tests _ |
| +1 💚 | unit | 1m 16s | hbase-client in the patch passed. |
| +1 💚 | unit | 210m 33s | hbase-server in the patch passed. |
| | | 238m 27s | |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5037/10/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile |
| GITHUB PR | #5037 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux 127aa68e1226 5.4.0-137-generic #154-Ubuntu SMP Thu Jan 5 17:03:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 4a9cf99 |
| Default Java | Eclipse Adoptium-11.0.17+8 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5037/10/testReport/ |
| Max. process+thread count | 2458 (vs. ulimit of 30000) |
| modules | C: hbase-client hbase-server U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5037/10/console |
| versions | git=2.34.1 maven=3.8.6 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Comment |
|---|---|---|---|
| +0 🆗 | reexec | 0m 40s | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 3s | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck |
| | | | _ Prechecks _ |
| | | | _ master Compile Tests _ |
| +0 🆗 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 4m 37s | master passed |
| +1 💚 | compile | 1m 3s | master passed |
| +1 💚 | shadedjars | 5m 16s | branch has no errors when building our shaded downstream artifacts. |
| +1 💚 | javadoc | 0m 48s | master passed |
| | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 3m 49s | the patch passed |
| +1 💚 | compile | 1m 21s | the patch passed |
| +1 💚 | javac | 1m 21s | the patch passed |
| +1 💚 | shadedjars | 5m 19s | patch has no errors when building our shaded downstream artifacts. |
| +1 💚 | javadoc | 0m 49s | the patch passed |
| | | | _ Other Tests _ |
| +1 💚 | unit | 1m 29s | hbase-client in the patch passed. |
| +1 💚 | unit | 210m 49s | hbase-server in the patch passed. |
| | | 240m 20s | |

| Subsystem | Report/Notes |
|---|---|
| Docker | ClientAPI=1.42 ServerAPI=1.42 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5037/10/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile |
| GITHUB PR | #5037 |
| Optional Tests | javac javadoc unit shadedjars compile |
| uname | Linux 304544e9305b 5.4.0-1094-aws #102~18.04.1-Ubuntu SMP Tue Jan 10 21:07:03 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / 4a9cf99 |
| Default Java | Temurin-1.8.0_352-b08 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5037/10/testReport/ |
| Max. process+thread count | 2682 (vs. ulimit of 30000) |
| modules | C: hbase-client hbase-server U: . |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5037/10/console |
| versions | git=2.34.1 maven=3.8.6 |
| Powered by | Apache Yetus 0.12.0 https://yetus.apache.org |

This message was automatically generated.

@bbeaudreault bbeaudreault merged commit f20efaf into apache:master Feb 26, 2023
@bbeaudreault bbeaudreault deleted the HBASE-27650 branch February 26, 2023 21:39
3 participants