HBASE-28025 Enhance ByteBufferUtils.findCommonPrefix to compare 8 bytes each time #5354

Merged

Conversation

@jbewing jbewing commented Aug 16, 2023

What

This PR updates ByteBufferUtils#findCommonPrefix and Bytes#findCommonPrefix to compare 8 bytes at a time from the input buffers/arrays when Unsafe access is available. On platforms where Unsafe is unavailable, the current implementations are used. This is similar to the optimization already applied in ByteBufferUtils#compareToUnsafe.
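To illustrate the idea, here is a minimal sketch of an 8-bytes-at-a-time common prefix search. It is not the code from this PR: it reads words through `ByteBuffer#getLong` with a fixed little-endian order rather than through Unsafe, and the class name `CommonPrefixSketch` is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of HBase and not the PR's implementation.
final class CommonPrefixSketch {

  /** Length of the common prefix of left[leftOffset, +leftLength) and right[rightOffset, +rightLength). */
  static int findCommonPrefix(byte[] left, int leftOffset, int leftLength,
      byte[] right, int rightOffset, int rightLength) {
    int minLength = Math.min(leftLength, rightLength);
    // Little-endian views: the byte at the lowest index is the least significant byte of each long.
    ByteBuffer l = ByteBuffer.wrap(left).order(ByteOrder.LITTLE_ENDIAN);
    ByteBuffer r = ByteBuffer.wrap(right).order(ByteOrder.LITTLE_ENDIAN);

    int i = 0;
    // Compare one 8-byte word per iteration while a full word remains in both ranges.
    for (; i + Long.BYTES <= minLength; i += Long.BYTES) {
      long diff = l.getLong(leftOffset + i) ^ r.getLong(rightOffset + i);
      if (diff != 0) {
        // In little-endian order the lowest set bit lies in the first differing byte.
        return i + (Long.numberOfTrailingZeros(diff) >>> 3);
      }
    }
    // Fewer than 8 bytes left: finish byte by byte.
    while (i < minLength && left[leftOffset + i] == right[rightOffset + i]) {
      i++;
    }
    return i;
  }
}
```

The byte-by-byte tail loop is the same logic a fallback path would use for the whole range when word-sized reads are not available.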

Implementation Notes

There were both a Bytes#findCommonPrefix method and a ByteBufferUtils#findCommonPrefix method that accepted byte[] args. I've updated ByteBufferUtils#findCommonPrefix to delegate to Bytes#findCommonPrefix and applied the 8-bytes-at-a-time comparison optimization in the Bytes class.

Overall, the implementation draws a ton of inspiration from ByteBufferUtils#compareToUnsafe. The only large change I made is in how mismatches are handled in the big-endian case: there I used the number-of-leading-zeros intrinsic instead of the number-of-trailing-zeros intrinsic to find which byte was mismatched.
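The choice of intrinsic follows from where the first byte lands inside the 8-byte word. A small sketch of that reasoning, using `ByteBuffer` with an explicit byte order in place of Unsafe reads (the class and method names here are hypothetical, not taken from the patch):

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of HBase.
final class MismatchIndexSketch {

  /**
   * Given the XOR of two 8-byte words, returns the index (0..7) of the first byte that
   * differs in memory order, or 8 when the words are equal (both intrinsics return 64 for 0).
   */
  static int firstMismatchedByte(long xor, boolean littleEndian) {
    return littleEndian
      // Little endian: byte 0 is the least significant byte, so count from the low end.
      ? Long.numberOfTrailingZeros(xor) >>> 3
      // Big endian: byte 0 is the most significant byte, so count from the high end.
      : Long.numberOfLeadingZeros(xor) >>> 3;
  }

  /** Reads the first 8 bytes of each array in the given order and locates the first mismatch. */
  static int mismatchIndex(byte[] a, byte[] b, ByteOrder order) {
    long wordA = ByteBuffer.wrap(a).order(order).getLong(0);
    long wordB = ByteBuffer.wrap(b).order(order).getLong(0);
    return firstMismatchedByte(wordA ^ wordB, order == ByteOrder.LITTLE_ENDIAN);
  }
}
```

Either byte order gives the same byte index for the same input arrays; only the end of the word that is scanned changes.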

Testing

I've added some unit tests to cover testing the path with unsafe enabled and disabled.
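The unit tests themselves are not shown in this conversation. As a rough sketch of how an optimized path can be cross-checked against a byte-at-a-time reference, the example below compares a naive loop with the JDK's `Arrays.mismatch` (Java 9+) on random inputs. `Arrays.mismatch` is only a stand-in for the optimized path here, and all class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration class; not the tests added in this PR.
final class PrefixCrossCheckSketch {

  /** Reference implementation: compare one byte at a time. */
  static int naive(byte[] l, int lo, int ll, byte[] r, int ro, int rl) {
    int n = Math.min(ll, rl);
    int i = 0;
    while (i < n && l[lo + i] == r[ro + i]) {
      i++;
    }
    return i;
  }

  /** Stand-in for an optimized path, built on Arrays.mismatch (Java 9+). */
  static int viaMismatch(byte[] l, int lo, int ll, byte[] r, int ro, int rl) {
    int m = Arrays.mismatch(l, lo, lo + ll, r, ro, ro + rl);
    // -1 means the two ranges are identical, so the whole range is the common prefix.
    return m < 0 ? ll : m;
  }

  /** Runs both implementations on random inputs and counts the cases where they disagree. */
  static int countDisagreements(long seed, int rounds) {
    Random random = new Random(seed);
    int disagreements = 0;
    for (int round = 0; round < rounds; round++) {
      byte[] left = new byte[random.nextInt(41)];
      random.nextBytes(left);
      // Start from a copy (possibly shorter or longer) so that long shared prefixes are common.
      byte[] right = Arrays.copyOf(left, random.nextInt(41));
      if (right.length > 0 && random.nextBoolean()) {
        right[random.nextInt(right.length)] ^= 0x55; // inject a mismatch at a random position
      }
      int expected = naive(left, 0, left.length, right, 0, right.length);
      int actual = viaMismatch(left, 0, left.length, right, 0, right.length);
      if (expected != actual) {
        disagreements++;
      }
    }
    return disagreements;
  }
}
```

Lengths up to 40 cover inputs shorter than one word, exact multiples of 8, and ranges with a trailing remainder, which are the boundaries where word-at-a-time code tends to go wrong.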

Benchmarking

Update: I've done some microbenchmarking with JMH. The results are in this JIRA comment. (Originally I had not micro-benchmarked the new "faster" implementations against the current ones, and was assuming that this method of finding common prefixes is faster based on the previous micro-benchmarking results for compareTo, as this is very similar code.)

HBASE-28025

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 13s Docker mode activated.
-0 ⚠️ yetus 0m 2s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+1 💚 mvninstall 2m 28s master passed
+1 💚 compile 0m 16s master passed
+1 💚 shadedjars 4m 31s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 16s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 2m 18s the patch passed
+1 💚 compile 0m 17s the patch passed
+1 💚 javac 0m 17s the patch passed
+1 💚 shadedjars 4m 29s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 14s the patch passed
_ Other Tests _
+1 💚 unit 1m 44s hbase-common in the patch passed.
18m 10s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5354/1/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile
GITHUB PR #5354
Optional Tests javac javadoc unit shadedjars compile
uname Linux d48a3a195633 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 2fb2ae1
Default Java Temurin-1.8.0_352-b08
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5354/1/testReport/
Max. process+thread count 354 (vs. ulimit of 30000)
modules C: hbase-common U: hbase-common
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5354/1/console
versions git=2.34.1 maven=3.8.6
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 24s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+1 💚 mvninstall 2m 54s master passed
+1 💚 compile 0m 15s master passed
+1 💚 shadedjars 4m 58s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 15s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 2m 37s the patch passed
+1 💚 compile 0m 15s the patch passed
+1 💚 javac 0m 15s the patch passed
+1 💚 shadedjars 4m 56s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 13s the patch passed
_ Other Tests _
+1 💚 unit 2m 7s hbase-common in the patch passed.
20m 21s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5354/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #5354
Optional Tests javac javadoc unit shadedjars compile
uname Linux b693bd2ca6af 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 2fb2ae1
Default Java Eclipse Adoptium-11.0.17+8
Test Results https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5354/1/testReport/
Max. process+thread count 396 (vs. ulimit of 30000)
modules C: hbase-common U: hbase-common
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5354/1/console
versions git=2.34.1 maven=3.8.6
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 42s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
_ master Compile Tests _
+1 💚 mvninstall 3m 0s master passed
+1 💚 compile 0m 36s master passed
+1 💚 checkstyle 0m 17s master passed
+1 💚 spotless 0m 44s branch has no errors when running spotless:check.
+1 💚 spotbugs 0m 34s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 2m 37s the patch passed
+1 💚 compile 0m 34s the patch passed
-0 ⚠️ javac 0m 34s hbase-common generated 2 new + 34 unchanged - 0 fixed = 36 total (was 34)
+1 💚 checkstyle 0m 13s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 hadoopcheck 9m 14s Patch does not cause any errors with Hadoop 3.2.4 3.3.5.
+1 💚 spotless 0m 41s patch has no errors when running spotless:check.
+1 💚 spotbugs 0m 39s the patch passed
_ Other Tests _
+1 💚 asflicense 0m 12s The patch does not generate ASF License warnings.
25m 56s
Subsystem Report/Notes
Docker ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5354/1/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #5354
Optional Tests dupname asflicense javac spotbugs hadoopcheck hbaseanti spotless checkstyle compile
uname Linux f007f107ca19 5.4.0-156-generic #173-Ubuntu SMP Tue Jul 11 07:25:22 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 2fb2ae1
Default Java Eclipse Adoptium-11.0.17+8
javac https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5354/1/artifact/yetus-general-check/output/diff-compile-javac-hbase-common.txt
Max. process+thread count 78 (vs. ulimit of 30000)
modules C: hbase-common U: hbase-common
Console output https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-5354/1/console
versions git=2.34.1 maven=3.8.6 spotbugs=4.7.3
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache9 Apache9 merged commit dae078e into apache:master Aug 20, 2023
Apache9 pushed a commit that referenced this pull request Aug 20, 2023
…es each time (#5354)

Signed-off-by: Duo Zhang <[email protected]>
(cherry picked from commit dae078e)
bbeaudreault pushed a commit to HubSpot/hbase that referenced this pull request Aug 21, 2023
…x to compare 8 bytes each time (apache#5354)

Signed-off-by: Duo Zhang <[email protected]>
(cherry picked from commit dae078e)
vinayakphegde pushed a commit to vinayakphegde/hbase that referenced this pull request Apr 4, 2024
…es each time (apache#5354)

Signed-off-by: Duo Zhang <[email protected]>
(cherry picked from commit dae078e)
(cherry picked from commit 6596ef6)
Change-Id: I046c51b34f9cd6809df3e940b42f927e75fdc85f