Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HBASE-25510 Optimize TableName.valueOf from O(n) to O(1) #2885

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

zhengzhuobinzzb
Copy link
Contributor

Optimize TableName.valueOf from O(n) to O(1). We can get benefits when the number of tables in the cluster is greater than dozens

@saintstack
Copy link
Contributor

I edited the summary to remove square brackets around the JIRA number.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 3m 55s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+1 💚 mvninstall 4m 43s master passed
+1 💚 compile 0m 30s master passed
+1 💚 shadedjars 7m 34s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 22s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 4m 26s the patch passed
+1 💚 compile 0m 24s the patch passed
+1 💚 javac 0m 24s the patch passed
+1 💚 shadedjars 7m 55s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 25s the patch passed
_ Other Tests _
+1 💚 unit 1m 48s hbase-common in the patch passed.
33m 0s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile
GITHUB PR #2885
JIRA Issue HBASE-25510
Optional Tests javac javadoc unit shadedjars compile
uname Linux b26393b36a26 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 3cc2468
Default Java AdoptOpenJDK-1.8.0_232-b09
Test Results https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/testReport/
Max. process+thread count 227 (vs. ulimit of 30000)
modules C: hbase-common U: hbase-common
Console output https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/console
versions git=2.17.1 maven=3.6.3
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 44s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+1 💚 mvninstall 5m 29s master passed
+1 💚 compile 0m 30s master passed
+1 💚 shadedjars 8m 5s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 35s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 5m 3s the patch passed
+1 💚 compile 0m 27s the patch passed
+1 💚 javac 0m 27s the patch passed
+1 💚 shadedjars 7m 58s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 31s the patch passed
_ Other Tests _
+1 💚 unit 2m 30s hbase-common in the patch passed.
33m 53s
Subsystem Report/Notes
Docker ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #2885
JIRA Issue HBASE-25510
Optional Tests javac javadoc unit shadedjars compile
uname Linux e3d8ce757d3f 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 3cc2468
Default Java AdoptOpenJDK-11.0.6+10
Test Results https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/testReport/
Max. process+thread count 205 (vs. ulimit of 30000)
modules C: hbase-common U: hbase-common
Console output https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/console
versions git=2.17.1 maven=3.6.3
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@zhengzhuobinzzb zhengzhuobinzzb changed the title [HBASE-25510] Optimize TableName.valueOf from O(n) to O(1) HBASE-25510 Optimize TableName.valueOf from O(n) to O(1) Jan 15, 2021
@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 8s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
_ master Compile Tests _
+1 💚 mvninstall 4m 23s master passed
+1 💚 compile 0m 48s master passed
+1 💚 checkstyle 0m 25s master passed
+1 💚 spotbugs 0m 45s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 3m 47s the patch passed
+1 💚 compile 0m 45s the patch passed
+1 💚 javac 0m 45s the patch passed
-0 ⚠️ checkstyle 0m 24s hbase-common: The patch generated 2 new + 17 unchanged - 0 fixed = 19 total (was 17)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 hadoopcheck 21m 25s Patch does not cause any errors with Hadoop 3.1.2 3.2.1 3.3.0.
+1 💚 spotbugs 0m 54s the patch passed
_ Other Tests _
+1 💚 asflicense 0m 12s The patch does not generate ASF License warnings.
42m 43s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #2885
JIRA Issue HBASE-25510
Optional Tests dupname asflicense javac spotbugs hadoopcheck hbaseanti checkstyle compile
uname Linux a852dd790e34 4.15.0-128-generic #131-Ubuntu SMP Wed Dec 9 06:57:35 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 3cc2468
Default Java AdoptOpenJDK-1.8.0_232-b09
checkstyle https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/artifact/yetus-general-check/output/diff-checkstyle-hbase-common.txt
Max. process+thread count 85 (vs. ulimit of 30000)
modules C: hbase-common U: hbase-common
Console output https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/console
versions git=2.17.1 maven=3.6.3 spotbugs=3.1.12
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@virajjasani virajjasani requested a review from Apache9 January 20, 2021 16:16
@Apache9
Copy link
Contributor

Apache9 commented Jan 22, 2021

It is a bit strange that we convert bytes or ByteBuffer to String, and then convert back to store them...

@zhengzhuobinzzb
Copy link
Contributor Author

It is a bit strange that we convert bytes or ByteBuffer to String, and then convert back to store them...
Thank you very much for reviewing my code.

In face, i tryed to use ByteBuffer as hash key. But there ars some reasons i choose String finally.

  1. Because we use hashmap, so we must choose a class as hash key. Therefore, we must convert one of String and ByteBuffer to the other.
  2. Use ByteBuffer alse need to copy underlying array. If not, might cause memory leak.
  3. The code for construct hash key(${namespace} + ':' + ${tablename}), String is more concise than ByteBuffer.
  4. In my benchmark(code can find in jira attachment). The efficiency of String is a little better than ByteBuffer overall.

Benchmark Result For ByteBuffer:
Benchmark Mode Cnt Score Error Units
TestTableNameJMH.testBB1 thrpt 10 73319.634 ± 5162.112 ops/ms
TestTableNameJMH.testBB10 thrpt 10 69742.749 ± 1238.255 ops/ms
TestTableNameJMH.testBB100 thrpt 10 65220.820 ± 3357.727 ops/ms
TestTableNameJMH.testBB1000 thrpt 10 47865.385 ± 2510.229 ops/ms
TestTableNameJMH.testBB10000 thrpt 10 29383.002 ± 5535.025 ops/ms
TestTableNameJMH.testBB100000 thrpt 10 15333.042 ± 3451.444 ops/ms
TestTableNameJMH.testStr1 thrpt 10 17342.604 ± 2997.854 ops/ms
TestTableNameJMH.testStr10 thrpt 10 18562.438 ± 1984.190 ops/ms
TestTableNameJMH.testStr100 thrpt 10 16152.466 ± 3349.675 ops/ms
TestTableNameJMH.testStr1000 thrpt 10 14944.087 ± 1132.673 ops/ms
TestTableNameJMH.testStr10000 thrpt 10 9231.439 ± 2480.242 ops/ms
TestTableNameJMH.testStr100000 thrpt 10 7458.192 ± 1046.736 ops/ms

Benchmark Result For String:
Benchmark Mode Cnt Score Error Units
TestTableNameJMH.testBB1 thrpt 10 20719.730 ± 987.327 ops/ms
TestTableNameJMH.testBB10 thrpt 10 20698.180 ± 240.704 ops/ms
TestTableNameJMH.testBB100 thrpt 10 19331.825 ± 232.413 ops/ms
TestTableNameJMH.testBB1000 thrpt 10 18865.971 ± 260.727 ops/ms
TestTableNameJMH.testBB10000 thrpt 10 13420.776 ± 1473.567 ops/ms
TestTableNameJMH.testBB100000 thrpt 10 7209.563 ± 2928.791 ops/ms
TestTableNameJMH.testStr1 thrpt 10 133099.686 ± 5902.544 ops/ms
TestTableNameJMH.testStr10 thrpt 10 141280.563 ± 7157.877 ops/ms
TestTableNameJMH.testStr100 thrpt 10 127057.065 ± 11665.471 ops/ms
TestTableNameJMH.testStr1000 thrpt 10 85985.687 ± 18432.481 ops/ms
TestTableNameJMH.testStr10000 thrpt 10 51825.915 ± 1824.001 ops/ms
TestTableNameJMH.testStr100000 thrpt 10 19002.112 ± 3337.013 ops/ms

@Apache9
Copy link
Contributor

Apache9 commented Jan 22, 2021

It is a bit strange that we convert bytes or ByteBuffer to String, and then convert back to store them...
Thank you very much for reviewing my code.

In face, i tryed to use ByteBuffer as hash key. But there ars some reasons i choose String finally.

  1. Because we use hashmap, so we must choose a class as hash key. Therefore, we must convert one of String and ByteBuffer to the other.
  2. Use ByteBuffer alse need to copy underlying array. If not, might cause memory leak.
  3. The code for construct hash key(${namespace} + ':' + ${tablename}), String is more concise than ByteBuffer.
  4. In my benchmark(code can find in jira attachment). The efficiency of String is a little better than ByteBuffer overall.

Benchmark Result For ByteBuffer:
Benchmark Mode Cnt Score Error Units
TestTableNameJMH.testBB1 thrpt 10 73319.634 ± 5162.112 ops/ms
TestTableNameJMH.testBB10 thrpt 10 69742.749 ± 1238.255 ops/ms
TestTableNameJMH.testBB100 thrpt 10 65220.820 ± 3357.727 ops/ms
TestTableNameJMH.testBB1000 thrpt 10 47865.385 ± 2510.229 ops/ms
TestTableNameJMH.testBB10000 thrpt 10 29383.002 ± 5535.025 ops/ms
TestTableNameJMH.testBB100000 thrpt 10 15333.042 ± 3451.444 ops/ms
TestTableNameJMH.testStr1 thrpt 10 17342.604 ± 2997.854 ops/ms
TestTableNameJMH.testStr10 thrpt 10 18562.438 ± 1984.190 ops/ms
TestTableNameJMH.testStr100 thrpt 10 16152.466 ± 3349.675 ops/ms
TestTableNameJMH.testStr1000 thrpt 10 14944.087 ± 1132.673 ops/ms
TestTableNameJMH.testStr10000 thrpt 10 9231.439 ± 2480.242 ops/ms
TestTableNameJMH.testStr100000 thrpt 10 7458.192 ± 1046.736 ops/ms

Benchmark Result For String:
Benchmark Mode Cnt Score Error Units
TestTableNameJMH.testBB1 thrpt 10 20719.730 ± 987.327 ops/ms
TestTableNameJMH.testBB10 thrpt 10 20698.180 ± 240.704 ops/ms
TestTableNameJMH.testBB100 thrpt 10 19331.825 ± 232.413 ops/ms
TestTableNameJMH.testBB1000 thrpt 10 18865.971 ± 260.727 ops/ms
TestTableNameJMH.testBB10000 thrpt 10 13420.776 ± 1473.567 ops/ms
TestTableNameJMH.testBB100000 thrpt 10 7209.563 ± 2928.791 ops/ms
TestTableNameJMH.testStr1 thrpt 10 133099.686 ± 5902.544 ops/ms
TestTableNameJMH.testStr10 thrpt 10 141280.563 ± 7157.877 ops/ms
TestTableNameJMH.testStr100 thrpt 10 127057.065 ± 11665.471 ops/ms
TestTableNameJMH.testStr1000 thrpt 10 85985.687 ± 18432.481 ops/ms
TestTableNameJMH.testStr10000 thrpt 10 51825.915 ± 1824.001 ops/ms
TestTableNameJMH.testStr100000 thrpt 10 19002.112 ± 3337.013 ops/ms

I think the problem is we convert it back to bytes.

Could we introduce an internal methods which takes both string and bytes? So after checking the cache, we need to use bytes to create a new TableName, we just use it directly instead of converting the String back.

@zhengzhuobinzzb
Copy link
Contributor Author

I think the problem is we convert it back to bytes.

Could we introduce an internal methods which takes both string and bytes? So after checking the cache, we need to use bytes to create a new TableName, we just use it directly instead of converting the String back.

The matter is we can't believe outside bytes(because it can be modifyed outside). So we have to go through two conversions. The diff is below:

  1. bytes -> string -> bytes
  2. bytes -> string + bytes -> inner bytes

So, I think we choose 1 here is not a problem

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 59s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+1 💚 mvninstall 4m 19s master passed
+1 💚 compile 0m 23s master passed
+1 💚 shadedjars 8m 58s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 22s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 4m 5s the patch passed
+1 💚 compile 0m 23s the patch passed
+1 💚 javac 0m 23s the patch passed
+1 💚 shadedjars 8m 58s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 20s the patch passed
_ Other Tests _
+1 💚 unit 1m 50s hbase-common in the patch passed.
31m 36s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/artifact/yetus-jdk8-hadoop3-check/output/Dockerfile
GITHUB PR #2885
Optional Tests javac javadoc unit shadedjars compile
uname Linux d8aec776018a 4.15.0-142-generic #146-Ubuntu SMP Tue Apr 13 01:11:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 0955a7a
Default Java AdoptOpenJDK-1.8.0_282-b08
Test Results https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/testReport/
Max. process+thread count 222 (vs. ulimit of 30000)
modules C: hbase-common U: hbase-common
Console output https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/console
versions git=2.17.1 maven=3.6.3
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 1m 38s Docker mode activated.
-0 ⚠️ yetus 0m 3s Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --whitespace-eol-ignore-list --whitespace-tabs-ignore-list --quick-hadoopcheck
_ Prechecks _
_ master Compile Tests _
+1 💚 mvninstall 5m 29s master passed
+1 💚 compile 0m 33s master passed
+1 💚 shadedjars 9m 26s branch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 30s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 4m 54s the patch passed
+1 💚 compile 0m 28s the patch passed
+1 💚 javac 0m 28s the patch passed
+1 💚 shadedjars 9m 2s patch has no errors when building our shaded downstream artifacts.
+1 💚 javadoc 0m 24s the patch passed
_ Other Tests _
+1 💚 unit 2m 46s hbase-common in the patch passed.
36m 12s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/artifact/yetus-jdk11-hadoop3-check/output/Dockerfile
GITHUB PR #2885
Optional Tests javac javadoc unit shadedjars compile
uname Linux 34a65c8e5acd 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 0955a7a
Default Java AdoptOpenJDK-11.0.10+9
Test Results https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/testReport/
Max. process+thread count 213 (vs. ulimit of 30000)
modules C: hbase-common U: hbase-common
Console output https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/console
versions git=2.17.1 maven=3.6.3
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache-HBase
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 28s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 hbaseanti 0m 0s Patch does not have any anti-patterns.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
_ master Compile Tests _
+1 💚 mvninstall 3m 56s master passed
+1 💚 compile 0m 51s master passed
+1 💚 checkstyle 0m 26s master passed
+1 💚 spotbugs 0m 46s master passed
_ Patch Compile Tests _
+1 💚 mvninstall 3m 34s the patch passed
+1 💚 compile 0m 48s the patch passed
+1 💚 javac 0m 48s the patch passed
-0 ⚠️ checkstyle 0m 25s hbase-common: The patch generated 2 new + 17 unchanged - 0 fixed = 19 total (was 17)
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 hadoopcheck 18m 2s Patch does not cause any errors with Hadoop 3.1.2 3.2.1 3.3.0.
+1 💚 spotbugs 0m 55s the patch passed
_ Other Tests _
+1 💚 asflicense 0m 15s The patch does not generate ASF License warnings.
38m 14s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/artifact/yetus-general-check/output/Dockerfile
GITHUB PR #2885
Optional Tests dupname asflicense javac spotbugs hadoopcheck hbaseanti checkstyle compile
uname Linux e6b40efee63a 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/hbase-personality.sh
git revision master / 0955a7a
Default Java AdoptOpenJDK-1.8.0_282-b08
checkstyle https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/artifact/yetus-general-check/output/diff-checkstyle-hbase-common.txt
Max. process+thread count 96 (vs. ulimit of 30000)
modules C: hbase-common U: hbase-common
Console output https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-2885/1/console
versions git=2.17.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@Apache9
Copy link
Contributor

Apache9 commented Mar 30, 2023

I think the problem is we convert it back to bytes.
Could we introduce an internal methods which takes both string and bytes? So after checking the cache, we need to use bytes to create a new TableName, we just use it directly instead of converting the String back.

The matter is we can't believe outside bytes(because it can be modifyed outside). So we have to go through two conversions. The diff is below:

  1. bytes -> string -> bytes
  2. bytes -> string + bytes -> inner bytes

So, I think we choose 1 here is not a problem

In the constructor of TableName, we will copy the bytes out. So let's avoid the extra converting?

@Apache9
Copy link
Contributor

Apache9 commented Dec 16, 2023

Any updates here?

Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants