HBASE-22409 update branch-1 ref guide for Hadoop, Java, and HBase version support (#239)

Signed-off-by: Andrew Purtell <[email protected]>
Signed-off-by: Peter Somogyi <[email protected]>
busbey authored and petersomogyi committed Jun 29, 2019
1 parent a0258b5 commit 7828d6a
Showing 9 changed files with 231 additions and 599 deletions.
17 changes: 13 additions & 4 deletions src/main/asciidoc/_chapters/architecture.adoc
@@ -1449,10 +1449,19 @@ Alphanumeric Rowkeys::
Using a Custom Algorithm::
The RegionSplitter tool is provided with HBase, and uses a _SplitAlgorithm_ to determine split points for you.
As parameters, you give it the algorithm, desired number of regions, and column families.
It includes two split algorithms.
The first is the `link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/RegionSplitter.HexStringSplit.html[HexStringSplit]` algorithm, which assumes the row keys are hexadecimal strings.
The second, `link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/RegionSplitter.UniformSplit.html[UniformSplit]`, assumes the row keys are random byte arrays.
You will probably need to develop your own `link:http://hbase.apache.org/apidocs/org/apache/hadoop/hbase/util/RegionSplitter.SplitAlgorithm.html[SplitAlgorithm]`, using the provided ones as models.
It includes three split algorithms.
The first is the
`link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/RegionSplitter.HexStringSplit.html[HexStringSplit]`
algorithm, which assumes the row keys are hexadecimal strings.
The second is the
`link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/RegionSplitter.DecimalStringSplit.html[DecimalStringSplit]`
algorithm, which assumes the row keys are decimal strings in the range 00000000 to 99999999.
The third,
`link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/RegionSplitter.UniformSplit.html[UniformSplit]`,
assumes the row keys are random byte arrays.
You will probably need to develop your own
`link:https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/util/RegionSplitter.SplitAlgorithm.html[SplitAlgorithm]`,
using the provided ones as models.
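The idea behind `HexStringSplit` can be illustrated outside HBase: divide a fixed-width hexadecimal keyspace into N evenly sized ranges and emit the N-1 boundary keys. The following is a conceptual sketch only, not HBase's implementation; the function name and the 8-character key width are our own assumptions for illustration.

```python
# Illustrative sketch (not HBase's HexStringSplit code): compute the
# num_regions - 1 evenly spaced split points over a fixed-width
# hexadecimal keyspace.
def hex_split_points(num_regions, width=8):
    max_key = 16 ** width  # exclusive upper bound of the keyspace
    return [
        format(max_key * i // num_regions, "0{}x".format(width))
        for i in range(1, num_regions)
    ]

# Splitting into 4 regions yields 3 boundary keys.
print(hex_split_points(4))  # ['40000000', '80000000', 'c0000000']
```

A custom `SplitAlgorithm` follows the same pattern for whatever key distribution your rowkeys actually have.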
=== Online Region Merges
301 changes: 124 additions & 177 deletions src/main/asciidoc/_chapters/configuration.adoc

Large diffs are not rendered by default.

9 changes: 7 additions & 2 deletions src/main/asciidoc/_chapters/developer.adoc
@@ -401,7 +401,7 @@ mvn -Dhadoop.profile=3.0 ...
The above will build against whatever explicit hadoop 3.y version we have in our _pom.xml_ as our '3.0' version.
Tests may not all pass so you may need to pass `-DskipTests` unless you are inclined to fix the failing tests.

To pick a particular Hadoop 3.y release, you'd set e.g. `-Dhadoop-three.version=3.0.0-alpha1`.
To pick a particular Hadoop 3.y release, you'd set the hadoop-three.version property, e.g. `-Dhadoop-three.version=3.0.0`.

[[build.protobuf]]
==== Build Protobuf
@@ -538,7 +538,12 @@ For the build to sign them for you, you need a properly configured _settings.xml_ in

[[maven.release]]
=== Making a Release Candidate
Only committers may make releases of hbase artifacts.

NOTE: These instructions are for building HBase 1.y.z.

.Point Releases
If you are making a point release (for example to quickly address a critical incompatibility or security problem) off of a release branch instead of a development branch, the tagging instructions are slightly different.
I'll prefix those special steps with _Point Release Only_.

.Before You Begin
Make sure your environment is properly set up. Maven and Git are the main tooling
30 changes: 1 addition & 29 deletions src/main/asciidoc/_chapters/getting_started.adoc
@@ -19,6 +19,7 @@
*/
////
[[getting_started]]
= Getting Started
:doctype: book
:numbered:
@@ -38,35 +39,6 @@ This is not an appropriate configuration for a production instance of HBase, but
This section shows you how to create a table in HBase using the `hbase shell` CLI, insert rows into the table, perform put and scan operations against the table, enable or disable the table, and start and stop HBase.
Apart from downloading HBase, this procedure should take less than 10 minutes.
.Local Filesystem and Durability
WARNING: _The following is fixed in HBase 0.98.3 and beyond. See link:https://issues.apache.org/jira/browse/HBASE-11272[HBASE-11272] and link:https://issues.apache.org/jira/browse/HBASE-11218[HBASE-11218]._
Using HBase with a local filesystem does not guarantee durability.
The HDFS local filesystem implementation will lose edits if files are not properly closed.
This is very likely to happen when you are experimenting with new software, starting and stopping the daemons often and not always cleanly.
You need to run HBase on HDFS to ensure all writes are preserved.
Running against the local filesystem is intended as a shortcut to get you familiar with how the general system works, as the very first phase of evaluation.
See link:https://issues.apache.org/jira/browse/HBASE-3696[HBASE-3696] and its associated issues for more details about the issues of running on the local filesystem.
[[loopback.ip]]
.Loopback IP - HBase 0.94.x and earlier
NOTE: _The below advice is for hbase-0.94.x and older versions only. This is fixed in hbase-0.96.0 and beyond._
In HBase 0.94.x and earlier, HBase expected the loopback IP address to be 127.0.0.1. Ubuntu and some other distributions default to 127.0.1.1, and this will cause problems for you. See link:http://devving.com/?p=414[Why does HBase care about /etc/hosts?] for details.
.Example /etc/hosts File for Ubuntu
====
The following _/etc/hosts_ file works correctly for HBase 0.94.x and earlier, on Ubuntu. Use this as a template if you run into trouble.
[listing]
----
127.0.0.1 localhost
127.0.0.1 ubuntu.ubuntu-domain ubuntu
----
====
=== JDK Version Requirements
HBase requires that a JDK be installed.
42 changes: 25 additions & 17 deletions src/main/asciidoc/_chapters/ops_mgt.adoc
@@ -44,13 +44,16 @@ Some commands, such as `version`, `pe`, `ltt`, `clean`, are not available in pre
$ bin/hbase
Usage: hbase [<options>] <command> [<args>]
Options:
--config DIR Configuration direction to use. Default: ./conf
--hosts HOSTS Override the list in 'regionservers' file
--config DIR Configuration direction to use. Default: ./conf
--hosts HOSTS Override the list in 'regionservers' file
--auth-as-server Authenticate to ZooKeeper using servers configuration
Commands:
Some commands take arguments. Pass no args or -h for usage.
shell Run the HBase shell
hbck Run the hbase 'fsck' tool
snapshot Tool for managing snapshots
snapshotinfo Tool for dumping snapshot information
wal Write-ahead-log analyzer
hfile Store file analyzer
zkcli Run the ZooKeeper shell
@@ -64,8 +67,10 @@ Some commands take arguments. Pass no args or -h for usage.
clean Run the HBase clean up script
classpath Dump hbase CLASSPATH
mapredcp Dump CLASSPATH entries required by mapreduce
completebulkload Run LoadIncrementalHFiles tool
pe Run PerformanceEvaluation
ltt Run LoadTestTool
canary Run the Canary tool
version Print the version
CLASSNAME Run the class named CLASSNAME
----
@@ -81,20 +86,29 @@ To see the usage, use the `--help` parameter.
----
$ ${HBASE_HOME}/bin/hbase canary -help
Usage: bin/hbase org.apache.hadoop.hbase.tool.Canary [opts] [table1 [table2]...] | [regionserver1 [regionserver2]..]
Usage: hbase org.apache.hadoop.hbase.tool.Canary [opts] [table1 [table2]...] | [regionserver1 [regionserver2]..]
where [opts] are:
-help Show this help and exit.
-regionserver replace the table argument to regionserver,
which means to enable regionserver mode
-allRegions Tries all regions on a regionserver,
only works in regionserver mode.
-zookeeper Tries to grab zookeeper.znode.parent
on each zookeeper instance
-daemon Continuous check at defined intervals.
-permittedZookeeperFailures <N> Ignore first N failures when attempting to
connect to individual zookeeper nodes in the ensemble
-interval <N> Interval between checks (sec)
-e Use region/regionserver as regular expression
which means the region/regionserver is regular expression pattern
-e Use table/regionserver as regular expression
which means the table/regionserver is regular expression pattern
-f <B> stop whole program if first error occurs, default is true
-t <N> timeout for a check, default is 600000 (milliseconds)
-t <N> timeout for a check, default is 600000 (millisecs)
-writeTableTimeout <N> write timeout for the writeTable, default is 600000 (millisecs)
-readTableTimeouts <tableName>=<read timeout>,<tableName>=<read timeout>, ... comma-separated list of read timeouts per table (no spaces), default is 600000 (millisecs)
-writeSniffing enable the write sniffing in canary
-treatFailureAsError treats read / write failure as error
-writeTable The table used for write sniffing. Default is hbase:canary
-Dhbase.canary.read.raw.enabled=<true/false> Use this flag to enable or disable raw scan during read canary test Default is false and raw is not enabled during scan
-D<configProperty>=<value> assigning or override the configuration params
----
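The `-readTableTimeouts` argument packs per-table read timeouts into a single comma-separated list of `table=millis` pairs. A hypothetical parser for that format (not the Canary's actual code; the function name and the fallback behavior are our own, with the 600000 ms default taken from the usage text above) might look like:

```python
# Hypothetical helper, not Canary source: parse a -readTableTimeouts
# value like "t1=60000,t2=120000" into a dict of per-table timeouts,
# falling back to the documented 600000 ms default when no value is given.
DEFAULT_TIMEOUT_MS = 600000

def parse_read_table_timeouts(arg):
    timeouts = {}
    for pair in arg.split(","):
        table, _, millis = pair.partition("=")
        timeouts[table] = int(millis) if millis else DEFAULT_TIMEOUT_MS
    return timeouts

print(parse_read_table_timeouts("t1=60000,t2=120000"))
# {'t1': 60000, 't2': 120000}
```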
@@ -107,6 +121,7 @@ private static final int USAGE_EXIT_CODE = 1;
private static final int INIT_ERROR_EXIT_CODE = 2;
private static final int TIMEOUT_ERROR_EXIT_CODE = 3;
private static final int ERROR_EXIT_CODE = 4;
private static final int FAILURE_EXIT_CODE = 5;
----
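A monitoring wrapper around the Canary typically branches on these exit codes. In this sketch only the numeric codes 1 through 5 come from the snippet above; the zero-means-success convention, the function name, and the message strings are our own assumptions.

```python
# Sketch of a monitoring-side classifier for Canary exit codes.
# Codes 1-5 are taken from the constants shown above; 0 as success
# is assumed from the usual Unix convention.
CANARY_EXIT_CODES = {
    0: "ok",
    1: "usage error",
    2: "initialization error",
    3: "timeout",
    4: "check error",
    5: "check failure",
}

def classify_canary_exit(code):
    return CANARY_EXIT_CODES.get(code, "unknown exit code {}".format(code))

print(classify_canary_exit(3))  # timeout
```

A cron job or alerting script could call this on the Canary's return status and page only on the codes it cares about.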
Here are some examples based on the following given case.
@@ -802,27 +817,20 @@ Options:

=== `hbase pe`

The `hbase pe` command is a shortcut provided to run the `org.apache.hadoop.hbase.PerformanceEvaluation` tool, which is used for testing.
The `hbase pe` command was introduced in HBase 0.98.4.
The `hbase pe` command runs the PerformanceEvaluation tool, which is used for testing.

The PerformanceEvaluation tool accepts many different options and commands.
For usage instructions, run the command with no options.

To run PerformanceEvaluation prior to HBase 0.98.4, issue the command `hbase org.apache.hadoop.hbase.PerformanceEvaluation`.

The PerformanceEvaluation tool has received many updates in recent HBase releases, including support for namespaces, support for tags, cell-level ACLs and visibility labels, multiget support for RPC calls, increased sampling sizes, an option to randomly sleep during testing, and ability to "warm up" the cluster before testing starts.

=== `hbase ltt`

The `hbase ltt` command is a shortcut provided to run the `org.apache.hadoop.hbase.util.LoadTestTool` utility, which is used for testing.
The `hbase ltt` command was introduced in HBase 0.98.4.
The `hbase ltt` command runs the LoadTestTool utility, which is used for testing.

You must specify either `-write` or `-update-read` as the first option.
You must specify one of `-write`, `-update`, or `-read` as the first option.
For general usage instructions, pass the `-h` option.

To run LoadTestTool prior to HBase 0.98.4, issue the command +hbase
org.apache.hadoop.hbase.util.LoadTestTool+.

The LoadTestTool has received many updates in recent HBase releases, including support for namespaces, support for tags, cell-level ACLS and visibility labels, testing security-related features, ability to specify the number of regions per server, tests for multi-get RPC calls, and tests relating to replication.

[[ops.regionmgt]]
@@ -885,7 +893,7 @@ See <<lb,lb>> below.
[NOTE]
====
In hbase-2.0, in the bin directory, we added a script named _considerAsDead.sh_ that can be used to kill a regionserver.
Hardware issues could be detected by specialized monitoring tools before the zookeeper timeout has expired. _considerAsDead.sh_ is a simple function to mark a RegionServer as dead.
Hardware issues could be detected by specialized monitoring tools before the zookeeper timeout has expired. _considerAsDead.sh_ is a simple function to mark a RegionServer as dead.
It deletes all the znodes of the server, starting the recovery process.
Plug in the script into your monitoring/fault detection tools to initiate faster failover.
Be careful how you use this disruptive tool.
35 changes: 35 additions & 0 deletions src/main/asciidoc/_chapters/preface.adoc
@@ -61,4 +61,39 @@ Please use link:https://issues.apache.org/jira/browse/hbase[JIRA] to report non-
To protect existing HBase installations from new vulnerabilities, please *do not* use JIRA to report security-related bugs. Instead, send your report to the mailing list [email protected], which allows anyone to send messages, but restricts who can read them. Someone on that list will contact you to follow up on your report.

[hbase_supported_tested_definitions]
.Support and Testing Expectations

The phrases /supported/, /not supported/, /tested/, and /not tested/ occur several
places throughout this guide. In the interest of clarity, here is a brief explanation
of what is generally meant by these phrases, in the context of HBase.

NOTE: Commercial technical support for Apache HBase is provided by many Hadoop vendors.
This is not the sense in which the term /support/ is used in the context of the
Apache HBase project. The Apache HBase team assumes no responsibility for your
HBase clusters, your configuration, or your data.

Supported::
In the context of Apache HBase, /supported/ means that HBase is designed to work
in the way described, and deviation from the defined behavior or functionality should
be reported as a bug.

Not Supported::
In the context of Apache HBase, /not supported/ means that a use case or use pattern
is not expected to work and should be considered an antipattern. If you think this
designation should be reconsidered for a given feature or use pattern, file a JIRA
or start a discussion on one of the mailing lists.

Tested::
In the context of Apache HBase, /tested/ means that a feature is covered by unit
or integration tests, and has been proven to work as expected.

Not Tested::
In the context of Apache HBase, /not tested/ means that a feature or use pattern
may or may not work in a given way, and may or may not corrupt your data or cause
operational issues. It is an unknown, and there are no guarantees. If you can provide
proof that a feature designated as /not tested/ does work in a given way, please
submit the tests and/or the metrics so that other users can gain certainty about
such features or use patterns.

:numbered:
