HBASE-22460 : Reopen regions with very high Store Ref Counts #600
for (ServerMetrics serverMetrics : serverMetricsMap.values()) {
  Map<byte[], RegionMetrics> regionMetricsMap = serverMetrics.getRegionMetrics();
  for (RegionMetrics regionMetrics : regionMetricsMap.values()) {
    final int regionStoreRefCount = regionMetrics.getStoreRefCount();

This count is the sum of the ref counts of all HFiles under each CF in this region, right? So the sum does not only reflect many refs on one file; it also grows with the number of HFiles in the region. What if the region has many files at a time, each with, say, fewer than 10 active refs? So 250 is obviously not a correct number, IMO. Is there a way to see refs per HFile? There is no need to know the ref count of each HFile; the max may be enough. I believe this decision should be based on a high ref count on a single file rather than on the sum. Thoughts?
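A minimal sketch of the concern raised above, using plain integers rather than HBase types. The class name `StoreRefCountDemo`, the per-file counts, and the threshold of 256 (taken from the config value under review) are illustrative assumptions, not code from the PR.

```java
import java.util.Arrays;
import java.util.List;

class StoreRefCountDemo {
  /** Sum of ref counts over every store file in a region (what a summed metric reports). */
  static int sumRefCount(List<Integer> refCountsPerFile) {
    int sum = 0;
    for (int count : refCountsPerFile) {
      sum += count;
    }
    return sum;
  }

  /** Largest ref count held by any single store file in the region. */
  static int maxRefCount(List<Integer> refCountsPerFile) {
    int max = 0;
    for (int count : refCountsPerFile) {
      max = Math.max(max, count);
    }
    return max;
  }

  public static void main(String[] args) {
    int threshold = 256; // assumed threshold for this illustration

    // Hypothetical healthy region: 40 store files with 8 active refs each.
    Integer[] healthy = new Integer[40];
    Arrays.fill(healthy, 8);
    List<Integer> manyFiles = Arrays.asList(healthy);

    // The sum (320) crosses the threshold although no single file is suspicious.
    System.out.println(sumRefCount(manyFiles) > threshold); // prints true
    System.out.println(maxRefCount(manyFiles) > threshold); // prints false
  }
}
```

With the summed metric, a region that merely has many files would be flagged; with the per-file maximum, only a file holding an unusually large number of references would be.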
@@ -149,6 +165,26 @@ protected Flow executeFromState(MasterProcedureEnv env, ReopenTableRegionsState
    }
  }

  private List<HRegionLocation> prepareRegionsForReopen(MasterProcedureEnv env) {
    List<HRegionLocation> regionsToReopenList = new ArrayList<>();
    List<HRegionLocation> tableRegionsForReopen = env.getAssignmentManager()

Better to pass this List to the private method than to pass env.
@@ -149,6 +165,26 @@ protected Flow executeFromState(MasterProcedureEnv env, ReopenTableRegionsState
    }
  }

  private List<HRegionLocation> prepareRegionsForReopen(MasterProcedureEnv env) {
    List<HRegionLocation> regionsToReopenList = new ArrayList<>();

Can we avoid terms like List/Map in the variable name?
@@ -154,4 +154,9 @@ default String getNameAsString() {
   * @return the reference count for the stores of this region
   */
  int getStoreRefCount();

  /**
   * @return the max reference count for any store among all stores of this region

Is this the max ref count on a Store or on a StoreFile? The latter, right? Please change the javadoc and the method name accordingly; the name will then be very clear.

And the name should be corrected accordingly in all related places.
@@ -358,6 +360,9 @@ public String toString() {
    for (RegionMetrics r : getRegionMetrics().values()) {
      storeCount += r.getStoreCount();
      storeFileCount += r.getStoreFileCount();
      int currentStoreRefCount = r.getStoreRefCount();
      storeRefCount += currentStoreRefCount;
      maxStoreRefCount = Math.max(maxStoreRefCount, currentStoreRefCount);

RegionMetrics should also expose maxStoreFileRefCount, and that should be used here.
<name>hbase.regions.recovery.store.count</name>
<value>256</value>
<description>
  Store Ref Count threshold value considered

Please correct this description accordingly.

private static final String REGIONS_RECOVERY_INTERVAL =
    "hbase.master.regions.recovery.interval";
private static final String STORE_REF_COUNT_THRESHOLD = "hbase.regions.recovery.store.count";

This is the store file ref count for recovery. Please change the config name so that it indicates this.

    "Error reopening regions with high storeRefCount. ";

private final HMaster hMaster;
private final int storeRefCountThreshold;

This variable name too.
@Override
protected void chore() {
  if (LOG.isTraceEnabled()) {
    LOG.trace("Starting up Regions Recovery by reopening regions based on storeRefCount...");

Starting up Regions Recovery chore for reopening .......
  LOG.error("Error while reopening regions based on storeRefCount threshold", e);
}
if (LOG.isTraceEnabled()) {
  LOG.trace("Exiting Regions Recovery by reopening regions based on storeRefCount...");

This log message needs to be corrected.
// is beyond a threshold value, we should reopen the region.
// Here, we take max ref count of all stores and not the cumulative count
// of all stores.
final int maxStoreRefCount = regionMetrics.getMaxStoreRefCount();

Please use maxStoreFileRefCount in all places.
// For each region, each store file can have different ref counts
// We need to find maximum of all such ref counts and if that max count
// is beyond a threshold value, we should reopen the region.
// Here, we take max ref count of all stores and not the cumulative count

Is this the max count on a store or on a store file? Please make it consistent. The line above says it is per store file, and that is ideally what is needed too. I doubt that is what we are doing, though. I think we are actually passing the max store ref count (the sum of the ref counts of all store files under that store).
// Here, we take max ref count of all stores and not the cumulative count
// of all stores.
final int maxStoreRefCount = regionMetrics.getMaxStoreRefCount();
if (maxStoreRefCount > storeRefCountThreshold) {

Should we treat system tables differently? The META table might sometimes get a very large ref count. It is possible: maybe many new clients are started or restarted, or many regions move or split. What if it had a huge ref count at the time of reporting? Reopening META will affect the whole system! Anyway, the default count of 256 (as of now) is far too low for any kind of region, IMO.

Sure, better that we exclude META.
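The exclusion agreed on above could be sketched as follows. This is not the PR's code: `RecoveryCandidateFilter` and `shouldReopen` are hypothetical names, and table names are modelled as plain strings so the example stays self-contained. Only the name `hbase:meta` comes from HBase itself.

```java
class RecoveryCandidateFilter {
  // Name of the HBase catalog table; represented as a plain String in this sketch.
  static final String META_TABLE = "hbase:meta";

  /**
   * Decides whether a region is a reopen candidate.
   * Regions of the META table are never selected, whatever their ref count.
   */
  static boolean shouldReopen(String tableName, int maxStoreFileRefCount, int threshold) {
    if (META_TABLE.equals(tableName)) {
      return false; // reopening META would affect the whole cluster
    }
    return maxStoreFileRefCount > threshold;
  }

  public static void main(String[] args) {
    System.out.println(shouldReopen("hbase:meta", 5000, 256));    // prints false
    System.out.println(shouldReopen("default:orders", 300, 256)); // prints true
  }
}
```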
Updated PR with all the suggestions, started using store file's max refCount
@@ -358,6 +360,8 @@ public String toString() {
    for (RegionMetrics r : getRegionMetrics().values()) {
      storeCount += r.getStoreCount();
      storeFileCount += r.getStoreFileCount();
      storeRefCount += r.getStoreRefCount();
      maxStoreFileRefCount += r.getMaxStoreFileRefCount();

You should do max() here.

This is an addition of all counts for toString(), right? I thought of doing max(), but I felt all these counts should be sums over all regionMetrics, for the sake of toString() only. Still, I agree it is better to take the max of all storeFileRefCount values.
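A small sketch of the aggregation discussed above, assuming a hypothetical `RegionReport` holder instead of the real RegionMetrics interface: totals are accumulated with `+=`, while the maximum is folded with `Math.max`, so the server-level value stays a true per-file maximum.

```java
import java.util.Arrays;
import java.util.List;

class ServerRefCountSummary {
  /** Hypothetical per-region report: total store refs and max refs on any one store file. */
  static final class RegionReport {
    final int storeRefCount;
    final int maxStoreFileRefCount;

    RegionReport(int storeRefCount, int maxStoreFileRefCount) {
      this.storeRefCount = storeRefCount;
      this.maxStoreFileRefCount = maxStoreFileRefCount;
    }
  }

  int storeRefCount;        // summed across regions
  int maxStoreFileRefCount; // maximum across regions, never summed

  static ServerRefCountSummary of(List<RegionReport> regions) {
    ServerRefCountSummary summary = new ServerRefCountSummary();
    for (RegionReport r : regions) {
      summary.storeRefCount += r.storeRefCount;
      summary.maxStoreFileRefCount =
          Math.max(summary.maxStoreFileRefCount, r.maxStoreFileRefCount);
    }
    return summary;
  }

  public static void main(String[] args) {
    ServerRefCountSummary s = of(Arrays.asList(
        new RegionReport(10, 4), new RegionReport(30, 25), new RegionReport(5, 5)));
    System.out.println(s.storeRefCount + " " + s.maxStoreFileRefCount); // prints 45 25
  }
}
```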
<description>
  Regions Recovery Chore interval in milliseconds.
  This chore keeps running at this interval to
  find all regions with high store ref count and

Say: above a configurable max value.
@@ -1901,4 +1901,24 @@ possible configurations would overwhelm and obscure the important.
      automatically deleted until it is manually deleted
    </description>
  </property>
  <property>
    <name>hbase.master.regions.recovery.interval</name>

Name it something like hbase.master.regions.recovery.check.interval? Is that a better name?
int maxStoreFileRefCount = 0;
Collection<HStoreFile> hStoreFiles = this.storeEngine.getStoreFileManager()
    .getStorefiles();
for (HStoreFile storeFile : hStoreFiles) {

return this.storeEngine.getStoreFileManager().getStorefiles().stream()
    .filter(sf -> sf.getReader() != null).filter(HStoreFile::isHFile)
    .mapToInt(HStoreFile::getRefCount).max()
??

Yes, it is the same thing but with streams, so it is better.
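A self-contained sketch of the stream form suggested above. `FileInfo` is a hypothetical stand-in for a store file (it is not HBase's HStoreFile), and the default of 0 for an empty store is an assumption. One detail worth noting: `IntStream.max()` returns an `OptionalInt`, so a method returning `int` needs something like `orElse(...)`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MaxStoreFileRefCount {
  /** Hypothetical stand-in for a store file; only the fields the filter needs. */
  static final class FileInfo {
    final boolean hasReader;
    final boolean isHFile;
    final int refCount;

    FileInfo(boolean hasReader, boolean isHFile, int refCount) {
      this.hasReader = hasReader;
      this.isHFile = isHFile;
      this.refCount = refCount;
    }
  }

  /** Max ref count over files that have a reader and are HFiles; 0 if none qualify. */
  static int maxRefCount(List<FileInfo> files) {
    return files.stream()
        .filter(f -> f.hasReader)
        .filter(f -> f.isHFile)
        .mapToInt(f -> f.refCount)
        .max()      // OptionalInt: empty when no file passes the filters
        .orElse(0); // assumed default for an empty store
  }

  public static void main(String[] args) {
    List<FileInfo> files = Arrays.asList(
        new FileInfo(true, true, 3),
        new FileInfo(false, true, 900), // no reader: ignored
        new FileInfo(true, true, 12));
    System.out.println(maxRefCount(files));                              // prints 12
    System.out.println(maxRefCount(Collections.<FileInfo>emptyList())); // prints 0
  }
}
```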
</property>
<property>
  <name>hbase.regions.recovery.store.file.count</name>
  <value>128</value>

Maybe a default of 1000 or so would be better.

Yes, to be on the safe side, leave the META region out of this for now?
Force-pushed: f58691f to 86a1644
@@ -1471,6 +1471,13 @@
  // User defined Default TTL config key
  public static final String DEFAULT_SNAPSHOT_TTL_CONFIG_KEY = "hbase.master.snapshot.ttl";

  // Regions Recovery based on high storeFileRefCount threshold value
  public static final String STORE_FILE_REF_COUNT_THRESHOLD =
      "hbase.regions.recovery.store.file.count";

You missed "ref" in the config name. It is not a store file count threshold but a ref count threshold.
  </description>
</property>
<property>
  <name>hbase.regions.recovery.store.file.count</name>

Yes, change it here too.
@@ -1635,6 +1651,7 @@ private void stopChores() {
    choreService.cancelChore(this.replicationBarrierCleaner);
    choreService.cancelChore(this.snapshotCleanerChore);
    choreService.cancelChore(this.hbckChore);
    choreService.cancelChore(this.regionsRecoveryChore);

If the chore was not scheduled, this cancel call won't cause any issue, right? Just confirming.

Yes, it won't cause any issues; I confirmed it.
</property>
<property>
  <name>hbase.regions.recovery.store.file.count</name>
  <value>-1</value>

We should allow users to change this without restarting the HM, but in a follow-up issue.

You mean similar to the _switch command?

No, via the ConfigurationObserver mechanism, so that changes take effect without restarting the process.

Sure, looking forward to it in the follow-up Jira.
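The observer idea mentioned above could look roughly like this. It is only a sketch: `ConfigObserver` is a local stand-in for an observer interface, the configuration is modelled as a `Map<String, String>`, and `RecoveryChore` is a hypothetical class. The config key is the one shown in the diff under review, and -1 is its default there.

```java
import java.util.HashMap;
import java.util.Map;

class ReloadableThreshold {
  /** Local stand-in for an observer that is notified when configuration is reloaded. */
  interface ConfigObserver {
    void onConfigurationChange(Map<String, String> conf);
  }

  /** Hypothetical chore that re-reads its threshold on every configuration change. */
  static final class RecoveryChore implements ConfigObserver {
    static final String KEY = "hbase.regions.recovery.store.file.count";
    static final int DISABLED = -1; // default in the diff: feature disabled

    // volatile: in this sketch the chore thread reads it, another thread updates it
    private volatile int threshold;

    RecoveryChore(Map<String, String> conf) {
      this.threshold = read(conf);
    }

    private static int read(Map<String, String> conf) {
      String value = conf.get(KEY);
      return value == null ? DISABLED : Integer.parseInt(value);
    }

    @Override
    public void onConfigurationChange(Map<String, String> conf) {
      this.threshold = read(conf); // takes effect without restarting the process
    }

    int getThreshold() {
      return threshold;
    }
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    RecoveryChore chore = new RecoveryChore(conf);
    System.out.println(chore.getThreshold()); // prints -1

    conf.put(RecoveryChore.KEY, "1000");
    chore.onConfigurationChange(conf);
    System.out.println(chore.getThreshold()); // prints 1000
  }
}
```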
// count of all store files
final int maxStoreFileRefCount = regionMetrics.getMaxStoreFileRefCount();
// ignore store file ref count threshold <= 0 (default is -1 i.e. disabled)
if (storeFileRefCountThreshold > 0 && maxStoreFileRefCount > storeFileRefCountThreshold) {

The storeFileRefCountThreshold > 0 check is redundant here; can it be avoided?
@@ -50,15 +55,27 @@

  private TableName tableName;

  // Specify specific regions of a table to reopen.
  // if specified null, all regions of the table will be reopened.
  private final List<byte[]> regionNamesList;

Please avoid List, Map, etc. in the variable name if possible.
@@ -92,8 +109,9 @@ protected Flow executeFromState(MasterProcedureEnv env, ReopenTableRegionsState
      LOG.info("Table {} is disabled, give up reopening its regions", tableName);
      return Flow.NO_MORE_STATE;
    }
    regions =
      env.getAssignmentManager().getRegionStates().getRegionsOfTableForReopen(tableName);
    List<HRegionLocation> tableRegionsForReopen = env.getAssignmentManager()

These will be all the regions of this table, right? A better name for the variable would be tableRegions.
  env.getAssignmentManager().getRegionStates().getRegionsOfTableForReopen(tableName);
List<HRegionLocation> tableRegionsForReopen = env.getAssignmentManager()
    .getRegionStates().getRegionsOfTableForReopen(tableName);
regions = prepareRegionsForReopen(tableRegionsForReopen);

The method name is not self-explanatory. What does it prepare? It basically gets the region locations for the given region names, right?

You can even pass regionNamesList here as well.
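A sketch of what such a helper might do once both lists are passed in, as the comments above suggest. `RegionFilter`, `Location`, and `filterByNames` are hypothetical names, not the PR's code; treating only `null` as "all regions" follows the field comment quoted earlier in the review. Region names are `byte[]`, so they are compared by content with `Arrays.equals` (a plain `List.contains` would compare array identity).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class RegionFilter {
  /** Hypothetical stand-in for a region location: region name plus hosting server. */
  static final class Location {
    final byte[] regionName;
    final String server;

    Location(byte[] regionName, String server) {
      this.regionName = regionName;
      this.server = server;
    }
  }

  /** Returns the locations whose region name is in regionNames; null selects all. */
  static List<Location> filterByNames(List<Location> tableRegions, List<byte[]> regionNames) {
    if (regionNames == null) {
      return tableRegions; // no explicit selection: every region of the table
    }
    List<Location> selected = new ArrayList<>();
    for (Location location : tableRegions) {
      for (byte[] name : regionNames) {
        if (Arrays.equals(name, location.regionName)) { // compare contents, not identity
          selected.add(location);
          break;
        }
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    List<Location> all = Arrays.asList(
        new Location(new byte[] {1}, "rs1"), new Location(new byte[] {2}, "rs2"));
    List<byte[]> wanted = Collections.singletonList(new byte[] {2});
    System.out.println(filterByNames(all, wanted).get(0).server); // prints rs2
    System.out.println(filterByNames(all, null).size());          // prints 2
  }
}
```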

[[hbase.regions.recovery.store.file.count]]
*`hbase.regions.recovery.store.file.count`*::

The config name needs to be corrected here too.
+
.Description

Store files Ref Count threshold value considered

We can give a bit more detail here. A very large ref count on a file indicates a ref leak on that object. Such files cannot be removed even after they are invalidated by compaction or the like. The only way out of such a situation is to reopen the region.
Updated the PR based on latest comments. Please review @anoopsjohn
Force-pushed: 5cc5910 to 9fe22e2
Force-pushed: 9fe22e2 to dbe98f3
@apurtell @anoopsjohn The status on this so far:
Thanks
Looks good
@apurtell Please review at your convenience
* HBASE-23136 (#712)
* HBASE-23136 PartionedMobFileCompactor bulkloaded files shouldn't get replicated (addressing buklload replication related issue raised in HBASE-22380) Signed-off-by: Josh Elser
* HBASE-23042 Parameters are incorrect in procedures jsp (#728)
* HBASE-23172 HBase Canary region success count metrics reflect column family successes, not region successes
* HBASE-23012 Add HBase 1.3.6 to the downloads page (#738)
* HBASE-22679 : Revamping CellUtil (#735)
* HBASE-22679 : Revamping CellUtil
* checkstyle fix
* incorporating review
* minor indentation
* HBASE-22460 : Reopen regions with very high Store Ref Counts (#600) Signed-off-by Anoop Sam John
* HBASE-15519 Add per-user metrics with lossy counting Introducing property hbase.regionserver.user.metrics.enabled(Default:true) to disable user metrics in case it accounts for any performance issues Close #661 Signed-off-by: Josh Elser
* HBASE-23207 Log a region open journal (#751) Signed-off-by: Abhishek Singh Chouhan
* HBASE-23073 Add an optional costFunction to balance regions according to a capacity rule (#677) Signed-off-by: Wellington Chevreuil
* HBASE-23012 Addendum - Update 1.3.6 Release date in the downloads page
* HBASE-23203 NPE in RSGroup info (#747) Signed-off-by: Duo Zhang
* HBASE-23194 Remove unused methods from TokenUtil (#737) Signed-off-by: Peter Somogyi
* HBASE-22991 add HBase 1.4.11 to the downloads page. closes #756 Signed-off-by: Jan Hentschel
* HBASE-23181 Blocked WAL archive: "LogRoller: Failed to schedule flush of XXXX, because it is not online on us" (#753) Signed-off-by: Lijin Bin Signed-off-by: stack
* HBASE-23216 Add 2.2.2 to download page (#758) Signed-off-by: Guanghao Zhang
* HBASE-15519 Add per-user metrics with lossy counting (addendum)
* HBASE-23222 MOB compaction supportability improvements
* better logging on MOB compaction process
* HFileCleanerDelegate to optionally halt removal of mob hfiles
* use archiving when removing committed mob file after bulkload ref failure closes #763 Signed-off-by: Wellington Chevreuil Signed-off-by: Balazs Meszaros
* HBASE-23199 Error populating Table-Attribute fields (#741) Signed-off-by: GuangxuCheng
* HBASE-23187 Update parent region state to SPLIT in meta (#732)
* HBASE-23187 Update parent region state to SPLIT in meta
* Revert "HBASE-23194 Remove unused methods from TokenUtil (#737)" This reverts commit d7b90b319908113bb90ae871cf4a5843bbf6bbaa.
* HBASE-23208 Unit formatting in Master & RS UI Signed-off-by: binlijin Signed-off-by: Sean Busbey
* HBASE-23227 Upgrade jackson-databind to 2.9.10.1 Signed-off-by: Sean Busbey
* HBASE-20827 Use backoff on CallQueueTooBigException when reporting region state transition Signed-off-by: Josh Elser
* HBASE-23191 EOFE log spam (#733) Convert log message added for 2.2.0 from INFO to DEBUG. Signed-off-by: stack
* HBASE-22917 Proc-WAL roll fails saying someone else has already created log (#544) Signed-off-by: stack
* HBASE-23192 CatalogJanitor consistencyCheck does not log problematic row on exception (#734) Adds logging of row and complaint if consistency check fails during CJ checking. Adds a few more null checks. Does edit on the 'HBCK Report' top line. Signed-off-by: Reid Chan
* HBASE-22739 ArrayIndexOutOfBoundsException when balance (#729) Signed-off-by: stack
* HBASE-23175 Yarn unable to acquire delegation token for HBase Spark jobs
* HBASE-23231 ReplicationSource do not update metrics after refresh (#778) Signed-off-by: stack Signed-off-by: Duo Zhang
* HBASE-23231 Addendum remove redundant metrics.clear in ReplicationSource This is only a problem on master which was introduced when we merge sync replication feature branch back
* HBASE-23184 The HeapAllocation in WebUI is not accurate (#730) Signed-off-by: stack
* HBASE-23221 Polish the WAL interface after HBASE-23181 (#774) Removes the closeRegion flag added by HBASE-23181 and instead relies on reading meta WALEdit content. Modified how qualifier is written when the meta WALEdit is for a RegionEventDescriptor so the 'type' is added to the qualifer so can figure type w/o having to deserialize protobuf value content: e.g. HBASE::REGION_EVENT::REGION_CLOSE Added doc on WALEdit and tried to formalize the 'meta' WALEdit type and how it works. Needs complete redo in part as suggested by HBASE-8457. Meantime, some doc and cleanup. Also changed the LogRoller constructor to remove redundant param. Because of constructor change, need to change also TestFailedAppendAndSync, TestWALLockup, TestAsyncFSWAL & WALPerformanceEvaluation.java Signed-off-by: Duo Zhang Signed-off-by: Lijin Bin
* Revert "HBASE-22917 Proc-WAL roll fails saying someone else has already created log (#544)" This reverts commit 538a4c51ff8464f57e26cc22f9c559c1a30b5864.
* HBASE-23238 Additional test and checks for null references on ScannerCallableWithReplicas (#780) Signed-off-by: Sean Busbey (cherry picked from commit 577db5d7e50c56b4773c9ce92b807aae80bf5706 - Test only, removed changes from no more exisiting ScannerCallableWithReplicas class)
* HBASE-23244 NPEs running Canary (#784) Signed-off-by: Viraj Jasani
* HBASE-23244 NPEs running Canary (#784) Addendum to fix findbugs complaint.
* HBASE-23241 TestExecutorService sometimes fail (#782) Signed-off-by: Wellington Chevreuil
* HBASE-23247 [hbck2] Schedule SCPs for 'Unknown Servers' (#791) Signed-off-by: Sean Busbey Signed-off-by: Duo Zhang
* HBASE-23243 [pv2] Filter out SUCCESS procedures; on decent-sized cluster, plethora overwhelms problems (#790) Signed-off-by: GuangxuCheng Signed-off-by: Sean Busbey
* HBASE-23212 : Dynamically reload configs for Region Recovery chore (#773)
* HBASE-23212 : Dynamically reload configs for Region Recovery chore
* remove redundant volatile
* HBASE-23250 Log message about CleanerChore delegate initialization should be at INFO Signed-off-by: Jan Hentschel
* HBASE-21458 Error: Could not find or load main class org.apache.hadoop.hbase.util.GetJavaProperty Signed-off-by: Sean Busbey
* HBASE-23085 Network and Data related Actions Add monkey actions: - manipulate network packages with tc (reorder, loose,...) - add CPU load - fill the disk - corrupt or delete regionserver data files Extend HBaseClusterManager to allow sudo calls. Signed-off-by: Josh Elser Signed-off-by: Balazs Meszaros
* HBASE-22980 HRegionPartioner getPartition() method incorrectly partitions the regions of the table. (#590) Signed-off-by: Guangxu Cheng
* HBASE-23230 Enforce member visibility in HRegionServer (#775)
* Clean up a bunch of private variable leakage into other classes. Reduces visibility as much as possible, providing getters where access remains necessary or making use of getters that already exist. There remains an insidious relationship between `HRegionServer` and `RSRpcServices`.
* Rename `fs` to `dataFs`, `rootDir` as `dataRootDir` so as to distinguish from the new `walFs`, `walRootDir` (and make it easier to spot bugs).
* Cleanup or delete a bunch of lack-luster javadoc comments.
* Delete a handful of methods that are unused according to static analysis.
* Reduces the warning count as reported by IntelliJ from 100 to 7. Signed-off-by: stack
* HBASE-23272 Fix link in Developer guide to "code review checklist" (#805) Signed-off-by: stack
* HBASE-22480 Get block from BlockCache once and return this block to BlockCache twice make ref count error.
* HBASE-23262 Cannot load Master UI (#799) Signed-off-by: Duo Zhang
* HBASE-23263 NPE in Quotas.jsp (#800) Signed-off-by: Guangxu Cheng
* HBASE-23257: Track clusterID in stand by masters (#798) This patch implements a simple cache that all the masters can lookup to serve cluster ID to clients. Active HMaster is still responsible for creating it but all the masters will read it from fs to serve clients. RPCs exposing it will come in a separate patch as a part of HBASE-18095. Signed-off-by: Andrew Purtell Signed-off-by: Wellington Chevreuil Signed-off-by: Guangxu Cheng
* HBASE-23236 Upgrade to yetus 0.11.1 Signed-off-by: stack
* HBASE-22888 Share some stuffs with the initial reader when new stream reader created (#581) Signed-off-by: Duo Zhang
* HBASE-23268 Remove disable/enable ops from doc when altering schema Signed-off-by: Wellington Chevreuil
* HBASE-23228 Allow for jdk8 specific modules on branch-1 in precommit/nightly testing (#804) Signed-off-by: Josh Elser
* HBASE-18439 Subclasses of o.a.h.h.chaos.actions.Action all use the same logger Signed-off-by: Jan Hentschel Signed-off-by: Guangxu Cheng
* HBASE-23251 - Add Column Family and Table Names to HFileContext and use in HFileWriterImpl logging (#796) Signed-off-by: Andrew Purtell Signed-off-by: Xu Cang Signed-off-by: Zheng Hu
* HBASE-23245 : MutableHistogram constructor changes and provide HistogramImpl maxExpected as long (#787) Signed-off-by: Andrew Purtell Signed-off-by: Xu Cang Signed-off-by: Guangxu Cheng
* HBASE-23284 Fix Hadoop wiki link in Developer guide to "Distributions and Commercial Support" Signed-off-by: Nick Dimiduk
* HBASE-23290 shell processlist command is broken
* HBASE-23283 Provide clear and consistent logging about the period of enabled chores Signed-off-by: Sean Busbey
* HBASE-19450 Add log about average execution time for ScheduledChore Signed-off-by: Sean Busbey
* HBASE-23294 ReplicationBarrierCleaner should delete all the barriers for a removed region which does not belong to any serial replication peer (#827) Signed-off-by: stack
* HBASE-19450 Addendum Limit logging of chore execution time at INFO to once per 5 minutes.
* Ensure MovingAverage related classes are IA.Private
* Move trace logging into MovingAverage class Signed-off-by: Duo Zhang
* HBASE-23102: Improper Usage of Map putIfAbsent (#828) Signed-off-by: Wellington Chevreuil
* HBASE-22969 A new binary component comparator(BinaryComponentComparator) to perform comparison of arbitrary length and position (#829) Signed-off-by: Balazs Meszaros
* HBASE-23289 Update links to Hadoop wiki in book and code Signed-off-by: Nick Dimiduk
* HBASE-23282 HBCKServerCrashProcedure for 'Unknown Servers' Have the existing scheduleRecoveries launch a new HBCKSCP instead of SCP. It gets regions to recover from Master in-memory context AND from a scan of hbase:meta. This new HBCKSCP is For processing 'Unknown Servers', servers that are 'dead' and purged but still have references in hbase:meta. Rare occurance but needs tooling to address. Later have catalogjanitor take care of these deviations between Master in-memory and hbase:meta content (usually because of overdriven cluster with failed RPCs to hbase:meta, etc) Changed expireServers in ServerManager so could pass in custom reaction to expired server.... This is how we run our custom HBCKSCP while keeping all other aspects of expiring services (rather than try replicate it externally).
* HBASE-23182 The create-release scripts are broken (#736) Signed-off-by: stack Signed-off-by: Bharath Vissapragada
* HBASE-23318 LoadTestTool doesn't start (#848)
* Package the test jar from hbase-zookeeper into lib/ Signed-off-by: Duo Zhang
* HBASE-23278 Add a table-level compaction progress display on the UI (#816) Signed-off-by: Guangxu Cheng
* HBASE-23315 Miscellaneous HBCK Report page cleanup
* Add a bit of javadoc around SerialReplicationChecker.
* Miniscule edit to the profiler jsp page and then a bit of doc on how to make it work that might help.
* Add some detail if NPE getting BitSetNode to help w/ debug.
* Change HbckChore to log region names instead of encoded names; helps doing diagnostics; can take region name and query in shell to find out all about the region according to hbase:meta.
* Add some fix-it help inline in the HBCK Report page – how to fix.
* Add counts in procedures page so can see if making progress; move listing of WALs to end of the page.
* HBASE-23308: Review of NullPointerExceptions (#836) Signed-off-by: stack
* HBASE-22607. TestExportSnapshotNoCluster fails intermittently
* HBASE-23321 [hbck2] fixHoles of fixMeta doesn't update in-memory state
* HBASE-23259: Populate master address end points in cluster/rs configs (#807) All the clients need to know the master RPC end points while using master based registry for creating cluster connections. This patch amends the test cluster utility to populate these configs in the base configuration object used to spin up the cluster. The config key added here ("hbase.master.addrs") is used in the subsequent patches for HBASE-18095. Signed-off-by: Nick Dimiduk
* HBASE-23322 [hbck2] Simplification on HBCKSCP scheduling Signed-off-by: Lijin Bin
* HBASE-23223 Support the offsetLock of bucketCache to use strong ref (#764) Signed-off-by: Wellington Chevreuil
* HBASE-23234 Provide .editorconfig based on checkstyle configuration (#846) This file is generated using IntelliJ, following these steps: #. Open Preferences > Editor > Code Style #. Select (config wheel) > Import Schema > CheckStyle Configuration #. Select `hbase-checkstyle/src/main/resources/hbase/checkstyle.xml` #. Select (config wheel) > Export > EditorConfig File Signed-off-by: Sean Busbey
* HBASE-23237 Prevent Negative values in metrics requestsPerSecond Closes #834 Signed-off-by: Guangxu Cheng Signed-off-by: Josh Elser
* HBASE-23325 [UI]rsgoup average load keep two decimals (#860) Signed-off-by: Guangxu Cheng
* HBASE-23328 info:regioninfo goes wrong when region replicas enabled (#863) Signed-off-by: Ramkrishna Signed-off-by: Guangxu Cheng
* HBASE-23329 Remove unused methods from RequestConverter (#870) Signed-off-by: Sean Busbey
* HBASE-23307 Add running of ReplicationBarrierCleaner to hbck2 fixMeta invocation (#859) Signed-off-by: Lijin Bin
* HBASE-22969 A new binary component comparator(BinaryComponentComparator) to perform comparison of arbitrary length and position; ADDENDUM (#869) Signed-off-by: Sean Busbey
* HBASE-23085 Network and Data related Actions; ADDENDUM (#871) Fix percentage in String.format Signed-off-by: Sean Busbey
* HBASE-23197 'IllegalArgumentException: Wrong FS' on edits replay when… (#740) Signed-off-by: Josh Elser
* HBASE-23334 The table-lock node of zk is not needed since HBASE-16786 (#873) Signed-off-by: Guangxu Cheng
* HBASE-23312 HBase Thrift SPNEGO configs (HBASE-19852) should be backwards compatible HBase Thrift SPNEGO configs should not be required. The `hbase.thrift.spnego.keytab.file` and `hbase.thrift.spnego.principal` configs should fall back to the `hbase.thrift.keytab.file` and `hbase.thrift.kerberos.principal` configs. This will avoid any issues during upgrades. Signed-off-by: Josh Elser Amending-author: Josh Elser Closes #850
* HBASE-23293 [REPLICATION] make ship edits timeout configurable (#825) Signed-off-by: Guangxu Cheng Signed-off-by: Wellington Chevreuil
* HBASE-23336 [CLI] Incorrect row(s) count 'clear_deadservers' (#875) Signed-off-by: Guangxu Cheng Signed-off-by: Lijin Bin
* HBASE-20395 Displaying thrift server type on the thrift page (#811) Signed-off-by: Lijin Bin Signed-off-by: Bharath Vissapragada
* HBASE-23117: Bad enum in hbase:meta info:state column can fail loadMeta and stop startup (#867)
* Handling the BAD value in info:state columns in hbase:meta
* Adding a unit test and region encoded name in the LOG
* Adding a null check for region state to complete the test scenario and fixing the nit Signed-off-by: Wellington Chevreuil Signed-off-by: GuangxuCheng Signed-off-by: stack
* HBASE-23313 [hbck2] setRegionState should update Master in-memory sta… (#864) Signed-off-by: Mingliang Liu Signed-off-by: stack
* HBASE-20395 Addendum Displaying thrift server type on the thrift page
* HBASE-23323 Update downloads page for Apache HBase 1.4.12. (#886) Signed-off-by: stack
* HBASE-23352: Allow chaos monkeys to access cmd line params, and improve FillDiskCommandAction (#885) Instead of using the default properties when checking for monkey properties, now we use the ones already extended with command line params. Change FillDiskCommandAction to try to stop the remote process if the command failed with an exception.
Signed-off-by: stack <[email protected]> * HBASE-23342 : Handle NPE while closing compressingStream (#877) Signed-off-by Anoop Sam John <[email protected]> * HBASE-23298 Refactor LogRecoveredEditsOutputSink and BoundedLogWriterCreationOutputSink (#832) Signed-off-by: Duo Zhang <[email protected]> * HBASE-23335 : Improving cost functions array copy in StochasticLoadBalancer (#874) * HBASE-23337 Release scripts should rely on maven for deploy. (#887) - switch to nexus-staging-maven-plugin for asf-release - refactor release-build to use mvn deploy and its output. - cleaned up some tabs in the root pom Signed-off-by: stack <[email protected]> * HBASE-23345 Table need to replication unless all of cfs are excluded (#881) Signed-off-by: stack <[email protected]> Signed-off-by: Guanghao Zhang <[email protected]> * HBASE-22096 /storeFile.jsp shows CorruptHFileException when the storeFile is a reference file (#888) Signed-off-by: Lijin Bin <[email protected]> * HBASE-23356 When construct StoreScanner throw exceptions it is possible to left some KeyValueScanner not closed. (#891) Signed-off-by: GuangxuCheng <[email protected]> * HBASE-23357 Add 2.1.8 to download page (#892) Signed-off-by: Guanghao Zhang <[email protected]> * HBASE-23362: [WalPrettyPrinter] print/filter by table name. (#898) Signed-off-by: Wellington Chevreuil <[email protected]> * HBASE-22529 Add sanity check for in-memory compaction policy Signed-off-by: Toshihiro Suzuki <[email protected]> * HBASE-23365 Minor change MemStoreFlusher's log (#900) Signed-off-by: GuangxuCheng <[email protected]> Signed-off-by: Xu Cang <[email protected]> * HBASE-23361 Limit two decimals in total average load (#897) Signed-off-by: GuangxuCheng <[email protected]> Signed-off-by: Xu Cang <[email protected]> * HBASE-23367 Remove unused constructor from WALPrettyPrinter (#901) Signed-off-by: Lijin Bin <[email protected]> Signed-off-by: Xu Cang <[email protected]> * HBASE-23364 HRegionServer sometimes does not shut down. 
* HBASE-23373 Log `RetriesExhaustedException` context with full time precision (#903) Signed-off-by: Lijin Bin <[email protected]> * HBASE-23309: Adding the flexibility to ChainWalEntryFilter to filter the whole entry if all cells get filtered (#837) HBASE-23309: Adding the flexibility to ChainWalEntryFilter to filter the whole entry if all cells get filtered * HBASE-23303 Add security headers to REST server/info page (#843) Signed-off-by: Toshihiro Suzuki <[email protected]> Signed-off-by: Sean Busbey <[email protected]> * HBASE-22280 Separate read/write handler for priority request(especial… (#202) Signed-off-by: Yu Li <[email protected]> * HBASE-18382 add transport type info into Thrift UI (#880) Signed-off-by: Wellington Chevreuil <[email protected]> Signed-off-by: Bharath Vissapragada <[email protected]> Signed-off-by: Viraj Jasani <[email protected]> * HBASE-23377 Balancer should skip disabled tables's regions (#908) Signed-off-by: Wellington Chevreuil <[email protected]> Signed-off-by: GuangxuCheng <[email protected]> Signed-off-by Anoop Sam John <[email protected]> Signed-off-by: Reid Chan <[email protected]> * HBASE-23066 Allow cache on write during compactions when prefetching is (#919) Contributed by : Jacob LeBlanc * HBASE-23552 Format Javadocs on ITBLL We have this nice description in the java doc on ITBLL but it's unformatted and thus illegible. Add some formatting so that it can be read by humans. Signed-off-by: Jan Hentschel <[email protected]> Signed-off-by: Josh Elser <[email protected]> * HBASE-23379 Clean Up FSUtil getRegionLocalityMappingFromFS Signed-off-by: Jan Hentschel <[email protected]> * HBASE-23556: Minor ChoreService Cleanup (#927) * HBASE-23553 Snapshot referenced data files are deleted in some case (#922) * HBASE-23360 [CLI] Fix help command 'set_quota' for removing limits (#896) * Revert "HBASE-23066 Allow cache on write during compactions when prefetching is (#919)" TestCacheOnWrite failing all the time. 
This reverts commit d561130e82c5956f0dd9fff5c6f6240a686d3d6a. * HBASE-23554 Encoded regionname to regionname utility (#923) Adds shell command regioninfo: hbase(main):001:0> regioninfo '0e6aa5c19ae2b2627649dc7708ce27d0' {ENCODED => 0e6aa5c19ae2b2627649dc7708ce27d0, NAME => 'TestTable,,1575941375972.0e6aa5c19ae2b2627649dc7708ce27d0.', STARTKEY => '', ENDKEY => '00000000000000000000299441'} Took 0.4737 seconds Signed-off-by: Sean Busbey <[email protected]> Signed-off-by: Duo Zhang <[email protected]> * HBASE-23554 Encoded regionname to regionname utility (#923); ADDENDUM * HBASE-23555 TestQuotaThrottle is broken (#924) * HBASE-23566: Fix package/packet terminology problem in chaos monkeys (#933) s/package/packet/g Signed-off-by: Sean Busbey <[email protected]> * HBASE-23570 Point users to the async-profiler home page if diagrams are coming up blank (#937) * HBASE-23575 Remove dead code in AsyncRegistry (#929) Removes a bunch of dead code and fixes some checkstyle nits. Signed-off-by: Viraj Jasani <[email protected]> Signed-off-by: Sean Busbey <[email protected]> * HBASE-23380 General cleanup of FSUtil (#912) * Clean up JavaDocs * Clean up logging and error messages * Remove superfluous code * Replace static code with library call * Do not swallow Interrupted Exceptions * Use try-with-resources * User multi-Exception catches to reduce code size Signed-off-by: Jan Hentschel <[email protected]> Signed-off-by: Sean Busbey <[email protected]> * HBASE-22920 github pr testing job should use dev-support script (#883) Signed-off-by: stack <[email protected]> Signed-off-by: Sean Busbey <[email protected]> Signed-off-by: Bharath Vissapragada <[email protected]> * HBASE-23582 Unbalanced braces in string representation of table descriptor Signed-off-by: Lijin Bin <[email protected]> Signed-off-by: Jan Hentschel <[email protected]> * HBASE-23066 Allow cache on write during compactions when prefetching … (#935) * HBASE-23066 Allow cache on write during compactions when prefetching 
is enabled * Fix checkstyle issues - recommit * HBASE-23065 [hbtop] Top-N heavy hitter user and client drill downs Signed-off-by: Toshihiro Suzuki <[email protected]> Signed-off-by: Josh Elser <[email protected]> Signed-off-by: Andrew Purtell <[email protected]> * HBASE-23575 Remove dead code in AsyncRegistry (addendum) Additions to MiniHBaseCluster needed for branch-2. Signed-off-by: Jan Hentschel <[email protected]> Signed-off-by: Xu Cang <[email protected]> Signed-off-by: Sean Busbey <[email protected]> Signed-off-by: Viraj Jasani <[email protected]> * HBASE-23239 Reporting on status of backing MOB files from client-facing cells (#785) * Adds a new MapReduce job that builds a report of health of mob files * Also builds background information on mob system use * add a basic mob architecture in the ref guide to explain how mob takes the reference hfile value and finds the actual cell contents * add a troubleshooting mob section to the ref guide to explain how to do a mob reference scan. Signed-off-by: Peter Somogyi <[email protected]> * HBASE-23549 Document steps to disable MOB for a column family (#928) Signed-off-by: Peter Somogyi <[email protected]> Signed-off-by: Josh Elser <[email protected]> * HBASE-20461 Implement fsync for AsyncFSWAL (#947) Signed-off-by: stack <[email protected]> * HBASE-23376 NPE happens while replica region is moving (#906) Signed-off-by: Duo Zhang <[email protected]> * HBASE-23594 Procedure stuck due to region happen to recorded on two servers. (#953) Signed-off-by: stack <[email protected]> * HBASE-23564 RegionStates may has some expired serverinfo and make regions do not balance. 
(#930) Signed-off-by: stack <[email protected]> Signed-off-by: Wellington Chevreuil <[email protected]> * HBASE-23572 In 'HBCK Report', distringush between live, dead, and unknown servers Signed-off-by: Sean Busbey <[email protected]> * HBASE-23320 Upgrade surefire plugin to 3.0.0-M4 Signed-off-by: Peter Somogyi <[email protected]> * HBASE-23581 Creating table gets stuck when specifying an invalid split policy as METADATA (#942) Signed-off-by: Lijin Bin <[email protected]> Signed-off-by: Anoop Sam John <[email protected]> Signed-off-by: Xu Cang <[email protected]> * HBASE-23589: FlushDescriptor contains non-matching family/output combinations (#949) Signed-off-by: Duo Zhang <[email protected]> Signed-off-by: Lijin Bin <[email protected]> * HBASE-23613 ProcedureExecutor check StuckWorkers blocked by DeadServe… (#960) Signed-off-by: stack <[email protected]> Signed-off-by: Duo Zhang <[email protected]> * HBASE-23326 Implement a ProcedureStore which stores procedures in a HRegion (#941) Signed-off-by: Guanghao Zhang <[email protected]> Signed-off-by: stack <[email protected]> * HBASE-23374 ExclusiveMemHFileBlock’s allocator should not be hardcoded as ByteBuffAllocator.HEAP Signed-off-by: stack <[email protected]> Signed-off-by: Jan Hentschel <[email protected]> * HBASE-23238: Remove 'static'ness of cell counter in LimitKVsReturnFilter (addendum) (#963) Having it as static means the test cannot be parameterized (ran into this issue in HBASE-23305). That happens because the field is not reset between parameterized runs. 
* HBASE-23286 Improve MTTR: Split WAL to HFile (#820) Signed-off-by: Duo Zhang <[email protected]> * HBASE-23619 Used built-in formatting for logger in hbase-zookeeper Signed-off-by: stack <[email protected]> * Adding developer details to pom.xml * HBASE-23617 Add a stress test tool for region based procedure store (#962) Signed-off-by: stack <[email protected]> * HBASE-23621 Reduced the number of Checkstyle violations in tests of hbase-common Signed-off-by: stack <[email protected]> Signed-off-by: Viraj Jasani <[email protected]> * HBASE-23622 Reduced the number of Checkstyle violations in hbase-common Signed-off-by: stack <[email protected]> Signed-off-by: Viraj Jasani <[email protected]> * HBASE-23618 Add a tool to dump procedure info in the WAL file (#969) Signed-off-by: stack <[email protected]> * HBASE-23618 Addendum add main method * HBASE-23626 Reduced number of Checkstyle violations in tests in hbase-common Signed-off-by: Viraj Jasani <[email protected]> * HBASE-23615 Use a dedicated thread for executing WorkerMonitor in Pro… (#961) Signed-off-by: stack <[email protected]> Signed-off-by: Duo Zhang <[email protected]> Signed-off-by: virajjasani <[email protected]> * HBASE-23590 : Update maxStoreFileRefCount to maxCompactedStoreFileRefCount for auto region recovery based on old reader references Signed-off-by: Anoop Sam John <[email protected]> * HBASE-23625 Reduced number of Checkstyle violations in hbase-common Signed-off-by: stack <[email protected]> * HBASE-23627 Resolved remaining Checkstyle violations in hbase-thrift Signed-off-by: Viraj Jasani <[email protected]> Signed-off-by: stack <[email protected]> * HBASE-23623 Reduced the number of Checkstyle violations in hbase-rest Signed-off-by: stack <[email protected]> Signed-off-by: Viraj Jasani <[email protected]> * HBASE-23624 Add a tool to dump the procedure info in HFile (#975) Signed-off-by: stack <[email protected]> * HBASE-23629: Add to 'Supporting Projects' in site (#976) * Update 
supportingprojects.xml Signed-off-by: Jan Hentschel <[email protected]> Signed-off-by: Viraj Jasani <[email protected]> * HBASE-23588 : Cache index & bloom blocks on write if CacheCompactedBlocksOnWrite is enabled Signed-off-by: ramkrish86 <[email protected]> Signed-off-by: chenxu14 <[email protected]> * HBASE-23628: Remove Apache Commons Digest Base64 (#977) Signed-off-by: stack <[email protected]> * HBASE-23596 HBCKServerCrashProcedure can double assign Signed-off-by: Duo Zhang <[email protected]> Signed-off-by: Lijin Bin <[email protected]> Signed-off-by: Viraj Jasani <[email protected]> Change its behavior so it will only look in hbase:meta if the call to the super class turns up zero references. Only then will it search hbase:meta for references to 'Unknown Servers'. Normal operation where we read Master context is usual and sufficient. The scan of hbase:meta is only for case where Master state has been corrupted and we need to clear out 'Unknown Servers'. * HBASE-23632 DeadServer cleanup (#979) Signed-off-by: Bharath Vissapragada <[email protected]> * HBASE-23098 [bulkload] If one of the peers in a cluster is configured with NAMESPACE level, its hfile-refs(zk) will be backlogged (#676) Signed-off-by: Wellington Chevreuil <[email protected]> Signed-off-by: stack <[email protected]> * HBASE-23587 The FSYNC_WAL flag does not work on branch-2.x (#974) Signed-off-by: Guanghao Zhang <[email protected]> Signed-off-by: stack <[email protected]> * HBASE-23369 Auto-close 'unknown' Regions reported as OPEN on RegionServers Master force-closes unknown/incorrect Regions OPEN on RS M hbase-client/src/main/java/org/apache/hadoop/hbase/MetaTableAccessor.java Added a note and small refactor. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java Fix an NPE when CJ ran. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java Minor clean up of log message; make it clearer. 
M hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java Make it so closeRegionSilentlyAndWait can be used w/o timeout. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java If a RegionServer Report notes a Region is OPEN and the Master does not know of said Region, close it (We used to crash out the RegionServer) M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateNode.java Minor tweak of toString -- label should be state, not rit (confusing). M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStates.java Doc. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/TransitRegionStateProcedure.java Add region name to exception. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/HBCKServerCrashProcedure.java Be more careful about which Regions we queue up for reassign. This procedure is run by the operator so could happen at any time. We will likely be running this when Master has some accounting of cluster members so check its answers for what Regions were on server before running. M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java Doc and we were misrepresenting the case where a Region was not in RIT when we got CLOSE -- we were reporting it as though it was already trying to CLOSE. * HBASE-23333 Include Call.toShortString() in sendCall exceptions * HBASE-23369 Auto-close 'unknown' Regions reported as OPEN on RegionServers; ADDENDUM on master * HBASE-23635 Reduced number of Checkstyle violations in hbase-mapreduce Signed-off-by: Viraj Jasani <[email protected]> Signed-off-by: stack <[email protected]> * HBASE-23588 : Cache index & bloom blocks on write (ADDENDUM) Signed-off-by: Anoop Sam John <[email protected]> * HBASE-23636 Disable table may hang when regionserver stop or abort. 
(#982) Signed-off-by: Duo Zhang <[email protected]> Signed-off-by: virajjasani <[email protected]> * HBASE-23645 Fixed remaining Checkstyle violations in hbase-common tests Signed-off-by: Viraj Jasani <[email protected]> * HBASE-22908 Link To HBase 1.4 Documentation on HBase Site (#993) Signed-off-by: stack <[email protected]> Signed-off-by: Sean Busbey <[email protected]> * HBASE-23651 Region balance throttling can be disabled (#991) Signed-off-by: Viraj Jasani <[email protected]> Signed-off-by: Anoop Sam John <[email protected]> * HBASE-23660 hbase:meta's table.jsp ref to wrong rs address (#999) Signed-off-by: Wellington Chevreuil <[email protected]> * HBASE-23658 Fix flaky TestSnapshotFromMaster (#998) Signed-off-by: Duo Zhang <[email protected]> * HBASE-23663 Allow dot and hyphen in Profiler's URL (#1002) Signed-off-by: Sean Busbey <[email protected]> Signed-off-by: Viraj Jasani <[email protected]> * HBASE-23655 Fix flaky TestRSGroupsKillRS: should wait the SCP to finish (#996) Signed-off-by: Duo Zhang <[email protected]> * HBASE-23654 Adding Apache Trafodion and EsgynDB to 'Powered by Apache HBase' page Signed-off-by: Viraj Jasani <[email protected]> Signed-off-by: Jan Hentschel <[email protected]> * HBASE-23664 Upgrade JUnit to 4.13 (#1004) Signed-off-by: Sean Busbey <[email protected]> Signed-off-by: Bharath Vissapragada <[email protected]> Signed-off-by: Jan Hentschel <[email protected]> Signed-off-by: Duo Zhang <[email protected]> Signed-off-by: Viraj Jasani <[email protected]> * HBASE-22285 A normalizer which merges small size regions with adjacent regions (#978) Signed-off-by: Viraj Jasani <[email protected]> Signed-off-by: stack <[email protected]> * HBASE-23378: Clean Up FSUtil setClusterId (#910) Signed-off-by: stack <[email protected]> * HBASE-23055 Alter hbase:meta Make it so hbase:meta can be altered. TableState for hbase:meta was hardcoded ENABLED. Make it dynamic. State is now kept in current active Master. 
It is transient so falls back to default if Master crashes. Add to registry a getMetaTableState which reads mirrored state from zookeeper (NOT from Master and defaults ENABLED if no implementation or error fetching state). hbase:meta schema will be bootstrapped from the filesystem. Changes to filesystem schema are atomic so we should be ok if Master fails mid-edit (TBD). Undoes a bunch of guards that prevented our being able to edit hbase:meta. TODO: Tests, more clarity around hbase:meta table state, and undoing references to hard-coded hbase:meta regioninfo. M hbase-client/src/main/java/org/apache/hadoop/hbase/MetaTableAccessor.java Throw illegal access exception if you try to use MetaTableAccessor getting state of the hbase:meta table. M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ConnectionImplementation.java Add fetching of hbase:meta table state from registry. Adds cache of tablestates w/ a ttl of 1 second (adjustable). M hbase-client/src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java M hbase-client/src/main/java/org/apache/hadoop/hbase/client/RawAsyncHBaseAdmin.java Add querying registry for hbase:meta table state. M hbase-client/src/main/java/org/apache/hadoop/hbase/client/ZKAsyncRegistry.java Add querying of mirrored table state for hbase:meta table. M hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZNodePaths.java Shutdown access. M hbase-server/src/main/java/org/apache/hadoop/hbase/TableDescriptors.java Just cleanup. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/TableStateManager.java Add state holder for hbase:meta. Removed unused methods. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/RegionStateStore.java Shut down access. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DisableTableProcedure.java Allow hbase:meta to be disabled. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/EnableTableProcedure.java Allow hbase:meta to be enabled. 
Signed-off-by: Bharath Vissapragada <[email protected]> * HBASE-23659 BaseLoadBalancer#wouldLowerAvailability should consider region replicas (#1001) Signed-off-by: stack <[email protected]> Signed-off-by: Duo Zhang <[email protected]> * HBASE-23668 Master log start filling with "Flush journal status" messages" This reverts commit fb9fa04da72379431d13f22a7e5d8e75ae1267be. i.e. reapplication of patch that was prematurely applied. Signed-off-by: Duo Zhang <[email protected]> * Revert "HBASE-23668 Master log start filling with "Flush journal status" messages" Minor addendum fixing log message. * HBASE-23662 : Replace HColumnDescriptor(String cf) with ColumnFamilyDescriptor Signed-off-by: Peter Somogyi <[email protected]> Signed-off-by: Jan Hentschel <[email protected]> * HBASE-23675 Move to Apache parent POM version 22 * HBASE-23165 [hbtop] Some modifications from HBASE-22988 (#987) Signed-off-by: stack <[email protected]> * Revert "HBASE-23055 Alter hbase:meta" This reverts commit 9abdb7b5ae4de0b6d839f5c5d85e9bb899b5edbd. * HBASE-23646 Resolved remaining Checkstyle violations in tests of hbase-rest Signed-off-by: Viraj Jasani <[email protected]> * HBASE-23681 Add UT for procedure store region flusher (#1024) Signed-off-by: stack <[email protected]> * HBASE-23662 : Keep HCD constructor until shell usages are replaced * HBASE-23679 FileSystem objects leak when cleaned up in cleanupBulkLoad The cleanupBulkLoad method is only called for the first Region in the table which was being bulk loaded into. This means that potentially N-1 other RegionServers (where N is the number of RegionServers) will leak one FileSystem object into the FileSystem cache which will never be cleaned up. We need to do this clean-up as a part of secureBulkLoadHFiles otherwise we cannot guarantee that heap usage won't grow unbounded. 
Closes #1029 Signed-off-by: Sean Busbey <[email protected]> * HBASE-23383 [hbck2] `fixHoles` should queue assignment procedures for any regions its fixing (#917) The current process for an operator, after fixing holes in meta, is to manually disable and enable the whole table. Let's try to avoid bringing the whole table offline if we can. Have the master attempt to queue up assignment procedures for any new regions it creates. Signed-off-by: stack <[email protected]> * HBASE-23687 DEBUG logging cleanup (#1040) Signed-off-by: Jan Hentschel <[email protected]> * HBASE-23689: Bookmark for github PR to jira redirection (#1042) Signed-off-by: stack <[email protected]> * HBASE-23688 Update docs for setting up IntelliJ as a development environment (#1041) Signed-off-by: stack <[email protected]> * HBASE-23569 : Validate that all default chores of HMaster are scheduled Signed-off-by: Andrew Purtell <[email protected]> * HBASE-23691 Add 2.2.3 to download page (#1045) Signed-off-by: Jan Hentschel <[email protected]> * fix 500/NPE of region.jsp (#1033) Signed-off-by: Wellington Chevreuil <[email protected]> * HBASE-23683 Make HBaseInterClusterReplicationEndpoint more extensible (#1027) Signed-off-by: Bharath Vissapragada <[email protected]> Signed-off-by: Josh Elser <[email protected]> Signed-off-by: binlijin <[email protected]> * HBASE-23665: Split unit tests from TestTableName into a separate test-only class. (#1032) Signed-off-by: Nick Dimiduk <[email protected]> * Revert "fix 500/NPE of region.jsp (#1033)" Missed the jira number on that commit message. Will re-apply it with the jira number. This reverts commit d60ce17c1765a445e944738f49953579bdf0bba6. 
* HBASE-23677 fix 500/NPE of region.jsp (#1033) Signed-off-by: Wellington Chevreuil <[email protected]> * HBASE-23674 Too many rit page Numbers show confusion * HBASE-23694 After RegionProcedureStore completes migration of WALProcedureStore, still running WALProcedureStore.syncThread keeps trying to delete now inexistent log files. (#1048) Signed-off-by: Duo Zhang <[email protected]> * HBASE-23652 Move the unsupported procedure type check before migrating to RegionProcedureStore (#1018) Signed-off-by: Nick Dimiduk <[email protected]> * HBASE-23695 Fail gracefully if no category is present Signed-off-by: Wellington Chevreuil <[email protected]> Closes #1052 * HBASE-23347 Allow custom authentication methods for RPCs Decouple the HBase internals such that someone can implement their own SASL-based authentication mechanism and plug it into HBase RegionServers/Masters. Comes with a design doc in dev-support/design-docs and an example in hbase-examples known as "Shade" which uses a flat-password file for authenticating users. Closes #884 Signed-off-by: Wellington Chevreuil <[email protected]> Signed-off-by: Andrew Purtell <[email protected]> Signed-off-by: Reid Chan <[email protected]> * HBASE-23653 Expose content of meta table in web ui (#1020) Adds a display of the content of 'hbase:meta' to the Master's table.jsp, when that table is selected. Supports basic pagination, filtering, &c. Signed-off-by: stack <[email protected]> Signed-off-by: Bharath Vissapragada <[email protected]> * HBASE-23569 : Validate that all default chores of HRegionServer are scheduled (ADDENDUM) Signed-off-by: Andrew Purtell <[email protected]> * HBASE-23703 Add HBase 2.2.3 documentation to website (#1059) Signed-off-by: Jan Hentschel <[email protected]> * HBASE-23701 Try to converge automated checks around Category We have a compile-time guarantee to either have a category array of length zero or one because the category annotation is non-repeatable. 
Also tries to clean up the checking for unit-test annotations. Closes #1057 Signed-off-by: Viraj Jasani <[email protected]> Signed-off-by: Bharath Vissapragada <[email protected]> * HBASE-23690 Checkstyle plugin complains about our checkstyle.xml format; doc how to resolve mismatched version (#1044) Signed-off-by: Jan Hentschel <[email protected]> Signed-off-by: Bharath Vissapragada <[email protected]> * HBASE-23612 Add new profile to make hbase build success on ARM (#959) * HBASE-23700 Upgrade checkstyle and plugin versions (#1056) Bump checkstyle version to 8.28, maven-checkstyle-plugin to 3.1.0. As per HBASE-23242 and the updated checkstyle docs[1], the LineLength check should be placed under an instance of Checker. [1] https://checkstyle.sourceforge.io/config_sizes.html#LineLength Co-authored-by: Bharath Vissapragada <[email protected]> Signed-off-by: Jan Hentschel <[email protected]> Signed-off-by: Viraj Jasani <[email protected]> * HBASE-23653 Expose content of meta table in web ui; addendum (#1061) Fix error prone problem Signed-off-by: Nick Dimiduk <[email protected]> Signed-off-by: Viraj Jasani <[email protected]> Signed-off-by: stack <[email protected]> * HBASE-23680 RegionProcedureStore missing cleaning of hfile archive (#1022) Signed-off-by: stack <[email protected]> * HBASE-23661 Reduced number of Checkstyle violations in hbase-rest Signed-off-by: Peter Somogyi <[email protected]> Signed-off-by: Sean Busbey <[email protected]> * HBASE-23686 Revert binary incompatible change in ByteRangeUtils and removed reflections in CommonFSUtils Signed-off-by: Sean Busbey <[email protected]> * HBASE-23156 start-hbase.sh failed with ClassNotFoundException when build with hadoop3 (#1067) Signed-off-by: Duo Zhang <[email protected]> * HBASE-23347 Allow custom authentication methods for RPCs; addendum (#1060) Signed-off-by: Viraj Jasani <[email protected]> * HBASE-23055 Alter hbase:meta (#1043) Make hbase:meta region schema dynamic. 
Patch has been under development a good while and its focus has changed a few times so it's bloated with fixup from older versions. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/TableStateManager.java M hbase-client/src/main/java/org/apache/hadoop/hbase/zookeeper/ZNodePaths.java Shut down access to internals and removed unused methods. M hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/EnableTableProcedure.java Cleanup/refactor section on replica-handling. M hbase-server/src/main/java/org/apache/hadoop/hbase/util/FSTableDescriptors.java Get hbase:meta schema from filesystem rather than from hard-coding. * HBASE-20516 Offheap read-path needs more detail (#1081) Signed-off-by: Anoop Sam John <[email protected]> * HBASE-23711 - Add test for MinVersions and KeepDeletedCells TTL (#1079) Signed-off-by: stack <[email protected]> Signed-off-by: Viraj Jasani <[email protected]> * HBASE-23709 Unwrap the real user to properly dispatch proxy-user auth'n REST and Thrift servers started failing because the check in BuiltinProviderSelector wasn't checking the "real" user for kerberos credentials. This resulted in the KerberosAuthnProvider not being invoked when it should have been. Closes #1080 Signed-off-by: Peter Somogyi <[email protected]> * HBASE-23719 Add 1.5.0 release to Downloads (#1083) Signed-off-by: Jan Hentschel <[email protected]> * HBASE-23720 [create-release] Update yetus version used from 0.11.0 to 0.11.1 * HBASE-23705 Add CellComparator to HFileContext (#1062) Codecs don't have access to what CellComparator to use. Backfill. M hbase-common/src/main/java/org/apache/hadoop/hbase/CellComparator.java Adds a new compareRows with default implementation that takes a ByteBuffer. Needed by the index in a block encoder implementation. M hbase-common/src/main/java/org/apache/hadoop/hbase/CellComparatorImpl.java Adds implementation for meta of new compareRows method. Adds utility method for figuring comparator based off tablename. 
M hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/AbstractDataBlockEncoder.java M hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java M hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/RowIndexCodecV1.java M hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/RowIndexSeekerV1.java Comparator is in context. Remove redundant handling. M hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java Comparator is in context. Remove redundant handling. Clean javadoc. M hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/HFileBlockDecodingContext.java Clean javadoc. M hbase-common/src/main/java/org/apache/hadoop/hbase/io/encoding/RowIndexEncoderV1.java Cache context so can use it to get comparator to use later. M hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContext.java Cache cellcomparator to use. Javadoc on diff between HFileContext and HFileInfo. M hbase-common/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileContextBuilder.java Add CellComparator M hbase-mapreduce/src/main/java/org/apache/hadoop/hbase/mapreduce/HFileOutputFormat2.java M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderImpl.java M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileWriterImpl.java M hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFileWriter.java Remove comparator caching. Get from context instead. M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/FixedFileTrailer.java Skip a reflection if we can. M hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileInfo.java Javadoc. Removed unused field. 
Signed-off-by: Anoop Sam John <[email protected]>
Signed-off-by: Ramkrishna <[email protected]>
Signed-off-by: Jan Hentschel <[email protected]>

* HBASE-23715 MasterFileSystem should not create MasterProcWALs dir on … (#1078)

Signed-off-by: Josh Elser <[email protected]>

* HBASE-23069 periodic dependency bump for Sep 2019 (#1082)

Signed-off-by: Nick Dimiduk <[email protected]>

* HBASE-21065 Try ROW_INDEX_V1 encoding on meta table (fix bloomfilters… (#1012)

Set encoding and blooms on meta as default. Also shutdown access to the initial meta schema creating method; get from TableDescriptors if you need access to schema or edit it as you would any other table if you want to edit it.

Signed-off-by: Viraj Jasani <[email protected]>

* HBASE-23069 periodic dependency bump for Sep 2019 (#1082); ADDENDUM

For asciidoctor, s/1.5.8/1.5.8.1/.

* HBASE-23722 Real user might be null in non-proxy-user case

Closes #1085

Signed-off-by: stack <[email protected]>
Signed-off-by: Viraj Jasani <[email protected]>

* HBASE-23069 periodic dependency bump for Sep 2019 (#1082); ADDENDUM

Remove staging repo added by mistake.

* HBASE-23069 periodic dependency bump for Sep 2019 (#1082); ADDENDUM

AND.... undo thirdparty testing version update.

* HBASE-23710 - Priority configuration for system coprocessors (#1077)

Signed-off-by: Viraj Jasani <[email protected]>

* HBASE-23729 [Flakeys] TestRSGroupsBasics#testClearNotProcessedDeadServer fails most of the time

* HBASE-23733 [Flakey Tests] TestSplitTransactionOnCluster

* HBASE-23726 Forward-port HBASE-21345 to branch-2.2, 2.3 & master as well.

HBASE-21345 - [hbck2] Allow version check to proceed even though master is 'initializing'. Just remove the check state from the getClusterStatus call.
Signed-off-by: Michael Stack <[email protected]>
Signed-off-by: Peter Somogyi <[email protected]>
Signed-off-by: Sakthi <[email protected]>
(cherry picked from commit dd8496a5460693c49ec0bf5475ef79e40457e6bd)

* HBASE-23735 [Flakey Tests] TestClusterRestartFailover & TestClusterRestartFailoverSplitWithoutZk

* HBASE-23737 [Flakey Tests] TestFavoredNodeTableImport fails 30% of the time; AMENDMENT

This is actual fix; previous added debug to test.

* HBASE-23737 [Flakey Tests] TestFavoredNodeTableImport fails 30% of the time

Second part of this issue -- changes to the test.

Co-authored-by: Wellington Ramos Chevreuil <[email protected]>
Co-authored-by: meiyi <[email protected]>
Co-authored-by: Sakthi <[email protected]>
Co-authored-by: Viraj Jasani <[email protected]>
Co-authored-by: Ankit Singhal <[email protected]>
Co-authored-by: Andrew Purtell <[email protected]>
Co-authored-by: Pierre Zemb <[email protected]>
Co-authored-by: Karthik Palanisamy <[email protected]>
Co-authored-by: Sean Busbey <[email protected]>
Co-authored-by: Duo Zhang <[email protected]>
Co-authored-by: Peter Somogyi <[email protected]>
Co-authored-by: binlijin <[email protected]>
Co-authored-by: Wei-Chiu Chuang <[email protected]>
Co-authored-by: Pankaj <[email protected]>
Co-authored-by: Michael Stack <[email protected]>
Co-authored-by: chenxu14 <[email protected]>
Co-authored-by: ravowlga123 <[email protected]>
Co-authored-by: BukrosSzabolcs <[email protected]>
Co-authored-by: Shardul Singh <[email protected]>
Co-authored-by: Nick Dimiduk <[email protected]>
Co-authored-by: Bharath Vissapragada <[email protected]>
Co-authored-by: Dice <[email protected]>
Co-authored-by: Geoffrey Jacoby <[email protected]>
Co-authored-by: Mingliang Liu <[email protected]>
Co-authored-by: Reid Chan <[email protected]>
Co-authored-by: belugabehr <[email protected]>
Co-authored-by: Udai Bhan Kashyap <[email protected]>
Co-authored-by: Baiqiang Zhao <[email protected]>
Co-authored-by: bsglz <[email protected]>
Co-authored-by: xuqinya1 <[email protected]>
Co-authored-by: Kevin Risden <[email protected]>
Co-authored-by: Guangxu Cheng <[email protected]>
Co-authored-by: Sandeep Pal <[email protected]>
Co-authored-by: Guanghao Zhang <[email protected]>
Co-authored-by: XinSun <[email protected]>
Co-authored-by: Toshihiro Suzuki <[email protected]>
Co-authored-by: Junegunn Choi <[email protected]>
Co-authored-by: lhofhansl <[email protected]>
Co-authored-by: Andor Molnár <[email protected]>
Co-authored-by: Beata Sudi <[email protected]>
Co-authored-by: ramkrish86 <[email protected]>
Co-authored-by: Jan Hentschel <[email protected]>
Co-authored-by: Manu Manjunath <[email protected]>
Co-authored-by: Yiran Wu <[email protected]>
Co-authored-by: Aman Poonia <[email protected]>
Co-authored-by: Josh Elser <[email protected]>
Co-authored-by: WenFeiYi <[email protected]>
Co-authored-by: bzhaoopenstack <[email protected]>
Jira: https://issues.apache.org/jira/browse/HBASE-22460