HBASE-26700 The way we bypass broken track file is not enough in Stor… #4055
Is there currently a way to detect when the load was incorrect? Or does it always parse the contents without error, even when they are not parsed correctly?
We have the "first" SFT version ready to ship internally. I am worried about how bad this is, as it would break compatibility when we later upgrade to a version that includes this change.
The possibility is low; usually it leads to an InvalidProtocolBufferException. When testing FSTableDescriptors, the serialized TableDescriptor was ~300 bytes, and I tried reading only the first 1 byte, 2 bytes, and so on. When I reached ~50 bytes, the parse succeeded without error but returned an invalid table name. Since the content itself is very small, it is not likely to be split in the middle, and on a typical OSS it is impossible to write a partial object, right? The semantics are all or nothing. But distributed storage system design is about fighting 'small possibilities', right? If it does happen, it will lead to data loss. So I suggest that your internal version also include this PR, or at least we should be binary compatible. Thanks.
…eFileListFile (#4055) Signed-off-by: Wellington Ramos Chevreuil <[email protected]>
Thanks for the ping, Duo. I'm still curious about what you saw.
You have a file list that you noticed giving a TableNotFoundException when it was parsed. When you tried parsing it with some custom code, you saw that the protobuf parser gave different errors depending on the length provided? You are assuming that, eventually, we may hit a case where HBase writes out an SFT file with a bad size (maybe due to an HBase or HDFS bug), and that this should protect us in that case? A simple CRC at the head of the file sounds reasonable to prevent that from causing bigger problems.
I mean, for example, a protobuf message that should be ~300 bytes after serialization can be deserialized successfully, without any error, from only its leading ~50 bytes, but the fields of the message will differ from the ones you expect. This is a very serious problem: the old code assumes that if the bytes of a protobuf message are incomplete, we will always get an InvalidProtocolBufferException, but this is not true. And it does not need to be a bug in HDFS or HBase; if we crash while writing the content to HDFS, it is possible that we write a partial file, right?
So here I added a length at the beginning of the file. If the file is not long enough, i.e., we hit an EOFException, we can say the file is incomplete and ignore it. The trailing CRC is used to check whether the content is as expected; if the CRC mismatches, we throw an IOException and fail the region opening process. In that case, we need to manually check what the actual problem is and try to fix it, for example by regenerating the tracker file from the current store files under the data directory.
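To make the scheme described above concrete, here is a minimal sketch of a length-prefixed, CRC-trailed file layout. It is not the code from this PR: the class name `FramedFileSketch`, the 4-byte field widths, the `MAX_PAYLOAD` bound, and the choice to compute the CRC over the payload only are all assumptions made for illustration. A truncated file surfaces as an EOFException and is reported as "ignore this file" (here, a `null` return), while a checksum mismatch is raised as an IOException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.util.zip.CRC32;

// Hypothetical illustration only; names and layout are assumptions, not the PR's code.
final class FramedFileSketch {
  // Assumed sanity bound so a corrupted length field cannot trigger a huge allocation.
  static final int MAX_PAYLOAD = 16 * 1024 * 1024;

  /** Frames the payload as: 4-byte length, payload bytes, 4-byte CRC32 of the payload. */
  static byte[] write(byte[] payload) throws IOException {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bos)) {
      out.writeInt(payload.length);
      out.write(payload);
      out.writeInt(checksum(payload));
    }
    return bos.toByteArray();
  }

  /**
   * Returns the payload, or null if the file is truncated (incomplete write).
   * Throws IOException if the file is complete but its content is not as expected.
   */
  static byte[] read(byte[] file) throws IOException {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
      int length = in.readInt();
      if (length < 0 || length > MAX_PAYLOAD) {
        throw new IOException("Implausible payload length: " + length);
      }
      byte[] payload = new byte[length];
      in.readFully(payload);           // throws EOFException on a short file
      int expected = in.readInt();     // throws EOFException if the trailer is missing
      if (checksum(payload) != expected) {
        throw new IOException("Checksum mismatch");
      }
      return payload;
    } catch (EOFException e) {
      // Not enough bytes: treat the file as incomplete and let the caller skip it.
      return null;
    }
  }

  private static int checksum(byte[] data) {
    CRC32 crc = new CRC32();
    crc.update(data, 0, data.length);
    return (int) crc.getValue();
  }
}
```

With this layout, a caller would hand the returned payload to the protobuf parser only after both checks pass, so a partially written file is never parsed as if it were complete.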
…others) to branch-2.5 Previous cherry picks: commit 6aaef89 HBASE-26064 Introduce a StoreFileTracker to abstract the store file tracking logic commit 43b40e9 HBASE-25988 Store the store file list by a file apache#3578) commit 6e05376 HBASE-26079 Use StoreFileTracker when splitting and merging apache#3617) commit 090b2fe HBASE-26224 HBASE-26224 Introduce a MigrationStoreFileTracker to support migratin… apache#3656) commit 0ee1689 HBASE-26246 Persist the StoreFileTracker configurations to TableDescriptor when creating table apache#3666) commit 2052e80 HBASE-26248 Should find a suitable way to let users specify the store… apache#3665) commit 5ff0f98 HBASE-26264 Add more checks to prevent misconfiguration on store file… apache#3681) commit fc4f6d1 HBASE-26280 HBASE-26280 Use store file tracker when snapshoting apache#3685) commit 06db852 HBASE-26326 CreateTableProcedure fails when FileBasedStoreFileTracker… apache#3721) commit e4e7cf8 HBASE-26386 Refactor StoreFileTracker implementations to expose the s… apache#3774) commit 08d1171 HBASE-26328 Clone snapshot doesn't load reference files into FILE SFT impl apache#3749) commit 8bec26e HBASE-26263 [Rolling Upgrading] Persist the StoreFileTracker configur… apache#3700) commit a288365 HBASE-26271: Cleanup the broken store files under data directory apache#3786) commit d00b5fa HBASE-26454 CreateTableProcedure still relies on temp dir and renames… apache#3845) commit 771e552 HBASE-26286: Add support for specifying store file tracker when restoring or cloning snapshot commit f16b7b1 HBASE-26265 Update ref guide to mention the new store file tracker im… apache#3942) commit 755b3b4 HBASE-26585 Add SFT configuration to META table descriptor when creating META apache#3998) commit 39c42c7 HBASE-26639 The implementation of TestMergesSplitsAddToTracker is pro… apache#4010) commit 6e1f5b7 HBASE-26586 Should not rely on the global config when setting SFT implementation for a table while upgrading apache#4006) commit f1dd865 
HBASE-26654 ModifyTableDescriptorProcedure shoud load TableDescriptor… apache#4034) commit 8fbc9a2 HBASE-26674 Should modify filesCompacting under storeWriteLock apache#4040) commit 5aa0fd2 HBASE-26675 Data race on Compactor.writer apache#4035) commit 3021c58 HBASE-26700 The way we bypass broken track file is not enough in Stor… apache#4055) commit a8b68c9 HBASE-26690 Modify FSTableDescriptors to not rely on renaming when wr… apache#4054) commit dffeb8e HBASE-26587 Introduce a new Admin API to change SFT implementation (#… apache#4080) commit b265fe5 HBASE-26673 Implement a shell command for change SFT implementation apache#4113) commit 4cdb380 HBASE-26640 Reimplement master local region initialization to better … apache#4111) commit 77bb153 HBASE-26707: Reduce number of renames during bulkload (apache#4066) apache#4122) commit a4b192e HBASE-26611 Changing SFT implementation on disabled table is dangerous apache#4082) commit d3629bb HBASE-26837 Set SFT config when creating TableDescriptor in TestClone… apache#4226) commit 541d748 HBASE-26881 Backport HBASE-25368 to branch-2 (apache#4267) Fixups for precommit error prone, checkstyle, and javadoc warnings after applying cherry picks. Signed-off-by: Josh Elser <[email protected]> Reviewed-by: Wellington Ramos Chevreuil <[email protected]>