-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-27013 Introduce read all bytes when using pread for prefetch #4414
Conversation
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
if (remain <= extraLen) { | ||
// break for the "extra data" when hitting end of stream and remaining is necessary | ||
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I rephrase your comment: the client is asking for data which cannot possibly exist because the extraLen
would read off the end of the file. The client told us to try to read an extra header after the current datablock but there is no "next header" to read because we're at the end of the file.
I worry that this may mask a problem where a file is corrupt but we would miss the "premature EOF" IOException which should be thrown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you got it right. e.g. if there is a case that we read a HFile does not have the trailer, and just end with the last data block, we're hitting here.
I added that because I found this logic exists already as part of BlockIOUtils#readWithExtraOnHeap (see below).
hbase/hbase-common/src/main/java/org/apache/hadoop/hbase/io/util/BlockIOUtils.java
Lines 154 to 156 in f6e9d3e
if (bytesRemaining <= extraLen) { | |
// We could not read the "extra data", but that is OK. | |
break; |
I thought this is good for handling some case that we don't expect automatically, but we can remove it to let the premature EOF throws back to the client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a fair point -- I would have expected that we were very explicit and did exactly what the caller said. I think you did the right thing. May be good to capture that in a comment :)
conf.setBoolean(HConstants.HFILE_PREAD_ALL_BYTES_ENABLED_KEY, readAllBytes); | ||
FileSystem fs = TEST_UTIL.getTestFileSystem(); | ||
Path path = new Path(TEST_UTIL.getDataTestDirOnTestFS(), testName.getMethodName()); | ||
Random rand = new Random(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may be value in being able to specify a specific seed here, such that if you do see a test failure in a specific case, we can actually try to reproduce that failure.
FileSystem fs = HFileSystem.get(conf); | ||
FSDataOutputStream os = fs.create(path); | ||
HFileContext meta = new HFileContextBuilder().withHBaseCheckSum(true) | ||
.withCompression(compressAlgo).withBytesPerCheckSum(HFile.DEFAULT_BYTES_PER_CHECKSUM).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.withBytesPerCheckSum(HFile.DEFAULT_BYTES_PER_CHECKSUM)
is unnecessary. As is withHBaseCheckSum(true)
FSDataInputStream is = fs.open(path); | ||
HFileContext fileContext = | ||
new HFileContextBuilder().withHBaseCheckSum(true) | ||
.withCompression(Compression.Algorithm.GZ).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Compression.Algorithm.GZ
was passed into writeBlocks(..)
. Could pass it into this method too so that you avoid having to change the constant in two places.
💔 -1 overall
This message was automatically generated. |
spotless check is saying something unrelated to my change......should I fix all of them ? I mean, I ran |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@Apache9 or @ndimiduk , what is the right way to run
|
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two more requests, but then I think this is good to try out.
hfs.getConf().getBoolean(HConstants.HFILE_PREAD_ALL_BYTES_ENABLED_KEY, | ||
HConstants.HFILE_PREAD_ALL_BYTES_ENABLED_DEFAULT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking some more, we should really do this once and not for every call to readAtOffset(..)
. What about moving this logic into the ReaderContext
and cache it there to avoid the expensive lookup into the Hadoop Configuration object.
FileSystem fs = TEST_UTIL.getTestFileSystem(); | ||
Path path = new Path(TEST_UTIL.getDataTestDirOnTestFS(), testName.getMethodName()); | ||
// give a fixed seed such we can see failure easily. | ||
Random rand = new Random(5685632); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: What about choosing a random seed (maybe System.currentTimeMillis()
) in testPreadWithoutReadFullBytes()
and testPreadWithReadFullBytes()
and pass that seed into testPreadReadFullBytesInternal(..)
. As long as we log the seed we get the benefit of randomly choosing different data every time but a human can come back and replay the same exact seed (as we'd have the seed logged in surefire-reports).
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the lifting into ReaderContext, thanks for doing that. I think the config addition looks good to me.
I think you're missing the addition to update HFileReaderImpl, where we created the Runnable to pass to PrefetchExecutor?
do you mean hbase/hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/ReaderContextBuilder.java Lines 92 to 95 in 9c8c9e7
|
My bad -- Stephen chatted me and we got on the same page. I thought Stephen was going to override the config in PrefetchExecutor/HFileReaderImpl automatically, but he clarified that the expectation was to just do it via hbase-site.xml. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@taklwu you're running spotless correctly. Our pre-commit runs all the checks twice -- first against the target branch without the patch, then against the target branch with the patch. The spotless failure I see in the comment just before you asked was in the unpatched target branch, not the results of your patch -- someone pushed a commit that has not been cleaned up by spotless. Because that complaint went away in later pre-check runs, I assume that someone fixed the formatting on the target branch out-of-band of this PR. |
Thanks @ndimiduk for the explanation, spotless:apply is really handy that could save our time on checkstyle. |
…4414) - introduce optional flag `hfile.pread.all.bytes.enabled` for pread that must read full bytes with the next block header Signed-off-by: Josh Elser <[email protected]>
…4414) - introduce optional flag `hfile.pread.all.bytes.enabled` for pread that must read full bytes with the next block header Signed-off-by: Josh Elser <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I don't see a test for is working with zero byte files; they are a bit of a corner case. All attempts to read data from a zero byte file should fail meaningfully.
I see you are still using read(offset, byte[]), not readFully(). any specific reason? We can be a bit more efficient in readfully as we know we have exclusive access to the stream.
int remain = necessaryLen + extraLen; | ||
byte[] buf = new byte[remain]; | ||
int bytesRead = 0; | ||
while (bytesRead < necessaryLen) { | ||
int lengthMustRead = readAllBytes ? remain : necessaryLen; | ||
while (bytesRead < lengthMustRead) { | ||
int ret = dis.read(position + bytesRead, buf, bytesRead, remain); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason for not using dis.readFully() into the byte array?
@@ -257,11 +279,12 @@ private static boolean preadWithExtraOnHeap(ByteBuff buff, FSDataInputStream dis | |||
} | |||
|
|||
private static boolean preadWithExtraDirectly(ByteBuff buff, FSDataInputStream dis, long position, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in line 305 on this method, the NPE raised should be included as the cause of the IOE. that way if the input stream code raises NPEs, debugging it becomes possible
private long writeBlocks(Configuration conf, Random rand, Compression.Algorithm compressAlgo, | ||
Path path) throws IOException { | ||
FileSystem fs = HFileSystem.get(conf); | ||
FSDataOutputStream os = fs.create(path); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
safer within try/finally or try with resources
introduce a optional flag
hfile.pread.all.bytes.enabled
and default is disabled.