Support reading block based log format #249
This commit lays the groundwork for the new DIO feature, including:
* [New] DataLayout, representing how data (records) is arranged: aligned or not.
* Refactor LogFileFormat / LogFileContext.
Signed-off-by: Lucasliang <[email protected]>
…data layout. Signed-off-by: Lucasliang <[email protected]>
Signed-off-by: Lucasliang <[email protected]>
…`data_layout` in `stress`. Signed-off-by: Lucasliang <[email protected]>
Signed-off-by: Lucasliang <[email protected]>
Codecov Report
@@            Coverage Diff             @@
##           master     #249      +/-   ##
==========================================
+ Coverage   96.99%   97.21%   +0.21%
==========================================
  Files          30       30
  Lines       10491    10951     +460
==========================================
+ Hits        10176    10646     +470
+ Misses        315      305      -10
Continue to review full report at Codecov.
More unit tests covering the abnormal code paths still need to be added.
Please hold on.
…records in alignment. Signed-off-by: Lucasliang <[email protected]>
Signed-off-by: Lucasliang <[email protected]>
Done.
Signed-off-by: Lucasliang <[email protected]>
Signed-off-by: Lucasliang <[email protected]>
Signed-off-by: Lucasliang <[email protected]>
Signed-off-by: Lucasliang <[email protected]>
Signed-off-by: Lucasliang <[email protected]>
Signed-off-by: Lucasliang <[email protected]>
I thought a bit more about this, and I think RocksDB's format is not very efficient (many memcpy-s). We also need to access raw entry data via a file block handle, so the data of one log batch should stay contiguous.
So there's no need to handle the `Type` as RocksDB does. The way I see it, the only work is to add padding detection to the existing reader.
When reading a log batch, first `peek(offset, end_of_current_block)` and construct the header. If this slice turns out to be padded with zeros, go to the next block.
After the header is ready, we get the rest of the data just like before.
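The padding detection described above could look roughly like the sketch below. It is not the code from this PR: `BLOCK_SIZE`, `HEADER_LEN`, `Peek`, `PeekError`, and `locate_header` are hypothetical names and values chosen for illustration, and the file is modelled as an in-memory byte slice. The sketch also assumes that a valid header is never all zeros.

```rust
/// Hypothetical values; the real block size and header length are defined by the engine.
const BLOCK_SIZE: usize = 4096;
const HEADER_LEN: usize = 16;

#[derive(Debug, PartialEq)]
enum Peek {
    /// A header candidate starts at this offset.
    Header(usize),
    /// Only zero padding (or nothing) was left in the file.
    Eof,
}

#[derive(Debug, PartialEq)]
enum PeekError {
    /// The rest of the block is neither a full header nor pure zero padding.
    Corrupted(usize),
}

/// Finds the offset of the next log batch header, skipping zero padding.
fn locate_header(file: &[u8], mut offset: usize) -> Result<Peek, PeekError> {
    while offset < file.len() {
        // Peek from `offset` to the end of the current block.
        let block_end = ((offset / BLOCK_SIZE + 1) * BLOCK_SIZE).min(file.len());
        let tail = &file[offset..block_end];
        // Enough room for a header, and it is not blank: hand it to the decoder.
        if tail.len() >= HEADER_LEN && tail[..HEADER_LEN].iter().any(|b| *b != 0) {
            return Ok(Peek::Header(offset));
        }
        // Pure zero padding: continue with the next block.
        if tail.iter().all(|b| *b == 0) {
            offset = block_end;
            continue;
        }
        // Non-zero padding, or a header split across two blocks.
        return Err(PeekError::Corrupted(offset));
    }
    Ok(Peek::Eof)
}
```

Once `locate_header` returns `Peek::Header(offset)`, the remaining bytes of the batch can be read contiguously from `offset + HEADER_LEN`, as in the existing reader.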
Signed-off-by: Lucasliang <[email protected]>
Agree!
About the padding detection you mentioned before:
Thanks for your suggestion; it helped me find an existing bug in the previous detection strategy. The bug has been fixed in this commit.
You also need to test the decoding properly. Since the write flow is not implemented, you need to manually assemble a block-based log file content. Some cases need to be covered:
- [ batch padding ]
- [ batch1 batch2_header ][ batch2_middle ][ batch2_rest ]
- [ batch1 batch2_header_part1 ][ batch2_header_part2 ] (this should fail)
- [ batch non_zero_padding ] (fail too)
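A minimal sketch of how these four layouts could be assembled by hand is shown below. Everything in it is an assumption for illustration, not the engine's real format: a tiny `BLOCK` of 32 bytes, a 4-byte header holding the non-zero payload length as a little-endian `u32`, and the helpers `push_batch`, `pad_block`, `decode`, and `layout_cases`.

```rust
const BLOCK: usize = 32; // hypothetical tiny block size, convenient for tests
const HDR: usize = 4; // hypothetical header: payload length as u32 LE (non-zero)

/// Appends one toy batch: header followed by its payload, kept contiguous.
fn push_batch(file: &mut Vec<u8>, payload: &[u8]) {
    file.extend_from_slice(&(payload.len() as u32).to_le_bytes());
    file.extend_from_slice(payload);
}

/// Fills the rest of the current block with zeros.
fn pad_block(file: &mut Vec<u8>) {
    let rem = file.len() % BLOCK;
    if rem != 0 {
        file.resize(file.len() + BLOCK - rem, 0);
    }
}

/// Toy reader: returns every payload, or an error on a malformed layout.
fn decode(file: &[u8]) -> Result<Vec<Vec<u8>>, String> {
    let (mut out, mut off) = (Vec::new(), 0);
    while off < file.len() {
        let end = ((off / BLOCK + 1) * BLOCK).min(file.len());
        let tail = &file[off..end];
        if tail.iter().all(|b| *b == 0) {
            off = end; // zero padding: skip to the next block
            continue;
        }
        if tail.len() < HDR || tail[..HDR].iter().all(|b| *b == 0) {
            return Err(format!("bad padding or split header at {}", off));
        }
        let len = u32::from_le_bytes([tail[0], tail[1], tail[2], tail[3]]) as usize;
        // The payload may cross block boundaries; it is read as one contiguous slice.
        let body = file.get(off + HDR..off + HDR + len).ok_or("truncated batch")?;
        out.push(body.to_vec());
        off += HDR + len;
    }
    Ok(out)
}

/// The four layouts from the list above, paired with whether decoding should succeed.
fn layout_cases() -> Vec<(Vec<u8>, bool)> {
    // [ batch padding ]
    let mut a = Vec::new();
    push_batch(&mut a, &[7; 10]);
    pad_block(&mut a);

    // [ batch1 batch2_header ][ batch2_middle ][ batch2_rest ]
    let mut b = Vec::new();
    push_batch(&mut b, &[1; 24]); // 4 + 24 = 28 bytes
    push_batch(&mut b, &[2; 40]); // header fills 28..32, payload spans two more blocks
    pad_block(&mut b);

    // [ batch1 batch2_header_part1 ][ batch2_header_part2 ]
    let mut c = Vec::new();
    push_batch(&mut c, &[1; 26]); // 30 bytes, so the next header straddles the boundary
    push_batch(&mut c, &[2; 5]);
    pad_block(&mut c);

    // [ batch non_zero_padding ]
    let mut d = a.clone();
    *d.last_mut().unwrap() = 1; // corrupt the last padding byte

    vec![(a, true), (b, true), (c, false), (d, false)]
}
```

A test can then iterate over `layout_cases()` and assert that `decode(&file).is_ok()` matches the expected flag for each layout.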
Got it.
…fety. Signed-off-by: Lucasliang <[email protected]>
Corner cases have been added in this commit.
…clean. Signed-off-by: Lucasliang <[email protected]>
Signed-off-by: Lucasliang <[email protected]>
Signed-off-by: Lucasliang <[email protected]>
… invalid DataLayout. Signed-off-by: Lucasliang <[email protected]>
…ering Version::V1 files with abnormal DataLayout. Signed-off-by: Lucasliang <[email protected]>
Signed-off-by: Lucasliang <[email protected]>
Signed-off-by: Lucasliang <[email protected]>
The rest looks good.
Signed-off-by: Lucasliang <[email protected]>
PR Description
This PR closes #246. It lays the groundwork for the new DIO feature, including:
- `stress` tool for the new alignment-mode reading.