-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use segmented Slice in SliceDictionaryWriter #15956
Use segmented Slice in SliceDictionaryWriter #15956
Conversation
dac8e5c
to
668c781
Compare
668c781
to
1bbc8e0
Compare
1bbc8e0
to
4762596
Compare
Here is the performance comparison before and after this change. Base is before this change. Segmented is with this change. Note Direct in base and direct in segmented has no change between them and they should be 100% and the result is close enough.
|
4762596
to
942e3ca
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly nits
presto-orc/src/main/java/com/facebook/presto/orc/writer/SegmentedSliceBlockBuilder.java
Outdated
Show resolved
Hide resolved
presto-orc/src/main/java/com/facebook/presto/orc/writer/SegmentedSliceBlockBuilder.java
Outdated
Show resolved
Hide resolved
presto-orc/src/test/java/com/facebook/presto/orc/writer/TestSegmentedSliceBlockBuilder.java
Outdated
Show resolved
Hide resolved
presto-orc/src/test/java/com/facebook/presto/orc/writer/TestSegmentedSliceBlockBuilder.java
Outdated
Show resolved
Hide resolved
public BlockBuilder newBlockBuilderLike(BlockBuilderStatus blockBuilderStatus, int expectedEntries) | ||
{ | ||
if (blockBuilderStatus != null) { | ||
throw new UnsupportedOperationException("Not yet implemented"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blockBuilderStatus
is actually fairly important. QQ: Is newBlockBuilderLike
used anywhere in orc package? If not, just throw?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This blockBuilder is only used by the SliceDictionaryBuilder and it passes in null for the blockBuilderStatus.
presto-orc/src/main/java/com/facebook/presto/orc/writer/SegmentedSliceBlockBuilder.java
Show resolved
Hide resolved
|
||
private final DynamicSliceOutput openSliceOutput; | ||
|
||
private int openSegment; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/openSegment/openSegmentIndex, if I read the code correctly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, renamed it.
presto-orc/src/main/java/com/facebook/presto/orc/writer/SegmentedSliceBlockBuilder.java
Outdated
Show resolved
Hide resolved
offsets[openSegment][openSegmentOffset] = openSliceOutput.size(); | ||
if (openSegmentOffset == SegmentHelper.SEGMENT_SIZE) { | ||
// Add the current finalized slice to closedSlices | ||
Slice slice = openSliceOutput.copySlice(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do a copy to save space? We usually don't call this method to avoid heavy GC. Maybe slice()
is good enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slice() gives the view of the object. After the segment is full, the bytes[] are copied to the copySllice and the dynamicSlliceOuptut is reset and reused for the new segment.
presto-orc/src/main/java/com/facebook/presto/orc/writer/SegmentedSliceBlockBuilder.java
Outdated
Show resolved
Hide resolved
942e3ca
to
8c884ea
Compare
21cd7b1
to
9bf70ba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nits only
presto-orc/src/main/java/com/facebook/presto/orc/writer/SegmentedSliceBlockBuilder.java
Outdated
Show resolved
Hide resolved
presto-orc/src/test/java/com/facebook/presto/orc/writer/TestSegmentedSliceBlockBuilder.java
Outdated
Show resolved
Hide resolved
presto-orc/src/test/java/com/facebook/presto/orc/writer/TestSegmentedSliceBlockBuilder.java
Outdated
Show resolved
Hide resolved
Store elements of dictionary in Segmented Slices, instead of one contiguous segment. When the number of elements in the dictionary is less than 100,000 there is no noticeable performance degradation. When the number of elements in the dictionary reaches 10,000,000 sorting/comparing the element needs to compute segment/offset which makes it worse by 10%. But this is an unlikely case.
9bf70ba
to
16e9b84
Compare
Store elements of dictionary in Segmented Slices, instead of
one contiguous segment. When the number of elements
in the dictionary is less than 100,000 there is no noticeable
performance degradation. When the number of elements in the
dictionary reaches 10,000,000 sorting/comparing the element
needs to compute segment/offset which makes it worse by 10%.
But this is an unlikely case.
Test plan -
Added new test cases for the SegmentedSlices.
Dictionary is covered by existing tests.