Reduce code duplication in VectorizedParquetDefinitionLevelReader #11661
+247
−244
In `VectorizedParquetDefinitionLevelReader`, there are a number of nested reader classes; each extends one of two nested base classes, `NumericBaseReader` and `BaseReader`. Each of these base classes defines `nextBatch` and `nextDictEncodedBatch` methods. These four methods (`NumericBaseReader::nextBatch`, `NumericBaseReader::nextDictEncodedBatch`, `BaseReader::nextBatch`, `BaseReader::nextDictEncodedBatch`) are structurally similar. The differences can be isolated to what gets called in the RLE and PACKED cases for each of the two base classes, and even there, for dict-encoded batches, there is no difference in logic between `NumericBaseReader` and `BaseReader`. Thus there is quite a bit of duplication that can be reduced.

This PR simply refactors the common logic into a superclass (`CommonBaseReader`) of `NumericBaseReader` and `BaseReader`, in a `nextCommonBatch` method that `nextBatch` and `nextDictEncodedBatch` delegate to.

No tests are added as there is no functionality change. The methods are exercised by existing tests.
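
To illustrate the shape of the refactoring described above, here is a minimal, self-contained sketch. It is not the code in this PR: the `Run`, `RunHandler`, `NumericReader`, `GenericReader`, and `readPlainRun` names, the string output, and all method signatures are hypothetical stand-ins, and unlike the PR the sketch places `nextBatch` and `nextDictEncodedBatch` on the common superclass for brevity. Only the idea is taken from the description: one shared loop over RLE and PACKED runs (`nextCommonBatch`) that both entry points delegate to, with the per-class differences passed in.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CommonBatchSketch {
  enum Mode { RLE, PACKED }

  /** Hypothetical stand-in for one run of definition levels. */
  static final class Run {
    final Mode mode;
    final int length;

    Run(Mode mode, int length) {
      this.mode = mode;
      this.length = length;
    }
  }

  /** Hypothetical hook: the part that differs between callers. */
  interface RunHandler {
    void handle(Mode mode, int offset, int count, List<String> out);
  }

  abstract static class CommonBaseReader {
    /** The shared loop: walk the runs and hand each slice to the handler. */
    protected final void nextCommonBatch(
        List<Run> runs, int numValsToRead, List<String> out, RunHandler handler) {
      int offset = 0;
      int left = numValsToRead;
      for (Run run : runs) {
        if (left <= 0) {
          break;
        }
        int count = Math.min(left, run.length);
        handler.handle(run.mode, offset, count, out);
        offset += count;
        left -= count;
      }
    }

    /** Plain batches differ per subclass, so they delegate to an abstract hook. */
    public final void nextBatch(List<Run> runs, int numValsToRead, List<String> out) {
      nextCommonBatch(runs, numValsToRead, out, this::readPlainRun);
    }

    /** Dict-encoded batches use the same logic for every subclass in this sketch. */
    public final void nextDictEncodedBatch(List<Run> runs, int numValsToRead, List<String> out) {
      nextCommonBatch(runs, numValsToRead, out,
          (mode, offset, count, sink) -> sink.add("dict " + mode + " " + offset + "+" + count));
    }

    protected abstract void readPlainRun(Mode mode, int offset, int count, List<String> out);
  }

  static final class NumericReader extends CommonBaseReader {
    @Override
    protected void readPlainRun(Mode mode, int offset, int count, List<String> out) {
      out.add("numeric " + mode + " " + offset + "+" + count);
    }
  }

  static final class GenericReader extends CommonBaseReader {
    @Override
    protected void readPlainRun(Mode mode, int offset, int count, List<String> out) {
      out.add("generic " + mode + " " + offset + "+" + count);
    }
  }

  /** Reads 5 values across an RLE run of 3 and a PACKED run of 4. */
  static List<String> demo() {
    List<Run> runs = Arrays.asList(new Run(Mode.RLE, 3), new Run(Mode.PACKED, 4));
    List<String> out = new ArrayList<>();
    new NumericReader().nextBatch(runs, 5, out);
    new GenericReader().nextDictEncodedBatch(runs, 5, out);
    return out;
  }

  public static void main(String[] args) {
    demo().forEach(System.out::println);
  }
}
```

In this sketch the loop bookkeeping (offsets, remaining count, run boundaries) lives in one place, so each subclass only supplies the code that actually differs.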
Additional notes:
- `NumericBaseReader` and `BaseReader` both define abstract `nextDictEncodedVal` methods with the same set of parameters but in a different order! In this refactor, we define a common `nextDictEncodedVal` for them in `CommonBaseReader`.
- `NumericBaseReader` and `BaseReader` each define abstract `nextVal` methods as well, but with different sets of parameters, and these we have left alone.