-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Filter pipeline support for datatype conversions based on filtered output datatype. #4165
Conversation
This pull request has been linked to Shortcut Story #24079: DoubleDelta filter typecheck should account for output type of FloatScaleFilter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving these comments as a first pass review. I'll re-review deeper when I'm not sitting in my truck outside the doctor's office.
9ca7ca7
to
069855c
Compare
tiledb/sm/filter/filter_pipeline.cc
Outdated
@@ -221,6 +259,11 @@ Status FilterPipeline::filter_chunks_forward( | |||
&output_metadata, | |||
&output_data)); | |||
|
|||
// Final tile type will be the output type of last filter in pipeline. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we do this in the writer when constructing the tile? I'd like to avoid changing the type after the tile was created.
tiledb/sm/filter/filter_pipeline.cc
Outdated
if (last_filter) { | ||
void* output_chunk_buffer = | ||
static_cast<char*>(tile->data()) + chunk_data.chunk_offsets_[i]; | ||
RETURN_NOT_OK(output_data.set_fixed_allocation( | ||
output_chunk_buffer, chunk.unfiltered_data_size_)); | ||
reader_stats->add_counter( | ||
"read_unfiltered_byte_num", chunk.unfiltered_data_size_); | ||
// Restore tile datatype to it's initial schema value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the tile created by the reader should have the right type from the get go. My understanding is that the filter pipeline will use temporary buffers until we reach the last filter, at which point we'll copy to the final tile. Am I incorrect? I don't think we should change the tile type after it's been created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just two minor changes to revert.
3076f33
to
71af9fc
Compare
eaaa2a6
to
c36371d
Compare
This reverts commit c5ba20c. + Clang format after merge
Passes filter output datatype to next filter(s) in pipeline, adding support for type conversions within a pipeline based on filter output datatype. Also refactors pipeline validation, adding
Filter::accepts_input_datatype
andFilter::output_datatype
from #4057. Deriving filters can override these methods if they are restricted to certain input datatypes and / or convert tile datatype when the filter is applied.TYPE: IMPROVEMENT
DESC: Filter pipeline support for datatype conversions based on filtered output datatype.