-
Notifications
You must be signed in to change notification settings - Fork 184
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Aggregates: use tile metadata. #4468
Conversation
This pull request has been linked to Shortcut Story #35771: Aggregates use fragment metadata when possible.. |
This allows the readers to use the tile metadata to do aggregation. The tiles still get loaded into memory even when not required but that will be addressed in a follow up change. --- TYPE: IMPROVEMENT DESC: Aggregates: use tile metadata.
4f3d499
to
3b9063f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the great work. I left 2 comments related to improving code comments in some places and 1 extra comment regarding a performance concern in the dense changes.
start, | ||
end - start + 1, | ||
iter.pos_in_tile(), | ||
stride, | ||
iter.cell_slab_coords().data(), | ||
dest_ptr)); | ||
|
||
// If any cell doesn't match the query condition, signal |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIU, this flag is used to know that if QC is applied successfully (it filters out at least 1 result) on a set of results, you cannot use frag meta statistics in aggregations. Isn't it expensive to do this iteration even though there might even be no use for the set_qc_filtered_results
information?
Can we not run the loop when it's not needed and when it is needed, make apply_dense
compute the information instead of looping again outside apply_dense logic?
I remember from the last batch of performance fixes in dense that this is a sensitive area and I'm a bit worried we might regress (If I understand correcly here and in the worst case, we'll end up looping over the entire result set).
This allows the readers to use the tile metadata to do aggregation. The tiles still get loaded into memory even when not required but that will be addressed in a follow up change.
TYPE: IMPROVEMENT
DESC: Aggregates: use tile metadata.