ARROW-17642: [C++] Add ordered aggregation #14352
@@ -180,7 +180,7 @@ struct ARROW_EXPORT ExecBatch {

   explicit ExecBatch(const RecordBatch& batch);

-  static Result<ExecBatch> Make(std::vector<Datum> values);
+  static Result<ExecBatch> Make(std::vector<Datum> values, int64_t length = -1);

   Result<std::shared_ptr<RecordBatch>> ToRecordBatch(
       std::shared_ptr<Schema> schema, MemoryPool* pool = default_memory_pool()) const;

@@ -233,6 +233,17 @@ struct ARROW_EXPORT ExecBatch {

   ExecBatch Slice(int64_t offset, int64_t length) const;

+  Result<ExecBatch> SelectValues(const std::vector<int>& ids) const {
Review comment: I know there aren't many comment blocks in this file but
Also, is there any particular reason to have the implementation in the header file?
+    std::vector<Datum> selected_values(ids.size());
+    for (size_t i = 0; i < ids.size(); i++) {
+      if (ids[i] < 0 || static_cast<size_t>(ids[i]) >= values.size()) {
+        return Status::Invalid("ExecBatch invalid value selection: ", ids[i]);
+      }
+      selected_values[i] = values[ids[i]];
+    }
Review comment (suggested change): Minor nit: this seems slightly more readable with a foreach loop.
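The body of the suggested change is not preserved here, so the following is only a minimal sketch of what a range-based ("foreach") version of the selection loop could look like. It does not use Arrow: `ToyBatch` is a hypothetical stand-in for `ExecBatch`, strings stand in for `Datum`, and `std::nullopt` stands in for `Status::Invalid`.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for ExecBatch, reduced to the two members the
// selection logic touches. This is not the Arrow type.
struct ToyBatch {
  std::vector<std::string> values;
  std::int64_t length = 0;
};

// Range-based variant of the index-selection loop from the diff.
// Returns std::nullopt where the PR code returns Status::Invalid.
std::optional<ToyBatch> SelectValues(const ToyBatch& batch,
                                     const std::vector<int>& ids) {
  std::vector<std::string> selected;
  selected.reserve(ids.size());
  for (int id : ids) {
    // Same bounds check as the indexed loop, without the index variable.
    if (id < 0 || static_cast<std::size_t>(id) >= batch.values.size()) {
      return std::nullopt;
    }
    selected.push_back(batch.values[id]);
  }
  // The selected columns keep the length of the original batch.
  return ToyBatch{std::move(selected), batch.length};
}
```

One trade-off in this sketch: the indexed loop pre-sizes the output and assigns by position, while the range-based form uses `reserve` plus `push_back`. Both visit the ids in order and produce the same result.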
+    return ExecBatch(std::move(selected_values), length);
+  }

   /// \brief A convenience for returning the types from the batch.
   std::vector<TypeHolder> GetTypes() const {
     std::vector<TypeHolder> result;

Review comment: Naively, one looking at this might be confused why you now need both `ExecBatch::Make` and `ExecBatch::ExecBatch`, since both take a vector of values and a length.

Looking closer, it seems the `Make` function does the extra work of verifying that the length of the datums match the given length.

Could you add some comments explaining this for future readers?
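To illustrate the distinction the comment asks to have documented, here is a minimal sketch of a trusting constructor next to a validating factory. It does not use Arrow: `ToyBatch` and `ToyColumn` are hypothetical stand-ins, `std::optional` stands in for `Result`, and treating a negative length as "infer it from the first column" is an assumption about what the `-1` default means, not something the diff confirms.

```cpp
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a column of values; not an Arrow Datum.
using ToyColumn = std::vector<int>;

struct ToyBatch {
  std::vector<ToyColumn> values;
  std::int64_t length = 0;

  /// Trusting constructor: stores the length as given and performs no checks.
  /// The caller is responsible for passing a length that matches the columns.
  ToyBatch(std::vector<ToyColumn> v, std::int64_t len)
      : values(std::move(v)), length(len) {}

  /// Validating factory: verifies that every column has the requested length.
  /// Assumption: a negative length means "infer it from the first column".
  static std::optional<ToyBatch> Make(std::vector<ToyColumn> v,
                                      std::int64_t len = -1) {
    if (len < 0) {
      if (v.empty()) return std::nullopt;  // nothing to infer a length from
      len = static_cast<std::int64_t>(v.front().size());
    }
    for (const ToyColumn& col : v) {
      if (static_cast<std::int64_t>(col.size()) != len) {
        return std::nullopt;  // column length disagrees with the batch length
      }
    }
    return ToyBatch(std::move(v), len);
  }
};
```

In this sketch the doc comments carry the explanation the reviewer asked for: the constructor is the cheap path for callers that already know the length is correct, and `Make` is the path that checks it.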