Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-6090: [Rust] [DataFusion] Physical plan for HashAggregate #5191

Closed
wants to merge 6 commits into from

Conversation

andygrove
Copy link
Member

@andygrove andygrove commented Aug 25, 2019

This PR implements the HashAggregate execution plan.

@kszucs
Copy link
Member

kszucs commented Aug 29, 2019

@ursabot build

@andygrove andygrove force-pushed the ARROW-6090 branch 2 times, most recently from 94608b9 to 8464ff4 Compare September 15, 2019 17:42
@andygrove andygrove changed the title ARROW-6090: [Rust] [DataFusion] Physical plan for HashAggregate [WIP] ARROW-6090: [Rust] [DataFusion] Physical plan for HashAggregate and SUM aggregate function Sep 15, 2019
@andygrove andygrove changed the title ARROW-6090: [Rust] [DataFusion] Physical plan for HashAggregate and SUM aggregate function ARROW-6090: [Rust] [DataFusion] Physical plan for HashAggregate Sep 19, 2019
@andygrove
Copy link
Member Author

@paddyhoran @nevi-me @sunchao This PR has now been rebased and is ready for review. Thanks!

Copy link
Contributor

@paddyhoran paddyhoran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than the comments above LGTM.

I would just check the examples before 0.15 release as you have re-arranged a few things lately. I created this issue for running the DataFusion examples in CI.

@@ -1465,7 +1465,8 @@ mod tests {
}

fn load_csv(filename: &str, schema: &Arc<Schema>) -> Rc<RefCell<dyn Relation>> {
let ds = CsvBatchIterator::new(filename, schema.clone(), true, &None, 10);
let ds =
CsvBatchIterator::try_new(filename, schema.clone(), true, &None, 10).unwrap();
Copy link
Contributor

@paddyhoran paddyhoran Sep 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't you propagate this error instead of using unwrap, this is the error from File::open so it's reasonable that load functions would have to deal with this, including load_csv.

If there is a strong reason why you don't want to maybe just switch to expect.

Also, why are you not exposing has_header, etc.? In particular, why is 10 used as the batch size?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is test code to support the unit tests and not part of the actual product ... but you are right, it would be better to have this method return a Result. I will fix that tonight.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is test code to support the unit tests and not part of the actual product

Ahh, ok sorry. It's hard to see the context while reviewing on github. No need to change if it's test code.

Copy link
Contributor

@paddyhoran paddyhoran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending CI

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants