Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-15040: [R] Enable write_csv_arrow to take a Dataset or arrow_dplyr_query as input #11971

Closed
wants to merge 13 commits into from

Conversation

thisisnic
Copy link
Member

No description provided.

@github-actions
Copy link

@github-actions
Copy link

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

Copy link
Member

@jonkeane jonkeane left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this is exciting! One comment about types. It also looks like there are some(?) related segfaults in the CI

r/R/csv.R Outdated Show resolved Hide resolved
@paleolimbot
Copy link
Member

It may be worth rebasing this and seeing if it still segfaults! I know there were some updates to the Scanner that fixed some segfaults for me when I was writing tests for the Python bridge (from the Python side). Another way to write this test that would be independent of that is to Scan a Table or InMemoryDataset rather than a FileSystemDataset.

I'm also a fan of the "allow a general record batch reader" approach, which might also make this easier to test.

@paleolimbot
Copy link
Member

Also linking ARROW-15128 since it seems like a blocker for this PR.

@thisisnic thisisnic force-pushed the ARROW-15040_rb_reader branch from b45f9e6 to 43f08f3 Compare February 1, 2022 12:50
@thisisnic
Copy link
Member Author

Thanks for the suggestion @paleolimbot but the segfault is still happening after a rebase.

@thisisnic thisisnic force-pushed the ARROW-15040_rb_reader branch from ff62a45 to 01e1ae5 Compare February 1, 2022 18:10
@thisisnic
Copy link
Member Author

Have now updated this and removed the blocker ticket which isn't a blocker!

@thisisnic thisisnic requested a review from jonkeane February 2, 2022 13:37
r/R/csv.R Outdated Show resolved Hide resolved
r/R/csv.R Outdated Show resolved Hide resolved
r/tests/testthat/test-csv.R Outdated Show resolved Hide resolved
@thisisnic thisisnic changed the title ARROW-15040: [R] Enable write_csv_arrow to take a RecordBatchReader as input ARROW-15040: [R] Enable write_csv_arrow to take a Dataset or arrow_dplyr_query as input Feb 11, 2022
Copy link
Member

@nealrichardson nealrichardson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should also write a test for accepting RecordBatchReader, and need to update the docs too

r/R/csv.R Outdated Show resolved Hide resolved
r/R/csv.R Outdated Show resolved Hide resolved
r/R/csv.R Outdated Show resolved Hide resolved
thisisnic and others added 2 commits February 11, 2022 15:38
Copy link
Member

@nealrichardson nealrichardson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of final suggestions

r/tests/testthat/test-csv.R Outdated Show resolved Hide resolved
r/tests/testthat/test-csv.R Outdated Show resolved Hide resolved
@thisisnic thisisnic closed this in 13045f4 Mar 1, 2022
@ursabot
Copy link

ursabot commented Mar 1, 2022

Benchmark runs are scheduled for baseline = 676b49f and contender = 13045f4. 13045f4 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.63% ⬆️0.67%] test-mac-arm
[Finished ⬇️0.0% ⬆️3.21%] ursa-i9-9960x
[Finished ⬇️0.3% ⬆️0.68%] ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants