Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

write_csv/json/parquet isn't cancel safe #5178

Closed
DDtKey opened this issue Feb 4, 2023 · 1 comment · Fixed by #5196
Closed

write_csv/json/parquet isn't cancel safe #5178

DDtKey opened this issue Feb 4, 2023 · 1 comment · Fixed by #5196
Labels
bug Something isn't working

Comments

@DDtKey
Copy link
Contributor

DDtKey commented Feb 4, 2023

Describe the bug

DataFrame::write_csv/json/parquet isn't cancelling safe due to spawned tasks. See these lines: https://github.com/apache/arrow-datafusion/blob/master/datafusion/core/src/physical_plan/file_format/csv.rs#L300-L307

It's actually bad because impossible to wrap such functions to timeout or select! for example.

To Reproduce
Wrap it to select or timeout and check that spawned tasks are still working after cancelling by another branch.

Expected behavior

It should be safe-to-cancel these methods.

Additional context

Possible ways to fix:

  1. Use stop/cancel token/channel to stop spawned task on drop of outside variable. Like these: CancellationToken(with drop-guard) or stop-token
  2. Wrap spawned tasks(i.e JoinHandle) to future with Drop and call abort for them.

Not sure if there are any other spawns which should be covered as well.

@DDtKey DDtKey added the bug Something isn't working label Feb 4, 2023
@DDtKey DDtKey changed the title write_csv/json/parquet isn't cancelling safe write_csv/json/parquet isn't cancel safe Feb 4, 2023
@DDtKey
Copy link
Contributor Author

DDtKey commented Feb 6, 2023

I can prepare PR to cover this issue if nobody don't mind / already on it 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant