Commit
fix: Prevent truncated Parquet files in S3 after failed CreateMultipartUpload (#2993)

During a call to s3.to_parquet(), a multi-part upload is initiated if the size of the data exceeds 5 MB. If the S3 call to CreateMultipartUpload failed (for example, with a 503 SlowDown error), the incomplete Parquet data was still written to S3 using 'put_object' during close(). This left broken Parquet files in S3, which caused errors when they were queried by services like Athena. Now the data buffer is cleared at the end of the call to flush(), even when an exception occurs.
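The behaviour described above can be illustrated with a minimal sketch. This is not the library's actual code: `FakeS3Client`, `BufferedS3Writer`, and `MIN_PART_SIZE` are hypothetical names invented for illustration, and the fake client only mimics the shape of the S3 calls. The sketch assumes the fix amounts to clearing the buffer in a `finally` block inside flush(), so that close() has nothing left to send via put_object after a failed CreateMultipartUpload.

```python
import io

# Assumed multi-part threshold, taken from the "5 MB" in the commit message.
MIN_PART_SIZE = 5 * 1024 * 1024


class FakeS3Client:
    """Hypothetical in-memory stand-in for an S3 client (not boto3)."""

    def __init__(self, fail_create=False):
        self.fail_create = fail_create
        self.objects = {}  # key -> bytes written by put_object
        self.parts = []    # bytes uploaded as multi-part parts

    def create_multipart_upload(self, key):
        if self.fail_create:
            raise RuntimeError("503 SlowDown")
        return "upload-1"

    def upload_part(self, key, upload_id, body):
        self.parts.append(body)

    def put_object(self, key, body):
        self.objects[key] = body


class BufferedS3Writer:
    """Hypothetical buffered writer showing the clear-on-flush pattern."""

    def __init__(self, client, key):
        self.client = client
        self.key = key
        self.buffer = io.BytesIO()
        self.upload_id = None

    def write(self, data):
        self.buffer.write(data)

    def flush(self):
        if self.buffer.tell() < MIN_PART_SIZE:
            return  # small payloads stay buffered for close()
        try:
            if self.upload_id is None:
                self.upload_id = self.client.create_multipart_upload(self.key)
            self.client.upload_part(self.key, self.upload_id, self.buffer.getvalue())
        finally:
            # Clear the buffer even if the S3 call raised, so close()
            # cannot write partial data with put_object.
            self.buffer = io.BytesIO()

    def close(self):
        data = self.buffer.getvalue()
        if data and self.upload_id is None:
            self.client.put_object(self.key, data)
```

Without the `finally` clause, a failed flush() would leave the partial data in the buffer, and the put_object call in close() would write it to S3 as a truncated object.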
Showing 1 changed file with 36 additions and 31 deletions.