Fix BaseSQLToGCSOperator approx_max_file_size_bytes (#25469)
* Fix BaseSQLToGCSOperator approx_max_file_size_bytes

When the parquet file_format is used, `tmp_file_handle.tell()` always points to the beginning of the file after the data has been saved, because the file pointer is set to 0 after a parquet write operation. It is therefore not a good indicator of the file's current size.

Save the current file pointer position and seek to the end of the file (`os.SEEK_END`). file_size is set to the new position, and the file pointer then goes back to the saved position. This sequence allows file splitting when the export format is set to parquet.
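The save, seek, and restore sequence described above can be sketched roughly as follows. This is a minimal illustration, not the code from the change itself: the helper name `current_file_size` is hypothetical, and an in-memory `io.BytesIO` buffer with an explicit `seek(0)` stands in for the temporary file whose pointer sits at 0 after a parquet write.

```python
import io
import os


def current_file_size(file_handle):
    """Return the size of an open, seekable file without moving its pointer.

    Hypothetical helper for illustration; the name is an assumption.
    """
    saved_position = file_handle.tell()        # remember where the pointer is
    file_handle.seek(0, os.SEEK_END)           # jump to the end of the file
    file_size = file_handle.tell()             # offset at the end == size in bytes
    file_handle.seek(saved_position, os.SEEK_SET)  # restore the original position
    return file_size


# Stand-in for the temporary file: write 1024 bytes, then rewind to offset 0
# to mimic the pointer position described for the parquet case.
tmp_file_handle = io.BytesIO()
tmp_file_handle.write(b"x" * 1024)
tmp_file_handle.seek(0)

position_before = tmp_file_handle.tell()       # 0, so tell() alone reports no data
file_size = current_file_size(tmp_file_handle)  # 1024
position_after = tmp_file_handle.tell()        # still 0, pointer is unchanged
```

Restoring the saved position matters because later writes or reads on the same handle should continue from where they left off, whatever the export format.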