sftp_to_s3 stream file option #17609
Adds the option to stream the file directly from the SFTP client to S3 rather than first copying it to a local temporary file. This is required whenever the size of the file exceeds the temporary storage available on the worker.
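For illustration, the two transfer paths described above could look roughly like the sketch below. This is not the code from this PR: the function name `sftp_to_s3` and its parameters are hypothetical, and it assumes `sftp_client` behaves like a `paramiko.SFTPClient` (`get`, `open`) and `s3_client` like a boto3 S3 client (`upload_file`, `upload_fileobj`). The `use_temp_file` flag mirrors the attribute name mentioned in the commit message.

```python
from tempfile import NamedTemporaryFile


def sftp_to_s3(sftp_client, s3_client, remote_path, bucket, key, use_temp_file=True):
    """Copy one file from SFTP to S3 (hypothetical helper, not the operator itself).

    Assumes sftp_client is paramiko.SFTPClient-like and s3_client is a
    boto3 S3 client-like object; both are passed in by the caller.
    """
    if use_temp_file:
        # Download the whole file to local temporary storage first,
        # then upload the local copy. Needs enough free disk space.
        with NamedTemporaryFile("wb") as tmp:
            sftp_client.get(remote_path, tmp.name)
            s3_client.upload_file(tmp.name, bucket, key)
    else:
        # Hand the open remote file object straight to the S3 client,
        # so the data never touches the worker's local disk.
        with sftp_client.open(remote_path, "rb") as remote_file:
            s3_client.upload_fileobj(remote_file, bucket, key)
```

With `use_temp_file=False` the worker only needs memory for the chunks in flight, which is what makes files larger than the temporary storage transferable.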
fixed self.use_temp_file reference error
Two things:
Is there any advantage to saving the file in a local temporary file? I am wondering if it makes sense to just change the way the file is uploaded to S3, without offering the option to store a temporary file on the local system.
I think the main reason is implementation details of the So if you have a fast (local network) SFTP connection, downloading the file first and then uploading the local file might significantly speed up the transfer, as