Add ability for S3 commands to increase retry count #1092
Correct me if I am wrong: would this potentially help with the issue of a ~190GB upload to an S3 bucket in the US Standard region via aws s3 cp DATA.csv s3://BUCKET_NAME/data.csv? The upload creates about 900 parts and gets through about 15 of them before failing with: upload failed: ./DATA.csv to s3://BUCKET_NAME/data.csv. Thank you.
^ And the --debug output includes well over 5 retries (I meant to include that).
Is there any progress on, or a plan for, releasing this feature?
For the case of large files, it seems from this line that if any part of an upload fails, the whole thing is cancelled: https://github.com/aws/aws-cli/blob/develop/awscli/customizations/s3/tasks.py#L259 The problem here is that with an unreliable internet connection (e.g. one that drops every 10 minutes) and a large file, there is a very high chance that at least one part of a multipart upload will fail. This means the whole upload gets cancelled, i.e. a very low chance of success. Could these failed parts be re-queued instead of causing a cancellation?
Also, looking at the code, it seems there are retries only for downloads, not uploads: https://github.com/aws/aws-cli/blob/develop/awscli/customizations/s3/tasks.py. This means that despite the multipart upload feature, large files are very unlikely to succeed if there are issues with the network connection; if any part fails, the whole upload is cancelled.
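A minimal sketch of the re-queue idea described in the two comments above, written against boto3's low-level multipart API rather than the CLI's internals; the part size, per-part attempt limit, and helper name are illustrative assumptions, not anything from the aws-cli codebase:

```python
import os
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
PART_SIZE = 8 * 1024 * 1024  # illustrative; S3's minimum part size is 5 MiB

def upload_with_requeue(path, bucket, key, max_attempts_per_part=10):
    """Hypothetical helper: upload a file in parts, re-queueing a failed
    part instead of cancelling the whole multipart upload."""
    upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    size = os.path.getsize(path)
    num_parts = (size + PART_SIZE - 1) // PART_SIZE
    queue = [(n + 1, n * PART_SIZE) for n in range(num_parts)]
    etags, attempts = {}, {}
    try:
        with open(path, "rb") as f:
            while queue:
                part_number, offset = queue.pop(0)
                f.seek(offset)
                body = f.read(PART_SIZE)
                try:
                    resp = s3.upload_part(Bucket=bucket, Key=key,
                                          UploadId=upload_id,
                                          PartNumber=part_number, Body=body)
                    etags[part_number] = resp["ETag"]
                except ClientError:
                    attempts[part_number] = attempts.get(part_number, 0) + 1
                    if attempts[part_number] >= max_attempts_per_part:
                        raise  # give up only after many attempts on one part
                    queue.append((part_number, offset))  # re-queue, don't cancel
        s3.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": [{"PartNumber": n, "ETag": e}
                                       for n, e in sorted(etags.items())]})
    except Exception:
        s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
```

The point of the sketch is the queue discipline: a transient failure puts the part back at the end of the queue, so only a part that fails repeatedly aborts the whole upload.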
I'd be willing to work on this. However, I'd need some guidance:
@kyleknap I'm offering to work on this; if someone can answer my questions above, I can get going. There are two separate features, I guess:
Do you want me to create a new issue for part 1)?
Here are some responses to your previous questions:
I think it should be fine to keep tracking this on this issue; no need to open a new one. I think being able to hook into the botocore logic that I linked, with a user-provided value for max retries, would be the best approach, and I believe that is what James was referring to when he first opened the issue.
Great, thanks so much for the response; hopefully I'll get time to look at this over the Christmas period.
Working out how to configure the retry count was not straightforward. There is no documentation for how to do this kind of thing (https://botocore.readthedocs.org/en/latest/index.html), and I generally hold to the principle of "docs or it doesn't exist". But digging deeper, here is the chain I followed:
So I can't see any way to configure this programmatically without changes to botocore.
In case anyone else is looking for a workaround, I've found that the
@spookylukey Mind opening a botocore issue for us? Sounds like this would fit perfectly in the Config class: http://botocore.readthedocs.org/en/latest/reference/config.html
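For what it's worth, later botocore releases did grow a retries option on that Config class; a minimal sketch of setting it programmatically (the values are illustrative, and this API did not exist when this thread was written):

```python
import boto3
from botocore.config import Config

# Raise the retry ceiling for a single client rather than globally.
config = Config(
    retries={
        "max_attempts": 10,   # total attempts per request
        "mode": "standard",   # retry mode; "adaptive" also exists
    }
)
s3 = boto3.client("s3", config=config)
```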
Good Morning!

We're closing this issue here on GitHub as part of our migration to UserVoice for feature requests involving the AWS CLI. This will let us get the most important features to you by making it easier to search for and show support for the features you care the most about, without diluting the conversation with bug reports.

As a quick UserVoice primer (if not already familiar): after an idea is posted, people can vote on the ideas, and the product team will be responding directly to the most popular suggestions.

We've imported existing feature requests from GitHub; search for this issue there! And don't worry, this issue will still exist on GitHub for posterity's sake. As it's a text-only import of the original post into UserVoice, we'll still be keeping in mind the comments and discussion that already exist here on the GitHub issue.

GitHub will remain the channel for reporting bugs.

Once again, this issue can now be found by searching for the title on: https://aws.uservoice.com/forums/598381-aws-command-line-interface

-The AWS SDKs & Tools Team

This entry can specifically be found on UserVoice at: https://aws.uservoice.com/forums/598381-aws-command-line-interface/suggestions/33168364-add-ability-for-s3-commands-to-increase-retry-coun
Based on community feedback, we have decided to return feature requests to GitHub issues.
Hi, is there any movement on this? I have a spotty connection and am literally unable to download any file larger than a few hundred MiB from S3.
We've seen several issues opened where, due to a number of variables, the maximum number of attempts (currently 5) is too low. Causes include a less reliable WAN link, insufficient resources on the machine running the commands, the parallelism for S3 transfers being set too high, etc.
To help with this, we should provide some mechanism that allows a user to bump up the retry count. The main use case is transferring either a large number of files or large files; in these scenarios you're more willing to retry as many times as needed to get the request to succeed.
See:
#1065
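For readers arriving at this issue later: newer AWS CLI releases read retry settings from the shared config file (and the matching AWS_RETRY_MODE / AWS_MAX_ATTEMPTS environment variables), which addresses this request; the values below are illustrative:

```ini
# ~/.aws/config
[default]
retry_mode = standard
max_attempts = 10
```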