aws s3 cp --recursive not downloading all small files, when it says it does #3087
Comments
You rerun the commands with the |
Extremely large amount of output, let me look thru |
What are you looking for? A bunch of items relating to region redirects, and stuff like below
|
What version of the CLI were you using that was running into these errors? Sounds like an upgrade might have resolved it, but it would be good to know for sure. |
Closing due to inactivity. |
I have the same problem. Trying to copy all files from a bucket to another one with --recursive but at the end some of the files are not copied, and no error is reported.
|
I also have this issue. At the moment I have a feeling it might be to do with S3's eventual consistency model and me doing something weird when poking the files into S3. In case this helps anyone. |
Facing the same issue. ls shows all directories and files, but cp or sync only copies some of them, with no error. Also, there is no policy set up for inclusion or exclusion. |
The same problem here, awscli==1.16.74 @JordonPhillips please open this issue |
The same problem here, awscli==1.16.186 |
Maybe it is caused by the Python version. I had this problem on EC2 Linux with awscli 1.18 and Python 3.7. After installing Python 3.7 and awscli 1.19 with pip, it works: aws s3 cp downloads all files. |
Same issue: it works on the command line but not in a bash script, with no errors. |
@debu99 - if you are still experiencing this issue, can you please open up a new bug report with our template so we can get all of the information from you? A minimal reproducible example where it does work and does not work would be most useful. Thanks! |
I fixed the issue. It was caused by a \r at the end of every line in the file-list file. I removed it and used aws s3 sync, and now my bash script works. |
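The carriage-return fix described above can be sketched as follows. This is a minimal sketch, assuming the file list is a plain text file with one entry per line that was saved with Windows-style CRLF line endings; `strip_cr` is a hypothetical helper name, not something from the thread.

```shell
# Hypothetical helper: remove every carriage return (\r) from stdin.
# When a script reads the list line by line, a trailing \r becomes part of
# the key or path, so the name no longer matches the real object.
strip_cr() {
    tr -d '\r'
}

# Example with inline data instead of a real file list:
printf 'folder/a.txt\r\nfolder/b.txt\r\n' | strip_cr
```

With a real list this would be used as `strip_cr < filelist.txt > filelist.clean.txt` (both file names are placeholders), and the script would then read the cleaned file.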
Check your logs.
|
We are in the process of writing a few thousand ~500-byte files to an S3 bucket/folder.
Currently the console shows 280 files in the folder (we assume this is correct, as we have not written them all yet).
aws s3 ls bucket/folder/ --recursive
shows 280 files.
aws s3 cp bucket/folder/ ./ --recursive
reports on the command line that it copied 280 files. The number is printed explicitly, and I also counted the lines of "copied file" output in the console.
However, in macOS (right click, Get Info) the folder shows 211 files.
Additionally,
ls . | wc -l
shows 211 files.
I have tried reducing the number of concurrent requests to 1. I cannot understand why it would report completion when the files do not exist locally. This is over Wi-Fi, by the way, so if there is no check that a file was downloaded correctly, maybe that's it? Unsure where to look next.
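To see which of the 280 listed objects are the ones missing locally, rather than only comparing counts, the saved listing can be checked against the disk. This is a minimal sketch, assuming the listing lines have the usual `aws s3 ls --recursive` shape (date, time, size, key) and that the copy placed files relative to the prefix; `missing_files` and its arguments are hypothetical names.

```shell
# Hypothetical helper: print every key from a saved `aws s3 ls --recursive`
# listing that has no matching path under the given local directory.
# Assumes listing lines look like:
#   2017-04-20 10:15:32        512 folder/a.txt
missing_files() {
    listing=$1    # file holding the saved listing
    local_dir=$2  # directory the files were copied into
    prefix=$3     # key prefix that is not reproduced locally, e.g. "folder/"
    # Drop the date, time, and size columns so only the key remains.
    sed -E 's/^[0-9-]+ +[0-9:]+ +[0-9]+ //' "$listing" |
    while IFS= read -r key; do
        rel=${key#"$prefix"}
        # -e succeeds only if the path exists on the local disk.
        [ -e "$local_dir/$rel" ] || printf '%s\n' "$key"
    done
}
```

For example, after saving the output of the ls command above to `listing.txt` (a placeholder name), `missing_files listing.txt . folder/` would print the keys that have no local counterpart. Unlike `ls . | wc -l`, this also looks inside subdirectories.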
UPDATE: We put together a hack to get around this bug... basically, if you run:
aws s3 ls bucket/folder/ --recursive --dryrun >> filestodownload.txt
It shows all the files you want to download and saves them in a text file that's easy to parse. We then parsed it and ran a separate aws s3 cp command for each individual file. All downloaded successfully, albeit slowly, so we are now working on concurrency. But files are still MIA, even though we have local checks after each download that the file size is > 0, and all of them succeed.
Really confusing...
UPDATE 2: We spun up an AWS AMI and repeated the process above. It worked as expected. IDK, it seems to be related to Mac? I am on 10.12.4 Sierra.
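Since the same copy worked on a Linux AMI but not on the Mac, one thing worth ruling out is a name collision: the default macOS file system is case-insensitive, so two keys that differ only in letter case would end up as one local file while each download still reports success. This is only a hypothesis and is not confirmed anywhere in this thread. A minimal sketch of the check, assuming one key per line on stdin; `case_collisions` is a hypothetical helper name.

```shell
# Hypothetical helper: read one key per line and print each lower-cased name
# that more than one input line maps to, i.e. keys differing only in case.
case_collisions() {
    tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Example with inline data: Report.txt and report.txt collide, data.txt does not.
printf 'folder/Report.txt\nfolder/report.txt\nfolder/data.txt\n' | case_collisions
```

If this prints nothing for the real key list, case collisions are not the cause and the difference between the Mac and the AMI has to be looked for elsewhere.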