amazon.aws.s3_object copy mode repeatedly updates objects that were uploaded in multiple parts #2016

Closed
1 task done
colin-nolan opened this issue Mar 12, 2024 · 2 comments · Fixed by #2024
Labels
jira verified WIP Work in progress

colin-nolan commented Mar 12, 2024

Summary

Using amazon.aws.s3_object in copy mode with objects that were uploaded in multiple parts (as happens, for example, with uploads via the web UI) results in the objects being copied every time the module runs, including when the corresponding object already exists in the target bucket with the same content.

I suspect that the issue is due to how the ETag is generated for the source versus how it is generated for the copy.

Issue Type

Bug Report

Component Name

s3_object

Ansible Version

$ ansible --version
ansible [core 2.16.4]
  config file = <redacted>
  configured module search path = <redacted>
  ansible python module location = <redacted>
  ansible collection location = <redacted>
  executable location = <redacted>
  python version = 3.12.1 (main, Mar  5 2024, 15:57:44) [Clang 15.0.0 (clang-1500.1.0.2.5)] (<redacted>)
  jinja version = 3.1.3
  libyaml = True

Collection Versions

$ ansible-galaxy collection list

# <redacted>/ansible_collections
Collection                               Version
---------------------------------------- -------
amazon.aws                               7.4.0  

# <redacted>/ansible_collections
Collection                               Version
---------------------------------------- -------
amazon.aws                               7.3.0  
ansible.netcommon                        5.3.0  
ansible.posix                            1.5.4  
ansible.utils                            2.12.0 
ansible.windows                          2.2.0  
arista.eos                               6.2.2  
awx.awx                                  23.8.1 
azure.azcollection                       1.19.0 
check_point.mgmt                         5.2.2  
chocolatey.chocolatey                    1.5.1  
cisco.aci                                2.8.0  
cisco.asa                                4.0.3  
cisco.dnac                               6.11.0 
cisco.intersight                         2.0.7  
cisco.ios                                5.3.0  
cisco.iosxr                              6.1.1  
cisco.ise                                2.7.0  
cisco.meraki                             2.17.2 
cisco.mso                                2.5.0  
cisco.nxos                               5.3.0  
cisco.ucs                                1.10.0 
cloud.common                             2.1.4  
cloudscale_ch.cloud                      2.3.1  
community.aws                            7.1.0  
community.azure                          2.0.0  
community.ciscosmb                       1.0.7  
community.crypto                         2.18.0 
community.digitalocean                   1.26.0 
community.dns                            2.8.1  
community.docker                         3.8.0  
community.general                        8.4.0  
community.grafana                        1.8.0  
community.hashi_vault                    6.1.0  
community.hrobot                         1.9.0  
community.library_inventory_filtering_v1 1.0.0  
community.libvirt                        1.3.0  
community.mongodb                        1.7.1  
community.mysql                          3.9.0  
community.network                        5.0.2  
community.okd                            2.3.0  
community.postgresql                     3.4.0  
community.proxysql                       1.5.1  
community.rabbitmq                       1.2.3  
community.routeros                       2.13.0 
community.sap                            2.0.0  
community.sap_libs                       1.4.2  
community.sops                           1.6.7  
community.vmware                         4.2.0  
community.windows                        2.1.0  
community.zabbix                         2.3.1  
containers.podman                        1.12.0 
cyberark.conjur                          1.2.2  
cyberark.pas                             1.0.25 
dellemc.enterprise_sonic                 2.4.0  
dellemc.openmanage                       8.7.0  
dellemc.powerflex                        2.1.0  
dellemc.unity                            1.7.1  
f5networks.f5_modules                    1.28.0 
fortinet.fortimanager                    2.4.0  
fortinet.fortios                         2.3.5  
frr.frr                                  2.0.2  
gluster.gluster                          1.0.2  
google.cloud                             1.3.0  
grafana.grafana                          2.2.5  
hetzner.hcloud                           2.5.0  
hpe.nimble                               1.1.4  
ibm.qradar                               2.1.0  
ibm.spectrum_virtualize                  2.0.0  
ibm.storage_virtualize                   2.2.0  
infinidat.infinibox                      1.4.3  
infoblox.nios_modules                    1.6.1  
inspur.ispim                             2.2.0  
inspur.sm                                2.3.0  
junipernetworks.junos                    5.3.1  
kubernetes.core                          2.4.1  
lowlydba.sqlserver                       2.3.1  
microsoft.ad                             1.4.1  
netapp.aws                               21.7.1 
netapp.azure                             21.10.1
netapp.cloudmanager                      21.22.1
netapp.elementsw                         21.7.0 
netapp.ontap                             22.10.0
netapp.storagegrid                       21.12.0
netapp.um_info                           21.8.1 
netapp_eseries.santricity                1.4.0  
netbox.netbox                            3.17.0 
ngine_io.cloudstack                      2.3.0  
ngine_io.exoscale                        1.1.0  
openstack.cloud                          2.2.0  
openvswitch.openvswitch                  2.1.1  
ovirt.ovirt                              3.2.0  
purestorage.flasharray                   1.26.0 
purestorage.flashblade                   1.15.0 
purestorage.fusion                       1.6.1  
sensu.sensu_go                           1.14.0 
splunk.es                                2.1.2  
t_systems_mms.icinga_director            2.0.1  
telekom_mms.icinga_director              1.35.0 
theforeman.foreman                       3.15.0 
vmware.vmware_rest                       2.3.1  
vultr.cloud                              1.12.1 
vyos.vyos                                4.1.0  
wti.remote                               1.0.5  

AWS SDK versions

$ pip show boto boto3 botocore
WARNING: Package(s) not found: boto
Name: boto3
Version: 1.34.59
Summary: The AWS SDK for Python
Home-page: https://github.com/boto/boto3
Author: Amazon Web Services
Author-email: 
License: Apache License 2.0
Location: <redacted>/.venv/lib/python3.12/site-packages
Requires: botocore, jmespath, s3transfer
Required-by: 
---
Name: botocore
Version: 1.34.59
Summary: Low-level, data-driven core of boto 3.
Home-page: https://github.com/boto/botocore
Author: Amazon Web Services
Author-email: 
License: Apache License 2.0
Location: <redacted>/.venv/lib/python3.12/site-packages
Requires: jmespath, python-dateutil, urllib3
Required-by: boto3, s3transfer

Configuration

$ ansible-config dump --only-changed
CONFIG_FILE() = <redacted>/ansible.cfg
DEFAULT_INVENTORY_PLUGIN_PATH(<redacted>/ansible.cfg) = ['<redacted>/ansible/plugins/inventory']
DUPLICATE_YAML_DICT_KEY(<redacted>/ansible/ansible.cfg) = ignore
INVENTORY_IGNORE_EXTS(<redacted>/ansible/ansible.cfg) = ["{{(REJECT_EXTS + ('.orig'", '.cfg', "'.retry'))}}"]

OS / Environment

MacOS 14.3 (23D56)

Steps to Reproduce

  1. Create a file smaller than 5GB (the copy limit), e.g. head -c 64MB < /dev/zero > 64MB-zero.bin
  2. Perform a multipart upload of the file to an S3 bucket. The web UI appears to upload it in 16MB parts.
  3. Observe the ETag, e.g. for the 64MB zero file, I have 05c46bd967d2892191397a04e43821b9-4. According to Amazon:

     Amazon S3 calculates the MD5 digest of each individual part. MD5 digests are used to determine the ETag for the final object. Amazon S3 concatenates the bytes for the MD5 digests together and then calculates the MD5 digest of these concatenated values. The final step in creating the ETag is when Amazon S3 adds a dash with the total number of parts to the end.

  4. Use amazon.aws.s3_object to copy the file to another bucket:

     - name: copy file that was uploaded in parts
       amazon.aws.s3_object:
         bucket: target-bucket
         mode: copy
         copy_src:
           bucket: source-bucket
           prefix: 64MB-zero.bin

  5. Repeat the above and observe that the task reports a change each time (i.e. it's not idempotent). The timestamp on the target object is updated but the contents are not, suggesting that a copy operation occurred needlessly.
  6. Observe that the ETag on the target bucket does not match that of the source: it is not in multipart form, e.g. the zeros file has the ETag e78585b8bfda6036cfd818710a210f23 (MD5 of 64MB of zeros).
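
The ETag calculation quoted from Amazon in step 3 can be sketched in Python as follows. This is only an illustration of that description, not code from the module: the function names `multipart_etag` and `plain_md5` are made up here, and it assumes fixed-size parts with a possibly shorter final part.

```python
import hashlib


def multipart_etag(data: bytes, part_size: int) -> str:
    """Return an S3-style multipart ETag for data split into fixed-size parts."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    # Raw (binary, not hex) MD5 digest of each individual part.
    digests = [
        hashlib.md5(data[offset:offset + part_size]).digest()
        for offset in range(0, len(data), part_size)
    ]
    # MD5 of the concatenated digests, then a dash and the number of parts.
    combined = hashlib.md5(b"".join(digests)).hexdigest()
    return f"{combined}-{len(digests)}"


def plain_md5(data: bytes) -> str:
    """Hex MD5 of the whole content, the form observed on the copied object."""
    return hashlib.md5(data).hexdigest()


if __name__ == "__main__":
    data = b"\0" * 4096  # small stand-in for the 64MB file of zeros
    print(multipart_etag(data, 1024))  # four parts, so the value ends in "-4"
    print(plain_md5(data))             # no suffix, and a different digest
```

The same bytes therefore yield two different ETag strings depending on how they were written, which is consistent with the mismatch observed in step 6.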

Expected Results

The module is idempotent and does not repeatedly copy identical files.

Actual Results

A copy operation is performed on every run for files uploaded in multiple parts, regardless of their state in the target bucket.
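
To illustrate why this would happen, here is a minimal sketch that assumes the decision to copy is made by comparing ETag strings. That is what the issue suspects, not something confirmed here, and the helper names (`normalize_etag`, `part_count`, `naive_needs_copy`) are hypothetical. The two ETag values are the ones reported in the steps above.

```python
def normalize_etag(etag: str) -> str:
    """Strip the double quotes that S3 places around ETag values."""
    return etag.strip('"')


def part_count(etag: str) -> int:
    """Number of parts encoded in a multipart ETag, or 1 if there is no suffix."""
    _, dash, suffix = normalize_etag(etag).partition("-")
    return int(suffix) if dash else 1


def naive_needs_copy(source_etag: str, target_etag: str) -> bool:
    """Hypothetical check: copy whenever the two ETag strings differ."""
    return normalize_etag(source_etag) != normalize_etag(target_etag)


if __name__ == "__main__":
    source = '"05c46bd967d2892191397a04e43821b9-4"'  # multipart upload
    target = '"e78585b8bfda6036cfd818710a210f23"'    # result of the copy
    print(part_count(source), part_count(target))
    print(naive_needs_copy(source, target))
```

Under that assumption the check never settles: the copy always produces a plain MD5 ETag, which can never equal the source's multipart ETag, so every run looks like a change.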

Code of Conduct

  • I agree to follow the Ansible Code of Conduct

@GomathiselviS added the needs_verified and jira labels and removed the needs_triage label on Mar 19, 2024
@abikouo self-assigned this on Mar 21, 2024
@abikouo added the WIP and verified labels and removed the needs_verified label on Mar 21, 2024
@colin-nolan
Author

Thank you for taking the time to work on this @abikouo.

@abikouo
Contributor

abikouo commented Mar 25, 2024

@colin-nolan could you please try this with #2024? Thanks

softwarefactory-project-zuul bot pushed a commit that referenced this issue Apr 11, 2024
…2024)

s3_object - fix copy idempotency issue with multipart upload object

SUMMARY

Fixes #2016
To ensure idempotency when copying objects created with a multipart upload, the source object's ETag is computed locally from the object content, because it differs from the value stored in the object's header on AWS.

ISSUE TYPE


Bugfix Pull Request

COMPONENT NAME

s3_object
ADDITIONAL INFORMATION

Reviewed-by: Colin Nolan
Reviewed-by: Bikouo Aubin
Reviewed-by: Helen Bailey <[email protected]>
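
The approach described in the commit summary above might look roughly like the following sketch. It is not the code from #2024: `comparable_checksum`, `needs_copy` and the `read_source` callback are hypothetical names, and the sketch assumes the target object carries a plain MD5 ETag, as observed in the issue (this would not hold, for example, for objects encrypted with SSE-KMS). It also reads the whole object into memory, which a real implementation would likely avoid.

```python
import hashlib
import re
from typing import Callable, Optional

# 32 hex digits, a dash, then the part count, e.g. "05c4...21b9-4".
MULTIPART_ETAG = re.compile(r"^[0-9a-f]{32}-\d+$")


def comparable_checksum(etag: str, read_content: Callable[[], bytes]) -> str:
    """Return a checksum that can be compared with a plain MD5 ETag.

    For a multipart ETag, the MD5 of the content is computed locally;
    otherwise the (unquoted) ETag is used as-is.
    """
    etag = etag.strip('"')
    if MULTIPART_ETAG.match(etag):
        return hashlib.md5(read_content()).hexdigest()
    return etag


def needs_copy(
    source_etag: str,
    target_etag: Optional[str],
    read_source: Callable[[], bytes],
) -> bool:
    """Decide whether the source must be copied (hypothetical helper)."""
    if target_etag is None:  # target object does not exist yet
        return True
    return comparable_checksum(source_etag, read_source) != target_etag.strip('"')


if __name__ == "__main__":
    content = b"\0" * 4096
    source_etag = '"' + "0" * 32 + '-4"'  # placeholder multipart ETag
    target_etag = '"' + hashlib.md5(content).hexdigest() + '"'
    print(needs_copy(source_etag, target_etag, lambda: content))
```

The content is only read when the source ETag has the multipart form, so objects uploaded in a single request keep the cheap header-only comparison.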
patchback bot pushed a commit that referenced this issue Apr 11, 2024
…2024)

s3_object - fix copy idempotency issue with multipart upload object

(cherry picked from commit 37804cc)
softwarefactory-project-zuul bot pushed a commit that referenced this issue Apr 19, 2024
…2024) (#2053)

[PR #2024/37804cc7 backport][stable-7] s3_object - fix copy idempotency issue with multipart upload object

This is a backport of PR #2024 as merged into main (37804cc).
SUMMARY

Fixes #2016
To ensure idempotency when copying objects created with a multipart upload, the source object's ETag is computed locally from the object content, because it differs from the value stored in the object's header on AWS.

ISSUE TYPE


Bugfix Pull Request

COMPONENT NAME

s3_object
ADDITIONAL INFORMATION

Reviewed-by: Helen Bailey <[email protected]>
Reviewed-by: Mark Chappell
abikouo added a commit to abikouo/amazon.aws that referenced this issue Oct 15, 2024
…-collections#2150)

SUMMARY
Closes ansible-collections#2120
Closes ansible-collections#2019
Closes ansible-collections#2016
Prepare modules autoscaling_instance_refresh and autoscaling_instance_refresh_info for promotion:

  • Refactor modules to use common code from ansible_collections.amazon.aws.plugins.module_utils.autoscaling
  • Add type hinting
  • Update integration tests

ISSUE TYPE

Feature Pull Request

Reviewed-by: GomathiselviS
Reviewed-by: Bikouo Aubin
Reviewed-by: Alina Buzachis

This commit was initially merged in https://github.com/ansible-collections/community.aws
See: ansible-collections/community.aws@d59fa93
abikouo added a commit to abikouo/amazon.aws that referenced this issue Oct 16, 2024
…-collections#2150)
