(aws-cdk/lib/api/garbage-collection): (Garbage collection for ECR prints incorrect number of assets/images deleted and runs indefinitely) #32498
Comments
Hi @tainsworth102, thanks for reaching out. Although I did not have that many images, I tried with a small set. Here is a snippet (since I had only 2 images, it asked to confirm the deletion and then deleted them) - I assumed it would have tagged the images as well, since they had not been used for a long time, but the After going through the CDK doc for garbage collection, it seems that -
So considering these, AFAIU, with your command CDK GC should tag the image, which does not seem to be the case. I will try to reproduce this scenario with more images and share my findings.
@tainsworth102, I tried deleting images in another region and got this -
It seems there is an issue with tagging for the images that are being deleted. However, I am not sure how the percentage would be displayed when thousands of images exist. For smaller numbers, the percentage seems correct. I am marking this issue as P2, which means it will be on the team's radar but won't be addressed immediately. Contributions toward a resolution are welcome from the community as well as the team. Hope that is helpful.
I was able to reproduce the bug, and I'll open a PR to fix it.
…e collector for ECR (#32679)

### Issue # (if applicable)

Closes #32498

### Reason for this change

When `listImagesCommand` returns nextToken in the `readRepoInBatches` function, nextToken is not passed as an argument for the subsequent `listImagesCommand` execution, causing `listImagesCommand` to continue executing.

https://github.com/aws/aws-cdk/blob/v2.173.4/packages/aws-cdk/lib/api/garbage-collection/garbage-collector.ts#L621

According to the `listImagesCommand` documentation, if maxResults is not specified, a maximum of 100 images will be returned, so this bug requires at least 100 images in the asset repository.

https://docs.aws.amazon.com/AWSJavaScriptSDK/v3/latest/Package/-aws-sdk-client-ecr/Interface/ListImagesCommandInput/

#### Reproduction Steps

The following bash script and Dockerfile saved locally and executed, will push 120 container images to the asset repository.

```bash
#!/usr/bin/env bash
set -eu

ACCOUNT_ID="your account id"
REGION="your region"
REPO_NAME="cdk-hnb659fds-container-assets-${ACCOUNT_ID}-${REGION}"
IMAGE_NAME="test-image"
AWS_PROFILE="your AWS profile"

echo "Logging in to ECR..."
aws ecr get-login-password --region "${REGION}" --profile "${AWS_PROFILE}" \
  | docker login --username AWS --password-stdin "${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com"

for i in $(seq 1 120); do
  hash=$(head -c 32 /dev/urandom | xxd -p -c 64)
  echo "Building and pushing image with tag: ${hash}"
  touch "${i}.txt"
  docker build \
    --build-arg BUILD_NO="${i}" \
    -t "${IMAGE_NAME}:${i}" \
    .
  docker tag "${IMAGE_NAME}:${i}" \
    "${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/${REPO_NAME}:${hash}"
  docker push \
    "${ACCOUNT_ID}.dkr.ecr.${REGION}.amazonaws.com/${REPO_NAME}:${hash}"
  rm "${i}.txt"
  sleep 0.01
done

echo "Done!"
```

```dockerfile
FROM scratch
ARG BUILD_NO
ENV BUILD_NO=${BUILD_NO}
COPY ${BUILD_NO}.txt /
```

You can reproduce this bug by running the following command after the images have been pushed.

```bash
$ cdk gc aws://{account id}/{region} --type ecr --unstable=gc --created-buffer-days 0 --action full --confirm=true
```

### Description of changes

Fix the problem of correctly handling nextToken when executing `listImagesCommand` in the `readRepoInBatches` function.

### Describe any new or updated permissions being added

Nothing.

### Description of how you validated changes

Verifying that this bug has been fixed using the CLI integration tests is difficult, so only unit tests are added.

### Checklist

- [x] My code adheres to the [CONTRIBUTING GUIDE](https://github.com/aws/aws-cdk/blob/main/CONTRIBUTING.md) and [DESIGN GUIDELINES](https://github.com/aws/aws-cdk/blob/main/docs/DESIGN_GUIDELINES.md)

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
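For readers who want to see the token-forwarding problem in isolation, here is a minimal sketch. It does not use the real SDK or the actual `readRepoInBatches` code: `ListPage`, `makeFakeRepo`, `listAll`, and `listAllWithoutToken` are hypothetical names, the listing call is synchronous (the real SDK call is asynchronous), and the page size of 100 is taken from the documentation note quoted above.

```typescript
// Hypothetical shapes, loosely modelled on a paginated "list images" call.
interface ListPageInput { nextToken?: string }
interface ListPageOutput { imageIds: string[]; nextToken?: string }
type ListPage = (input: ListPageInput) => ListPageOutput;

// In-memory stand-in for a repository that returns at most `pageSize` items per call.
function makeFakeRepo(total: number, pageSize = 100): { listPage: ListPage; calls: () => number } {
  const ids = Array.from({ length: total }, (_, i) => `image-${i}`);
  let calls = 0;
  const listPage: ListPage = ({ nextToken }) => {
    calls++;
    // The token is simply the index of the next unread item (an assumption of this sketch).
    const start = nextToken ? Number(nextToken) : 0;
    const end = Math.min(start + pageSize, ids.length);
    return {
      imageIds: ids.slice(start, end),
      nextToken: end < ids.length ? String(end) : undefined,
    };
  };
  return { listPage, calls: () => calls };
}

// Correct loop: the token from the previous page is forwarded to the next call.
function listAll(listPage: ListPage): string[] {
  const all: string[] = [];
  let nextToken: string | undefined;
  do {
    const page = listPage({ nextToken });
    all.push(...page.imageIds);
    nextToken = page.nextToken;
  } while (nextToken !== undefined);
  return all;
}

// Buggy loop: the token is read but never sent back, so the first page repeats.
// `maxCalls` only exists so the sketch terminates.
function listAllWithoutToken(listPage: ListPage, maxCalls: number): string[] {
  const all: string[] = [];
  let nextToken: string | undefined;
  let calls = 0;
  do {
    const page = listPage({}); // bug: nextToken omitted
    all.push(...page.imageIds);
    nextToken = page.nextToken;
    calls++;
  } while (nextToken !== undefined && calls < maxCalls);
  return all;
}
```

With a fake repository of 120 images, `listAll` finishes after two calls, while `listAllWithoutToken` keeps re-reading the same first 100 images until its call cap is reached. In this sketch the scanned count grows without the set of distinct images growing.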
Comments on closed issues and PRs are hard for our team to see.
Describe the bug
When running cdk gc for ECR, the number of assets is misprinted and the number of files scanned exceeds the number in the bootstrap repository. As a result, the percentage of files scanned exceeds 100.00% and the command runs indefinitely. Additionally, images are not tagged in any attempted run; instead it jumps straight to deleting a seemingly random number of unused images, which is not reflected in the printed output.
Regression Issue
Last Known Working CDK Version
No response
Expected Behavior
The printed output should have stated:
[100.00%] 136 files scanned: 3 assets (0.56 GiB) tagged, 0 assets (0.00 GiB) deleted.
Current Behavior
[735.29%] 1000 files scanned: 0 assets (0.00 GiB) tagged, 30 assets (5.63 GiB) deleted.
The printed output was incorrect and rather than tagging it began deleting straight away.
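The two percentages above are consistent with a simple scanned-over-total ratio: 1000 scanned files against the 136 actually in the repository gives 735.29%. The sketch below only illustrates that arithmetic; `scannedPercent` is a hypothetical helper, not the CLI's actual progress code.

```typescript
// Hypothetical helper: scanned count as a percentage of the expected total,
// formatted with two decimals like the progress line in the report.
function scannedPercent(scanned: number, total: number): string {
  if (total <= 0) {
    return "0.00"; // assumption: avoid dividing by zero for an empty repository
  }
  return ((scanned / total) * 100).toFixed(2);
}

console.log(`[${scannedPercent(136, 136)}%]`);  // prints "[100.00%]"
console.log(`[${scannedPercent(1000, 136)}%]`); // prints "[735.29%]"
```

Under this assumption, any value above 100% means the scanned counter has exceeded the number of images that really exist in the repository.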
Reproduction Steps
cdk gc aws://<my-account-id>/<my-only-used-region> --type ecr --unstable=gc --created-buffer-days 0 --action full --confirm=true
Possible Solution
No response
Additional Information/Context
When running this for an account which has an ECR repo with ~8000 images, the progress output is again displayed incorrectly. As the progress counter cycled and increased with the loop, the number of progress lines printed for each iteration doubled, and the number of images deleted cumulatively decreased until no images were deleted in an iteration.
CDK CLI Version
2.172.0
Framework Version
No response
Node.js Version
v20.11.1
OS
Ubuntu 22.04.4 LTS
Language
Python
Language Version
No response
Other information
No response