(ECR): (Custom::ECRAutoDeleteImages fails on repo rename) #26711

Closed
duffzm opened this issue Aug 10, 2023 · 2 comments · Fixed by #26742
Labels
@aws-cdk/aws-ecr (Related to Amazon Elastic Container Registry); bug (This issue is a bug.); effort/medium (Medium work item – several days of effort); p1

duffzm commented Aug 10, 2023

Describe the bug

Within the same deployment environment, I run cdk deploy for my ECR repository and image stacks:

  1. first using code from the default branch, and then
  2. again (after that deployment succeeds) using code from a feature branch.

The default branch stacks deploy successfully, but the feature branch deployment fails to update my stacks because of what seems to be a renaming issue on ECR repositories in connection with Custom::ECRAutoDeleteImages resources. This is the first error:

The following resource(s) failed to update: [basimagerepoAutoDeleteImagesCustomResource9BC77A26].

Because of

Received response status [FAILED] from custom resource. Message returned: AccessDeniedException: User: arn:aws:sts::1234567890:assumed-role/my-stack-prefix-CustomECRAutoDeleteImage-1ESL2R8Y9CLAL/my-stack-prefix-infrastr-CustomECRAutoDeleteImage-D9sNwuezjpai is not authorized to perform: ecr:DescribeRepositories on resource: arn:aws:ecr:us-east-1:543276908693:repository/my/base_image because no identity-based policy allows the ecr:DescribeRepositories action at throwDefaultError (/var/runtime/node_modules/@aws-sdk/smithy-client/dist-cjs/default-error-handler.js:8:22) at deserializeAws_json1_1DescribeRepositoriesCommandError (/var/runtime/node_modules/@aws-sdk/client-ecr/dist-cjs/protocols/Aws_json1_1.js:1212:51) at process.processTicksAndRejections (node:internal/process/task_queues:95:5) at async /var/runtime/node_modules/@aws-sdk/middleware-serde/dist-cjs/deserializerMiddleware.js:7:24 at async /var/runtime/node_modules/@aws-sdk/middleware-signing/dist-cjs/middleware.js:13:20 at async StandardRetryStrategy.retry (/var/runtime/node_modules/@aws-sdk/middleware-retry/dist-cjs/StandardRetryStrategy.js:51:46) at async /var/runtime/node_modules/@aws-sdk/middleware-logger/dist-cjs/loggerMiddleware.js:6:22 at async f (/var/task/index.js:1:3557) at async Runtime.handler (/var/task/index.js:1:1152) (RequestId: af95f9de-6958-4e02-af53-84a6a54d9b2c)

and then, when the rollback is attempted, the rollback fails on

The following resource(s) failed to update: [baseimagerepoAutoDeleteImagesCustomResource9BC77A26].

because of

Received response status [FAILED] from custom resource. Message returned: AccessDeniedException: User: arn:aws:sts::1234567890:assumed-role/my-stack-prefix-CustomECRAutoDeleteImage-1ESL2R8Y9CLAL/my-stack-prefix-CustomECRAutoDeleteImage-D9sNwuezjpai is not authorized to perform: ecr:DescribeRepositories on resource: arn:aws:ecr:us-east-1:1234567890:repository/my/new/base_image because no identity-based policy allows the ecr:DescribeRepositories action at throwDefaultError (/var/runtime/node_modules/@aws-sdk/smithy-client/dist-cjs/default-error-handler.js:8:22) at deserializeAws_json1_1DescribeRepositoriesCommandError (/var/runtime/node_modules/@aws-sdk/client-ecr/dist-cjs/protocols/Aws_json1_1.js:1212:51) at process.processTicksAndRejections (node:internal/process/task_queues:95:5) at async /var/runtime/node_modules/@aws-sdk/middleware-serde/dist-cjs/deserializerMiddleware.js:7:24 at async /var/runtime/node_modules/@aws-sdk/middleware-signing/dist-cjs/middleware.js:13:20 at async StandardRetryStrategy.retry (/var/runtime/node_modules/@aws-sdk/middleware-retry/dist-cjs/StandardRetryStrategy.js:51:46) at async /var/runtime/node_modules/@aws-sdk/middleware-logger/dist-cjs/loggerMiddleware.js:6:22 at async f (/var/task/index.js:1:3557) at async Runtime.handler (/var/task/index.js:1:1152) (RequestId: a6111bf7-1f99-4a5c-b9c4-224a039a04a0)

Note 1: The two error messages differ in that the initial failure includes my old repo name, whereas the rollback failure includes the new name. Here they are side by side. The first is from UPDATE_FAILED, the second from UPDATE_ROLLBACK_FAILED.

...User: arn:aws:sts::1234567890:assumed-role/my-stack-prefix-CustomECRAutoDeleteImage-1ESL2R8Y9CLAL/my-stack-prefix-CustomECRAutoDeleteImage-D9sNwuezjpai is not authorized to perform: ecr:DescribeRepositories on resource: arn:aws:ecr:us-east-1:1234567890:repository/my/base_image
...User: arn:aws:sts::1234567890:assumed-role/my-stack-prefix-CustomECRAutoDeleteImage-1ESL2R8Y9CLAL/my-stack-prefix-CustomECRAutoDeleteImage-D9sNwuezjpai is not authorized to perform: ecr:DescribeRepositories on resource: arn:aws:ecr:us-east-1:1234567890:repository/my/new/base_image

Note 2: All the newly named repos/images I created in ECR appear to have deployed successfully, even though some of them belonged to the failed stacks. So the issue seems to be isolated to the Custom::ECRAutoDeleteImages resource.

Note 3: Termination protection is not enabled on these tasks or repositories. Specifically, our ECR construct sets:

removal_policy = aws_cdk.RemovalPolicy.DESTROY
auto_delete_images = True

Expected Behavior

  1. Previously named repos are destroyed along with any images they contain
  2. New repos/images are created successfully
  3. Custom::ECRAutoDeleteImages is updated successfully

Current Behavior

See the description above; the full errors are logged there.

Reproduction Steps

cdk deploy with

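import aws_cdk
from aws_cdk.aws_ecr import Repository

# scope, construct_id, and kwargs come from the enclosing construct (not shown)
props = {}
props["repository_name"] = "name1"
props["removal_policy"] = aws_cdk.RemovalPolicy.DESTROY
props["auto_delete_images"] = True

Repository(
    scope=scope,
    id=construct_id,
    **props,
    **kwargs,
)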

then, presumably, another deploy (in the same stack) with

# Only the repository name differs from the first deploy
props = {}
props["repository_name"] = "name2"
props["removal_policy"] = aws_cdk.RemovalPolicy.DESTROY
props["auto_delete_images"] = True

Repository(
    scope=scope,
    id=construct_id,
    **props,
    **kwargs,
)

Possible Solution

There seems to be a race condition in the custom resource settings that needs to be resolved.

Additional Information/Context

None

CDK CLI Version

2.2.200

Framework Version

2.89.0

Node.js Version

16

OS

Linux

Language

Python

Language Version

3.10.11

Other information

No response

@duffzm duffzm added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Aug 10, 2023
@github-actions github-actions bot added the @aws-cdk/aws-ecr Related to Amazon Elastic Container Registry label Aug 10, 2023
pahud (Contributor) commented Aug 11, 2023

I guess it's because the AutoDeleteImagesCustomResource provider is only allowed to perform ecr:BatchDeleteImage on name1. When the repository is updated like that, the repo name changes to name2, and the custom resource can't describe images on name1 because name1 no longer exists.

A quick hack is to allow the provider to describe all images, but maybe we have better solutions.
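
To make the scoping discussed in this comment concrete, here is a minimal sketch of the two policy shapes: one pinned to a single repository ARN and one using a wildcard. The helper names (`repo_arn`, `policy_for`) and the account and region values are hypothetical, and the action list contains only the two actions named in this thread; the policy that CDK actually generates may differ.

```python
import json

# Placeholder values mirroring the redacted ARNs in the error messages above.
REGION = "us-east-1"
ACCOUNT = "1234567890"

# Only the actions mentioned in this thread; the real provider policy may list more.
ACTIONS = ["ecr:BatchDeleteImage", "ecr:DescribeRepositories"]


def repo_arn(name: str) -> str:
    """Build an ECR repository ARN for the placeholder account and region."""
    return f"arn:aws:ecr:{REGION}:{ACCOUNT}:repository/{name}"


def policy_for(resource: str) -> dict:
    """Return an identity-based policy document allowing ACTIONS on one resource pattern."""
    return {
        "Version": "2012-10-17",
        "Statement": [{"Effect": "Allow", "Action": ACTIONS, "Resource": resource}],
    }


scoped = policy_for(repo_arn("name1"))  # pinned to one repository name
wildcard = policy_for(repo_arn("*"))    # the "quick hack": any repository in the account

print(json.dumps(scoped, indent=2))
print(json.dumps(wildcard, indent=2))
```

The scoped document stops covering a repository as soon as its name changes, while the wildcard one keeps covering it at the cost of broader permissions.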

@pahud pahud added p1 effort/medium Medium work item – several days of effort and removed needs-triage This issue or PR still needs to be triaged. labels Aug 11, 2023
@mergify mergify bot closed this as completed in #26742 Aug 17, 2023
mergify bot pushed a commit that referenced this issue Aug 17, 2023
This PR fixes the bug that ECRAutoDeleteImages fails on repo rename.

The customResource depends on the role, and when the repository name changes, the role is updated to match the new repository instead of the old one, before customResource runs and the old repository is deleted.

It was difficult to delete the old repo before the role update ran, so I changed the resource of the role to a wildcard.

Closes #26711.
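
The ordering problem described in this commit message can be illustrated with a small, self-contained model. This is a sketch under assumptions, not the CDK handler: `allows` is a simplified stand-in for IAM resource matching (using `fnmatchcase`), the ARN prefix is a placeholder, and the idea that the custom resource acts on the previous repository name during an update is inferred from the two error messages in the report.

```python
from fnmatch import fnmatchcase

# Placeholder account and region, matching the redacted values in the report.
ARN_PREFIX = "arn:aws:ecr:us-east-1:1234567890:repository/"


def allows(policy_resource: str, target_repo: str) -> bool:
    """Simplified IAM check: does the policy's Resource pattern cover the repo ARN?"""
    return fnmatchcase(ARN_PREFIX + target_repo, policy_resource)


def cleanup_allowed(role_scope: str, repo_to_clean: str) -> bool:
    """Model one custom resource run: the role is already re-scoped when cleanup starts."""
    return allows(ARN_PREFIX + role_scope, repo_to_clean)


# Update name1 -> name2: the role now points at name2, cleanup targets name1.
update_ok = cleanup_allowed(role_scope="name2", repo_to_clean="name1")
# Rollback name2 -> name1: the role points back at name1, cleanup targets name2.
rollback_ok = cleanup_allowed(role_scope="name1", repo_to_clean="name2")
# With a wildcard resource, both directions are covered.
wildcard_ok = cleanup_allowed("*", "name1") and cleanup_allowed("*", "name2")

print(update_ok, rollback_ok, wildcard_ok)
```

In this model the update and the rollback are both denied, each on the opposite repository name, which matches the pattern of the two errors in the report; the wildcard scope allows both.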

----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*