cdk deploy: Waiter times out on clusterautoscaler #856
Comments
Look at your stack in CloudFormation in the AWS web console and find anything that has failed. What reason does it give? I have not come across this exact issue, but from experience this feels like an IAM permission issue for your account.
Hey Asim, thanks for the insight. The AWS web interface had the following in CloudFormation for the error:
The user I'm using from the CLI is an admin user, which I believe is only prevented from seeing billing. The module does of course spin up many IAM roles of its own; are you thinking it might be one of those?
Hmm, it does not look like a perms issue then if you are using admin. The only thing I can think of that may help is to delete the stack completely and try again. This may involve deleting some resources manually. Otherwise, I am not sure what the issue could be. Sorry that I cannot be of more help.
@bconner22 I would recommend doing a full cleanup and running it again. I would assume this to be a temporary, one-time issue. Please keep us posted.
Crossposting as I believe these two issues are related:
@bconner22 as stated in #894, the concurrent executions service quota per account may be the issue. Another possible root cause is that the default quota of 1000 is exhausted in the account by other Lambda functions deployed there (this could be sporadic).
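To check whether the concurrency quota is a plausible cause, one can compare the account's limit with what is still unreserved. The sketch below is a minimal, hypothetical helper: the names `LambdaAccountLimit`, `assessConcurrency`, and the `minHeadroom` threshold are illustrative and not part of any AWS library. It assumes its input is the `AccountLimit` object from the output of `aws lambda get-account-settings`.

```typescript
// Fields read from the `AccountLimit` object in the output of
// `aws lambda get-account-settings`. Other fields are ignored here.
interface LambdaAccountLimit {
  ConcurrentExecutions: number;
  UnreservedConcurrentExecutions: number;
}

interface ConcurrencyAssessment {
  limit: number;
  unreserved: number;
  reserved: number;
  likelyConstrained: boolean;
}

// Hypothetical helper: flags an account whose concurrency quota is low or
// mostly reserved. `minHeadroom` is an illustrative threshold, not an AWS value.
function assessConcurrency(
  accountLimit: LambdaAccountLimit,
  minHeadroom = 100,
): ConcurrencyAssessment {
  const limit = accountLimit.ConcurrentExecutions;
  const unreserved = accountLimit.UnreservedConcurrentExecutions;
  return {
    limit,
    unreserved,
    reserved: limit - unreserved,
    likelyConstrained: unreserved < minHeadroom,
  };
}

// Example input (made-up numbers): an account with a reduced quota of 10.
const assessment = assessConcurrency({
  ConcurrentExecutions: 10,
  UnreservedConcurrentExecutions: 10,
});
console.log(assessment);
```

Note that this only inspects the quota itself. It does not show how many executions are in flight at a given moment, so it cannot confirm the sporadic exhaustion scenario on its own.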
Describe the bug
Following this link.
I did this yesterday afternoon and again this morning; the stack failed the same way both times.
From the CloudFormation console:
2023-10-10 09:36:50 UTC-0500 eksblueprintblueprintsaddonclusterautoscalersamanifestblueprintsaddonclusterautoscalersaServiceAccountResource72D82586
CREATE_FAILED
Received response status [FAILED] from custom resource. Message returned: TimeoutError: {"state":"TIMEOUT","reason":"Waiter has timed out"} at checkExceptions (/var/runtime/node_modules/@aws-sdk/util-waiter/dist-cjs/waiter.js:26:30) at waitUntilFunctionActiveV2 (/var/runtime/node_modules/@aws-sdk/client-lambda/dist-cjs/waiters/waitForFunctionActiveV2.js:52:46) at process.processTicksAndRejections (node:internal/process/task_queues:95:5) at async defaultInvokeFunction (/var/task/outbound.js:1:875) at async invokeUserFunction (/var/task/framework.js:1:2192) at async onEvent (/var/task/framework.js:1:369) at async Runtime.handler (/var/task/cfn-response.js:1:1573)
From my cli:
Do you wish to deploy these changes (y/n)? y
eks-blueprint: deploying... [1/1]
eks-blueprint: creating CloudFormation changeset...
[█████████████████████████████████▎························] (46/80)
9:36:50 AM | CREATE_FAILED | Custom::AWSCDK-EKS-KubernetesResource | eks-blueprint/blue...e/Resource/Default
Received response status [FAILED] from custom resource. Message returned: TimeoutError: {"state":"TIMEOUT","reason":
9:36:50 AM | CREATE_FAILED | Custom::AWSCDK-EKS-KubernetesResource | eksblueprintbluepr...ntResource72D82586
Received response status [FAILED] from custom resource. Message returned: TimeoutError: {"state":"TIMEOUT","reason":"Waiter has timed out"}
at checkExceptions (/var/runtime/node_modules/@aws-sdk/util-waiter/dist-cjs/waiter.js:26:30)
at waitUntilFunctionActiveV2 (/var/runtime/node_modules/@aws-sdk/client-lambda/dist-cjs/waiters/waitForFunctionActiveV2.js:52:46)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async defaultInvokeFunction (/var/task/outbound.js:1:875)
at async invokeUserFunction (/var/task/framework.js:1:2192)
at async onEvent (/var/task/framework.js:1:369)
at async Runtime.handler (/var/task/cfn-response.js:1:1573) (RequestId: 9c36b5b4-88cb-45af-b4cb-1f1056a35886)
❌ eks-blueprint failed: Error: The stack named eks-blueprint failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE: Received response status [FAILED] from custom resource. Message returned: TimeoutError: {"state":"TIMEOUT","reason":"Waiter has timed out"}
at checkExceptions (/var/runtime/node_modules/@aws-sdk/util-waiter/dist-cjs/waiter.js:26:30)
at waitUntilFunctionActiveV2 (/var/runtime/node_modules/@aws-sdk/client-lambda/dist-cjs/waiters/waitForFunctionActiveV2.js:52:46)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async defaultInvokeFunction (/var/task/outbound.js:1:875)
at async invokeUserFunction (/var/task/framework.js:1:2192)
at async onEvent (/var/task/framework.js:1:369)
at async Runtime.handler (/var/task/cfn-response.js:1:1573) (RequestId: 9c36b5b4-88cb-45af-b4cb-1f1056a35886)
at FullCloudFormationDeployment.monitorDeployment (/usr/local/lib/node_modules/aws-cdk/lib/index.js:467:10232)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Object.deployStack2 [as deployStack] (/usr/local/lib/node_modules/aws-cdk/lib/index.js:470:179911)
at async /usr/local/lib/node_modules/aws-cdk/lib/index.js:470:163159
❌ Deployment failed: Error: The stack named eks-blueprint failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE: Received response status [FAILED] from custom resource. Message returned: TimeoutError: {"state":"TIMEOUT","reason":"Waiter has timed out"}
at checkExceptions (/var/runtime/node_modules/@aws-sdk/util-waiter/dist-cjs/waiter.js:26:30)
at waitUntilFunctionActiveV2 (/var/runtime/node_modules/@aws-sdk/client-lambda/dist-cjs/waiters/waitForFunctionActiveV2.js:52:46)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async defaultInvokeFunction (/var/task/outbound.js:1:875)
at async invokeUserFunction (/var/task/framework.js:1:2192)
at async onEvent (/var/task/framework.js:1:369)
at async Runtime.handler (/var/task/cfn-response.js:1:1573) (RequestId: 9c36b5b4-88cb-45af-b4cb-1f1056a35886)
at FullCloudFormationDeployment.monitorDeployment (/usr/local/lib/node_modules/aws-cdk/lib/index.js:467:10232)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async Object.deployStack2 [as deployStack] (/usr/local/lib/node_modules/aws-cdk/lib/index.js:470:179911)
at async /usr/local/lib/node_modules/aws-cdk/lib/index.js:470:163159
The stack named eks-blueprint failed creation, it may need to be manually deleted from the AWS console: ROLLBACK_COMPLETE: Received response status [FAILED] from custom resource. Message returned: TimeoutError: {"state":"TIMEOUT","reason":"Waiter has timed out"}
at checkExceptions (/var/runtime/node_modules/@aws-sdk/util-waiter/dist-cjs/waiter.js:26:30)
at waitUntilFunctionActiveV2 (/var/runtime/node_modules/@aws-sdk/client-lambda/dist-cjs/waiters/waitForFunctionActiveV2.js:52:46)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async defaultInvokeFunction (/var/task/outbound.js:1:875)
at async invokeUserFunction (/var/task/framework.js:1:2192)
at async onEvent (/var/task/framework.js:1:369)
at async Runtime.handler (/var/task/cfn-response.js:1:1573) (RequestId: 9c36b5b4-88cb-45af-b4cb-1f1056a35886)
Expected Behavior
The cluster and add-ons deploy successfully.
Current Behavior
The deployment fails with the errors shown above.
Reproduction Steps
Follow https://aws-quickstart.github.io/cdk-eks-blueprints/getting-started/
Possible Solution
Does the waiter need to wait for longer?
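For context on this question: the `{"state":"TIMEOUT","reason":"Waiter has timed out"}` payload is what a polling waiter reports when its maximum wait time elapses before the target state is observed. The following is a minimal sketch of that pattern, not the AWS SDK's implementation; `waitUntil`, `WaiterResult`, and the parameter names are hypothetical.

```typescript
type WaiterResult =
  | { state: "SUCCESS" }
  | { state: "TIMEOUT"; reason: string };

// Polls `check` every `delayMs` milliseconds until it resolves to true
// or until another delay would run past the `maxWaitMs` deadline.
async function waitUntil(
  check: () => Promise<boolean>,
  maxWaitMs: number,
  delayMs: number,
): Promise<WaiterResult> {
  const deadline = Date.now() + maxWaitMs;
  while (true) {
    if (await check()) {
      return { state: "SUCCESS" };
    }
    // Give up if the next poll would land after the deadline.
    if (Date.now() + delayMs > deadline) {
      return { state: "TIMEOUT", reason: "Waiter has timed out" };
    }
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
}
```

In a pattern like this, a larger `maxWaitMs` only helps when the awaited condition eventually becomes true. If the function never reaches the active state, a longer wait just delays the same failure.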
Additional Information/Context
I'm in an AWS Organizations management account, using an IAM user, but otherwise the account is empty. The Lambdas did appear to deploy correctly, and both they and the EKS cluster were in us-east-1. I also ran
cdk bootstrap aws://<MY_ACCOUNT_NUMBER>/us-east-1
as I saw someone ask to confirm that on a similar issue.
CDK CLI Version
2.99.1 (build b2a895e)
EKS Blueprints Version
1.12.0
Node.js Version
v20.8.0
Environment details (OS name and version, etc.)
macOS on an Intel chip
Other information
No response