When using terraform to apply cluster AWS resources changes, you must update the ASGs to use launch templates #10017

Closed
mmerrill3 opened this issue Oct 6, 2020 · 11 comments · Fixed by #10423
Labels
blocks-next kind/bug Categorizes issue or PR as related to a bug.

Comments

@mmerrill3

1. What kops version are you running? The command kops version will display
this information.

kops 1.19-alpha-4
2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

k8s 1.19.2
3. What cloud provider are you using?
aws
4. What commands did you run? What is the simplest way to reproduce this issue?
Applied the new terraform configuration generated by kops update cluster --target=terraform, using terraform apply.
5. What happened after the commands executed?
The existing launch configurations are still attached to the ASGs, so terraform fails to delete them. You have to update the ASGs to use the launch templates through the UI or the AWS CLI, then run terraform apply again.
6. What did you expect to happen?
terraform would not fail on applying the new cluster terraform file
7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

9. Anything else we need to know?
This is really informational only, not a bug, but I'd like it to be known so that others who use terraform know they have to handle the switch from launch configurations to launch templates outside of terraform and kops, since terraform will fail to delete the old launch configurations because they are still assigned to ASGs.
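
For readers who need the manual step described above, here is a minimal sketch of switching one ASG to a launch template with the AWS CLI. It assumes the launch template already exists (for example, created by the terraform apply that failed on the launch configuration deletion). The function name, the AWS_CLI override variable, and the ASG and template names are hypothetical; substitute the names from your own cluster.

```shell
# Sketch only: point one ASG at an existing launch template.
# AWS_CLI is a hypothetical override so the command can be previewed
# (for example AWS_CLI=echo) without calling AWS.
switch_asg_to_launch_template() {
  asg_name="$1"
  lt_name="$2"
  ${AWS_CLI:-aws} autoscaling update-auto-scaling-group \
    --auto-scaling-group-name "${asg_name}" \
    --launch-template "LaunchTemplateName=${lt_name},Version=\$Latest"
}

# Example (placeholder names):
#   switch_asg_to_launch_template nodes.my.example.com nodes.my.example.com
```

Running it with AWS_CLI=echo first prints the exact command, which makes it easier to review before touching a live ASG.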

@avdhoot
Contributor

avdhoot commented Oct 12, 2020

We faced a similar issue when updating a cluster to k8s v1.19.2 using kops v1.19.0-alpha.4:

W1011 14:32:15.309695   49861 executor.go:131] error running task "AutoscalingGroup/nodes-spot-r4-xlarge-us-west-2c.us-sandbox.foo.bar" (2m45s remaining to succeed): error updating AutoscalingGroup: InvalidQueryParameter: Incompatible launch template: You cannot use a launch template that is set to request Spot Instances (InstanceMarketOptions) when you configure an Auto Scaling group with a mixed instances policy. Add a different launch template to the group and try again.
        status code: 400, request id: 1e45a923-67e1-4b7a-af3a-8d631a7aa9b6
W1011 14:32:15.309727   49861 executor.go:131] error running task "AutoscalingGroup/master-us-west-2c.masters.us-sandbox.foo.bar" (2m45s remaining to succeed): error updating AutoscalingGroup: AccessDenied: You are not authorized to use launch template: master-us-west-2c.masters.us-sandbox.foo.bar-20201011142459
        status code: 403, request id: 06fc1fe6-0ebf-4d8d-9240-993e0ce642e9
W1011 14:32:15.309746   49861 executor.go:131] error running task "AutoscalingGroup/master-us-west-2b.masters.us-sandbox.foo.bar" (2m45s remaining to succeed): error updating AutoscalingGroup: AccessDenied: You are not authorized to use launch template: master-us-west-2b.masters.us-sandbox.foo.bar-20201011142459
        status code: 403, request id: 365f7ade-af2a-4f05-ab9c-f65dc8072856
W1011 14:32:15.309760   49861 executor.go:131] error running task "AutoscalingGroup/nodes.us-sandbox.foo.bar" (2m45s remaining to succeed): error updating AutoscalingGroup: AccessDenied: You are not authorized to use launch template: nodes.us-sandbox.foo.bar-20201011142459
        status code: 403, request id: 8609a55d-6db0-4d3b-bc02-9106439be35e
W1011 14:32:15.309776   49861 executor.go:131] error running task "AutoscalingGroup/nodes-spot-r4-xlarge-us-west-2b.us-sandbox.foo.bar" (2m45s remaining to succeed): error updating AutoscalingGroup: InvalidQueryParameter: Incompatible launch template: You cannot use a launch template that is set to request Spot Instances (InstanceMarketOptions) when you configure an Auto Scaling group with a mixed instances policy. Add a different launch template to the group and try again.
        status code: 400, request id: 6b839058-5462-4da3-903d-4c42c1be6443
W1011 14:32:15.309792   49861 executor.go:131] error running task "AutoscalingGroup/nodes-spot-r4-xlarge-us-west-2a.us-sandbox.foo.bar" (2m45s remaining to succeed): error updating AutoscalingGroup: InvalidQueryParameter: Incompatible launch template: You cannot use a launch template that is set to request Spot Instances (InstanceMarketOptions) when you configure an Auto Scaling group with a mixed instances policy. Add a different launch template to the group and try again.
        status code: 400, request id: 7ffc7938-e187-4af4-bc8d-023c78e463a9
W1011 14:32:15.309807   49861 executor.go:131] error running task "AutoscalingGroup/master-us-west-2a.masters.us-sandbox.foo.bar" (2m45s remaining to succeed): error updating AutoscalingGroup: AccessDenied: You are not authorized to use launch template: master-us-west-2a.masters.us-sandbox.foo.bar-20201011142500
        status code: 403, request id: 891e2bee-9d52-4770-a8d4-05857e04f749

@rifelpet
Member

So we discussed this a bit and came up with two ideas, perhaps you and others could provide input on them?

The problem is that Kops' migration from LaunchConfigurations to LaunchTemplates causes the generated terraform file to no longer include the LC. Terraform then tries to delete the LC even though it is still in use by the autoscaling groups until their instances have all been replaced with the new LT version.

  • Have Kops generate both LC and LT resources for some period of time, perhaps 3 minor releases or until Kops removes support for LCs. The LC would only be used during the first upgrade that migrates the ASGs from LC to LT, and then would be effectively unused.
  • Add to the Kops 1.19 release notes a required action to run terraform state rm on the most recent LC resource prior to running terraform apply. This will orphan the LC so that terraform doesn't delete it. The user would need to manually clean up the LC after rolling the cluster if desired.
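
To see what the second idea would touch before removing anything, one possible sketch is a filter over the output of terraform state list. The function name is hypothetical, and the pattern assumes the resources are of type aws_launch_configuration, either at the root or nested in a module.

```shell
# Sketch only: keep just the aws_launch_configuration addresses from
# "terraform state list" output supplied on stdin. The pattern also
# matches resources nested in modules
# (module.x.aws_launch_configuration.y).
filter_launch_configurations() {
  grep -E '(^|\.)aws_launch_configuration\.' || true
}

# Preview what would be orphaned, without changing any state:
#   terraform state list | filter_launch_configurations
```

Anchoring the pattern on the resource type avoids matching other resources that merely contain the string in their name.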

@mmerrill3
Author

@rifelpet I'm going to try out your second suggestion today. Whatever script I come up with to remove all of the LC resources prior to running terraform apply, I'll post it here.

@mmerrill3
Author

@rifelpet thanks for your suggestions. I tried the following terraform state update, and it allowed terraform apply to go through without any deletion errors.

terraform state list | grep aws_launch_configuration | while read -r line; do terraform state rm "$line"; done

I'll post the command I'll use with the aws cli to remove the old LCs.

@avdhoot
Contributor

avdhoot commented Oct 20, 2020

@rifelpet 1st option is more transparent for users. 👍

@seh
Contributor

seh commented Oct 20, 2020

Per @rifelpet's ideas above (in #10017 (comment)), when I ran into this problem, I used that second procedure: delete the Launch Configurations from Terraform's state using terraform state rm, orphaning them. Terraform was not smart enough to first create the Launch Templates, bind the templates to the ASG, and only then destroy the abandoned Launch Configurations. It always wanted to destroy the Launch Configurations first, before adjusting the ASGs that had been using them.

@dvdmuckle

dvdmuckle commented Oct 21, 2020

I ran into the same issue regarding using LTs with mixed instance policies as @avdhoot, though I think it's a different issue from the one at hand here. This was with a cluster created and then updated with Version 1.19.0-alpha.5 (git-ea96bbd768de53eca962d0bb1c17883e18cdadd1) and K8s 1.19.3.

@seh
Contributor

seh commented Oct 28, 2020

Here's my latest code handling this problem as part of a larger upgrade procedure.

# kops removes the Terraform configuration for the existing
# autoscaling launch configurations, so Terraform tries to delete
# them, only too early, before it reconfigures the ASGs using them to
# use launch templates instead. Preclude Terraform from making that
# mistake by removing the launch configuration resources from its
# state file, and then deleting these orphaned launch configurations
# after Terraform is done with the ASGs.
launch_configurations_to_delete=()
while read -r tf_address lc_name; do
  # Even when there's no output from jq, the "read" command still
  # enters the loop once with an empty line.
  if [ -z "${lc_name}" ]; then
    break
  fi
  launch_configurations_to_delete+=("${lc_name}")
  terraform state rm "${tf_address}"
done  <<< \
      "$(terraform show -json |
         jq --raw-output '.values.root_module.resources[]
                          | select(.type == "aws_launch_configuration")
                          | [.address, .values.name]
                          | @tsv')" \
  || : # NB: The "read" command fails upon reaching EOF.

terraform apply -auto-approve

for lc_name in "${launch_configurations_to_delete[@]}"; do
  AWS_PAGER='' aws autoscaling delete-launch-configuration --launch-configuration-name "${lc_name}"
done
rm data/aws_launch_configuration*user_data
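
Before the final deletion loop in the script above, it may be worth confirming that no ASG still references a launch configuration. The following is a sketch under assumptions, not part of the original procedure: the function name is hypothetical, and it assumes the AWS CLI's text output renders a missing LaunchConfigurationName as "None".

```shell
# Sketch only: read "ASG_NAME<TAB>LC_NAME" rows on stdin and print the
# rows whose ASG still has a launch configuration. Assumes the AWS CLI's
# text output renders a missing value as "None".
asgs_still_using_launch_configuration() {
  awk -F '\t' '$2 != "" && $2 != "None"'
}

# Possible use before deleting the orphaned launch configurations:
#   aws autoscaling describe-auto-scaling-groups \
#     --query 'AutoScalingGroups[].[AutoScalingGroupName,LaunchConfigurationName]' \
#     --output text | asgs_still_using_launch_configuration
```

Empty output would suggest that every ASG has already moved to a launch template.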

@rifelpet rifelpet added blocks-next kind/bug Categorizes issue or PR as related to a bug. labels Oct 31, 2020
@hakman
Member

hakman commented Nov 12, 2020

Would you mind testing again with the 1.19.0-beta.2 release?

@avdhoot
Contributor

avdhoot commented Nov 22, 2020

@hakman still same issue with 1.19.0-beta.2

Update: Please ignore the above comment. I confused this with another issue.

@hakman
Member

hakman commented Nov 23, 2020

@avdhoot what terraform version are you using?
