Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
pkg/destroy/aws: Use the resource-groups service for tag->ARN lookup
I tried resourcegroups as well, but it doesn't support IAM roles or hosted zones: FATAL Failed to destroy cluster: {"ResourceTypeFilters":["AWS::EC2::Instance","AWS::EC2::InternetGateway","AWS::EC2::NatGateway","AWS::EC2::RouteTable","AWS::EC2::SecurityGroup","AWS::EC2::Subnet","AWS::EC2::VPC","AWS::EC2::VPNGateway","AWS::Route53::HostedZone","AWS::S3::Bucket"],"TagFilters":[{"Key":"openshiftClusterID","Values":["609cbd6b-af81-4278-9285-30314d973020"]}]}: BadRequestException: Query not valid: One or more resource types are not valid: AWS::Route53::HostedZone status code: 400, request id: a1388303-14a7-11e9-b813-d172f51a0eff So with this commit I've gone with resourcegroupstaggingapi (which doesn't support IAM roles either, but at least supports hosted zones). Ingress has been doing something similar since cluster-ingress-operator@8fdcf1a0 (Optimize private zone discovery, 2019-12-20, cluster-ingress-operator#93), so we no longer need to worry about having insufficient permissions for this action. The main benefit of this change is that it efficiently finds resources we want to remove, while previously there was a fair bit of "iterate over everything to find what you're looking for". However, most of our reported throttling issues are with Route 53, and this commit doesn't completely solve that. With this commit, looking up the tagged private zone is efficient, but looking up the associated public zone requires iteration. With this commit, we limit that iteration to zones with our expected name, but accounts where many users are using the same base domain may have quite a few of those. The us-east-1 business is because Route 53 zone lookups by tag seem to require that region. For example, with a us-west-1 cluster: $ aws --region us-west-1 resourcegroupstaggingapi get-resources --query "ResourceTagMappingList[?Tags[? Key == 'openshiftClusterID' && Value == '7302acb5-5935-4121-834a-c24f42b00acb']].ResourceARN" --output json | jq -r '.[]' | grep vpc arn:aws:ec2:us-west-1:...:vpc/vpc-0d601149e4d8c7bb6 the associated zone does not appear with a us-west-1 tag search: $ aws --region us-west-1 resourcegroupstaggingapi get-resources --query "ResourceTagMappingList[?Tags[? Key == 'openshiftClusterID' && Value == '7302acb5-5935-4121-834a-c24f42b00acb']].ResourceARN" --output json | jq -r '.[]' | grep route53 but it does appear with a us-east-1 tag search: $ aws --region us-east-1 resourcegroupstaggingapi get-resources --query "ResourceTagMappingList[?Tags[? Key == 'openshiftClusterID' && Value == '7302acb5-5935-4121-834a-c24f42b00acb']].ResourceARN" --output json | jq -r '.[]' | grep route53 arn:aws:route53:::hostedzone/Z286EJRZI9ESRI This is potentially backed up by [2], although clearer AWS docs would be nice ;). The nextTagClients business allows us to filter the tagClients slice without allocation, and is based on [3]. I've dropped the code dissociating network interfaces, originally from openshift/hive@ca98a4fd (add initial hiveutil, 2018-09-20, openshift/hive#8), because it didn't seem necessary. These network interfaces are associated with load balancers, instances, and other things we have code to directly delete. And the network interfaces don't block deleting those resources. The fact that this code successfully removes EIPs in continuous-integration testing without blocking on them being returned by the tag search shows that the previous code wasn't required. [1]: https://godoc.org/github.com/aws/aws-sdk-go/service/resourcegroups [2]: https://docs.aws.amazon.com/general/latest/gr/rande.html#r53_region Requests for hosted zones, resource record sets, health checks, and cost allocation tags use the following endpoint: Region Name: US East (N. Virginia) Region: us-east-1 Endpoint: route53.amazonaws.com Protocol: HTTPS [3]: https://github.com/golang/go/wiki/SliceTricks#filtering-without-allocating
- Loading branch information