[release-1.5] feat: external load balancer garbage collection (part 2) - new gc service #3632
Conversation
@richardcase: This issue is currently awaiting triage. If CAPA/CAPI contributors determine this is a relevant issue, they will accept it by applying the `triage/accepted` label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/lgtm
/test ?
@sedefsavas: The following commands are available to trigger required jobs:
The following commands are available to trigger optional jobs:
In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
We might need to backport a linter update: #3617
#3637 should fix it.
This introduces a new garbage collection service which will be used later by the controllers to clean up AWS resources for child/tenant clusters that were created via the CCM. Initially we support cleaning up load balancers (classic elb, nlb, alb) and security groups.

It works in 2 phases depending on whether a child cluster is being created/updated (i.e. **Reconcile**) or deleted (i.e. **ReconcileDelete**).

When a cluster is created the cluster controllers will call **Reconcile**. Its purpose is to determine if the cluster should be garbage collected and, if it should, to mark it accordingly. If the gc feature is enabled we will operate an opt-out model, so all clusters will be garbage collected unless they explicitly opt out. To opt out the infra cluster must have the `aws.cluster.x-k8s.io/external-resource-gc` annotation with a value of false. Otherwise a cluster is marked as requiring gc by adding the `awsexternalresourcegc.infrastructure.cluster.x-k8s.io` finalizer to the infra cluster.

When a cluster is deleted the cluster controllers will call **ReconcileDelete**. The first job is to identify the AWS resources that were created by the CCM in the child/tenant cluster. This is done by using the AWS resource tagging API and getting resources with the Kubernetes cluster owned label. The resources that are returned are then put into buckets for each AWS resource type (i.e. ec2, elasticloadbalancing) and these buckets of resource ARNs are passed to a function for that specific AWS service which will do the actual API calls to clean up the AWS resources. The reason we use buckets is that the order in which you delete resources can matter; for example, load balancers must be deleted before target groups (see the sketch after this commit message).

The service will be used by a later change to the controllers.

Signed-off-by: Richard Case <[email protected]>
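To illustrate the bucketing idea, here is a minimal sketch under assumed names (`cleanupFn`, `cleanupByService` are illustrative, not CAPA's exact code). ARNs are parsed with the AWS SDK for Go's `arn` package, bucketed by service, and each bucket is handed to its service's cleanup function in a fixed order:

```go
package gc

import (
	"fmt"

	"github.com/aws/aws-sdk-go/aws/arn"
)

// cleanupFn deletes all the resources belonging to one AWS service.
type cleanupFn func(resourceARNs []string) error

// cleanupByService buckets the tagged resource ARNs by AWS service, then
// runs each service's cleanup function in a fixed order, since deletion
// order matters (e.g. load balancers before target groups).
func cleanupByService(resourceARNs []string, fns map[string]cleanupFn, order []string) error {
	buckets := map[string][]string{}
	for _, raw := range resourceARNs {
		parsed, err := arn.Parse(raw)
		if err != nil {
			return fmt.Errorf("parsing ARN %q: %w", raw, err)
		}
		// parsed.Service is e.g. "ec2" or "elasticloadbalancing".
		buckets[parsed.Service] = append(buckets[parsed.Service], raw)
	}
	for _, service := range order {
		if fn, ok := fns[service]; ok {
			if err := fn(buckets[service]); err != nil {
				return err
			}
		}
	}
	return nil
}
```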
Changed the gc service based on the latest proposal update & review feedback. It now uses a model where "the user decides whether to enable the gc at any time before they delete the cluster." This translates into `ReconcileDelete` now:

- Checks to see if the cluster should be garbage collected by looking at the gc annotation, if it exists (see the sketch below).
- Does resource cleanup

Signed-off-by: Richard Case <[email protected]>
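A minimal sketch of the annotation check this model implies; the helper name is hypothetical and this is not CAPA's exact implementation. Per the rule spelled out in the PR description below, a cluster is collected when the annotation is absent or explicitly set to "true":

```go
package gc

// gcAnnotation is the opt-in/opt-out annotation on the infra cluster.
const gcAnnotation = "aws.cluster.x-k8s.io/external-resource-gc"

// shouldGarbageCollect returns true when the annotation is absent, or
// when it is present and set to "true".
func shouldGarbageCollect(annotations map[string]string) bool {
	val, found := annotations[gcAnnotation]
	return !found || val == "true"
}
```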
The cleanup functions now accept the full list of AWS resources and then filter which resources they want to delete themselves (a sketch of this pattern follows). Signed-off-by: Richard Case <[email protected]>
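A sketch of the filtering pattern, assuming a hypothetical `AWSResource` wrapper type; the delete call itself is elided since the real code uses the ELB/ELBv2 APIs:

```go
package gc

import (
	"strings"

	"github.com/aws/aws-sdk-go/aws/arn"
)

// AWSResource is a hypothetical wrapper around a parsed resource ARN.
type AWSResource struct {
	ARN arn.ARN
}

// deleteLoadBalancers receives the full resource list and filters for the
// resources it handles itself, per the pattern described above.
func deleteLoadBalancers(resources []AWSResource) error {
	for _, r := range resources {
		// Keep only ELB-service resources that are load balancers; target
		// groups are deleted by a separate, later cleanup function.
		if r.ARN.Service != "elasticloadbalancing" ||
			!strings.HasPrefix(r.ARN.Resource, "loadbalancer/") {
			continue
		}
		// ... call the ELB/ELBv2 delete APIs for r here ...
	}
	return nil
}
```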
Force-pushed from 74c8471 to a65ccb2
/test pull-cluster-api-provider-aws-e2e-blocking-release-1-5
/test pull-cluster-api-provider-aws-verify-release-1-5
@richardcase some issue with multitenancy test case 🤔?
/test pull-cluster-api-provider-aws-e2e-blocking-release-1-5
All passing now :) /approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: richardcase. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
/lgtm
This is a manual cherry-pick of #3610
What type of PR is this?
/kind feature
What this PR does / why we need it:
This introduces a new garbage collection service which will be used later by the controllers to clean up AWS resources for workload clusters that were created via the CCM. Initially, we support deleting load balancers (classic elb, nlb, alb), target groups and security groups.
The service is designed to be called by the infra cluster reconcilers when a cluster is deleted. The entry point into the service is `ReconcileDelete`.
When a cluster is deleted the cluster controllers will call `ReconcileDelete`. The first job is to determine if the workload cluster being deleted should be garbage collected. A cluster will be garbage collected if either of these is true:

- the `aws.cluster.x-k8s.io/external-resource-gc` annotation is absent
- the `aws.cluster.x-k8s.io/external-resource-gc` annotation exists and its value is set to `true`

If a cluster is to be garbage collected then we need to identify the AWS resources that were created by the CCM in the workload cluster. This is done by using the AWS resource tagging API and getting resources with the Kubernetes cluster owned label (i.e. `kubernetes.io/cluster/[CLUSTERNAME]`); a rough sketch of this lookup follows below. The resources that are returned are converted and then the list of them is passed to the defined cleanup functions in order. The order of the functions matters because some AWS resources cannot be deleted before others. For example, target groups can only be deleted after load balancers. Each cleanup function will then iterate over the resources, decide if it wants to handle a given resource, and delete it from AWS.

The service will be used by a later PR to the controllers and so is initially unused.
Which issue(s) this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close the issue(s) when PR gets merged):

Relates #1718
Special notes for your reviewer:
This is part 2 of 4 of the changes (i.e. a stack) to implement garbage collection. It is work to split up the original PR #3518
Checklist: