-
Notifications
You must be signed in to change notification settings - Fork 13
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ROX-17987: Remove unused Operator deployments #1118
Conversation
@@ -129,6 +130,24 @@ func (u *ACSOperatorManager) ListVersionsWithReplicas(ctx context.Context) (map[ | |||
return versionWithReplicas, nil | |||
} | |||
|
|||
// DeleteOperator removes specified operator deployment from the cluster | |||
func (u *ACSOperatorManager) DeleteOperator(ctx context.Context, version string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is this function executed and operator versions passed as an input parameter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will be executed here https://github.com/stackrox/acs-fleet-manager/pull/1091/files#diff-4a5f912092e037464fef8f19fd9a883a0587f4c5dedc2d15c40b5a4109551addR40-R41
#1091 depends on this method
@@ -129,6 +130,24 @@ func (u *ACSOperatorManager) ListVersionsWithReplicas(ctx context.Context) (map[ | |||
return versionWithReplicas, nil | |||
} | |||
|
|||
// DeleteOperator removes specified operator deployment from the cluster | |||
func (u *ACSOperatorManager) DeleteOperator(ctx context.Context, version string) error { | |||
depName := operatorDeploymentPrefix + "-" + version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've expected a function which will receive all expected operator versions, list all installed deployments and tries to delete all deployments which are not used anymore.
Similar to a garbage collector.
Additionally the computation of the name does not match the computation of the helm chart (i.e. the chart truncates the deployment name).
To keep the logic in one place please move the name computation to Go instead of the Helm template and reuse the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've update the method. It can handle multiple deletion and it expects a slice of images to delete. I think it's convenient to use mages because they are the real representation of the operator deployment and images are used to install multiple operator versions. So it's reasonable to send images to delete function as well.
Also deploymentName
chart val is added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good overall, only some nit-picks.
|
||
// delete multiple versions | ||
err = u.InstallOrUpgrade(ctx, operatorImages, crdTag1) | ||
require.NoError(t, err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you assert the operators were installed?
Imho it is not always necessary to execute InstallOrUpgrade
, you can also directly pass Deployment object to the fake client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like your suggestion about passing Deployment object to the fake client.
I also added small check that only deployment is delete (e.g. serviceAccount still persists)
for _, deployment := range deployments.Items { | ||
for _, container := range deployment.Spec.Template.Spec.Containers { | ||
if container.Name == "manager" && slices.Contains(images, container.Image) { | ||
deleteDeps = append(deleteDeps, deployment.Name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can assume that all deployments are operator deployments because you've selected them by using the label rhacs-operator
.
To check whether an operator should be garbage collected you can re-compute the name instead of using the images to track an unused deployment.
Alternateviley you could also maintain a data structure internally which maps deployment names to images before the Helm chart is applied, like a cache.
WDYTH about these approaches?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've change the method so now it cleans up unused operators. Although, I still stick to use images for checking what to delete because parsing deployment names introduces new potentials error to check.
Do you think it's better to call parseOperatorImages
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think it's better to call parseOperatorImages?
What do you mean? 🤔
Although, I still stick to use images for checking what to delete because parsing deployment names introduces new potentials error to check.
Sounds good!
Co-authored-by: Simon Bäumer <[email protected]>
@@ -115,6 +117,46 @@ func (u *ACSOperatorManager) ListVersionsWithReplicas(ctx context.Context) (map[ | |||
return versionWithReplicas, nil | |||
} | |||
|
|||
// RemoveUnusedOperators removes unused operator deployments from the cluster | |||
func (u *ACSOperatorManager) RemoveUnusedOperators(ctx context.Context, desiredImages []string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
curiosity: What was the reason for you to not go with GarbageCollectOperators
or similar?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't want to put Garbage next to the Operator.
Seriously though, garbage collect
sounds like it's about interacting with memory, having cycles/references/generations. But we just deleting deployment(s) and letting k8s do the rest for us
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: kurlov, SimonBaeumer The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Co-authored-by: Simon Bäumer <[email protected]>
New changes are detected. LGTM label has been removed. |
Description
Add method for removing unused operator deployments based on desired slice of operator images. So any deployed image which is not presented in the desired slice should be removed.
Checklist (Definition of Done)
- [ ] Added test description underTest manual
- [ ] Documentation added if necessary (i.e. changes to dev setup, test execution, ...)ROX-12345: ...
- [ ] Discussed security and business related topics privately. Will move any security and business related topics that arise to private communication channel.- [ ] Add secret to app-interface Vault or Secrets Manager if necessary- [ ] RDS changes were e2e tested manually- [ ] Check AWS limits are reasonable for changes provisioning new resourcesTest manual