Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(cli): add kit prune and kit squash commands to manage IntegrationKits #4552

Closed
wants to merge 4 commits into from

Conversation

johnpoth
Copy link
Member

Hi

This builds on #3979 and splits the gc command into kit prune and kit squash as suggested. I've also added some tests (needs container-tools/kind-action#14 to be released)

Thanks !

Release Note

NONE

@github-actions
Copy link
Contributor

🐫 Thank you for contributing! 🐫

Code Coverage Report ✔️

  • Coverage unchanged.

Copy link
Contributor

@squakez squakez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work done, but IMO there are a few points we should discuss:

  1. We decided in the roadmap development to limit the addition of feature in the CLI, trying to move as much as we can into Camel JBang. In the future we may remove at all the kamel cli. Adding more code that will be dismissed soon is not a good idea.
  2. We are adding a lot of direct logic from the CLI to the registry directly. This is highly unsecure. We are already discussing the opportunity to change the way we access the registry from the operator via Spectrum (we're even discussing to deprecate it soon). In this PR we're having a direct access with go-containerregistry directly from the user CLI, neither from the operator. This is potentially dangerous.
  3. I think that depending on the way the user is modelling its deployment, this could lead to inconsistencies. The kamel promote and several suggestions we've provided recently around GitOps are telling the user to share registry among namespaces and even among separate clusters. We cannot assume a Kit (and the container) is only used by a single namespace or even cluster.
  4. Although the idea of squashing or deleting could be good from a user perspective is quite a limitation for the operator. The possibility to reuse middle containers avoid the need to recreate from scratch. The more Kits an operator has available, the less work to recreate one from scratch will require.

I really appreciate the work done, but unfortunately it does not match the direction and the design choices we're trying to follow lately.

@johnpoth
Copy link
Member Author

Thanks for the feedback! Here are some comments inline

Thanks for the work done, but IMO there are a few points we should discuss:

  1. We decided in the roadmap development to limit the addition of feature in the CLI, trying to move as much as we can into Camel JBang. In the future we may remove at all the kamel cli. Adding more code that will be dismissed soon is not a good idea.

So the idea here is to move this logic to the Operator in the future. For now, having this in the CLI gives us more control until we reach a state where this can be automated IMO.

  1. We are adding a lot of direct logic from the CLI to the registry directly. This is highly unsecure. We are already discussing the opportunity to change the way we access the registry from the operator via Spectrum (we're even discussing to deprecate it soon). In this PR we're having a direct access with go-containerregistry directly from the user CLI, neither from the operator. This is potentially dangerous.

Why is this highly unsecure/potentially dangerous? As @lburgazzoli pointed out, a lot of projects use the Image Registry for storage. It is an exposed service after all. I am probably missing something though...

  1. I think that depending on the way the user is modelling its deployment, this could lead to inconsistencies. The kamel promote and several suggestions we've provided recently around GitOps are telling the user to share registry among namespaces and even among separate clusters. We cannot assume a Kit (and the container) is only used by a single namespace or even cluster.

For sure! There is a TODO to add the Namespaces to search for. I guess a list of Clusters could be added to the list...

  1. Although the idea of squashing or deleting could be good from a user perspective it is quite a limitation for the operator. The possibility to reuse middle containers avoid the need to recreate from scratch. The more Kits an operator has available, the less work to recreate one from scratch will require.

Agreed. That being said there has to be some strategy to limit the proliferation of Integration Kits as mentioned in other stories (#254 and #2736) .... or is it no longer a bug, it's a feature?

I really appreciate the work done, but unfortunately it does not match the direction and the design choices we're trying to follow lately.

All in all I agree it's important we are on the same page before solving the problem,

Thanks !

@claudio4j
Copy link
Contributor

The possibility to clean old unused images and kits is important and would add value to Camel K.
I think there is a long way to dismiss kamel cli, and we can label this feature as experimental, for the sole purpose to gather feedback.
About the squash operation I am worried that once it's called, there may be downtime to the integration pod. I will do some testing to see how it works.

@squakez
Copy link
Contributor

squakez commented Jul 11, 2023

Thanks @johnpoth for understanding. Let me try to add some more information which I hope are good to put everybody on the same page. I agree we need to find some good solution to the problem of compacting the proliferation of containers, but it needs to be aligned with the design and the best practices we want to put in place.

Let's start from the design. My concern is that we're allowing a direct access from the CLI to the Container Registry which may lead to inconsistencies. As an analogy, it would be like having a database application and let the UI directly alter the data bypassing the controller logic. If we want to add some API to manipulate the registry, this has to be done on top of the operator which is our controller. Let's imagine the situation where you're deleting an IntegrationKit. This is okey, and while deleting the related container image, this is failing because some network problem. We'd leave this in inconsistent state.

The other important point is the fact that you really don't know who is using your container image. You're making the assumption that it's just linked to an Integration. However the reality is that the image could be used in any other cluster. I cannot see how really consistent could be the operation against a shared container image. We need to think this in a company which may have separate departments that may not necessarily know what each others are doing.

In my opinion, a possible strategy for this feature could be to limit the discover of IntegrationKits and Builds which are no longer used and perform a prune of them (only the Kubernetes objects, not the images) warning the user that this could lead to inconsistencies anyway. Ideally we can output a list of images which are bound to those Kits as well, to be submitted for a registry cleaning. The cleaning of a container registry should be performed according to the policies of the companies and likely the departments in charge with their tools and scripts. Even if we provide this feature as in the PR I hardly doubt that any company would let the user perform operation in a production container registry so lightly.

Finally a remark about the future of the CLI. It's true we are not dismissing it in the short term, but, the more work we add on the CLI now, the longer this term is getting. If we define and agree a roadmap is in order to provide a list of priorities on which we should focus. Planning and discussing take time, so, once we have some agreement for the year, we should follow it as much as we can. At this stage we have a lot of other priorities that would deserve developers attention IMO.

Hope it clarifies.

@johnpoth
Copy link
Member Author

The possibility to clean old unused images and kits is important and would add value to Camel K. I think there is a long way to dismiss kamel cli, and we can label this feature as experimental, for the sole purpose to gather feedback. About the squash operation I am worried that once it's called, there may be downtime to the integration pod. I will do some testing to see how it works.

Thanks for the feedback!

The Integration Pod will be redeployed it it's image is deemed to be squashed. Here's the output from a squash run from one of the IT tests:

kamel kit squash 

The following Integration Kits will be squashed:
k in namespace: camel-k, f in namespace: camel-k, b in namespace: camel-k, will all be squashed into Integration Kit: k in namespace: camel-k

The following Integrations will updated with a new squashed Image and redeployed:
k in namespace: camel-k

The following Integration Kits will be deleted:
e in namespace: camel-k
f in namespace: camel-k
b in namespace: camel-k
h in namespace: camel-k
i in namespace: camel-k

The following Images will be deleted from the Image Registry:
k3d-registry:39472/camel-k/camel-k-kit-cimi34mdvpte2dpsk8tg@sha256:e733ef1dae77e09406dee5aaad6c75a28487907470aae2f0d800ba6f8f1461a8
k3d-registry:39472/camel-k/camel-k-kit-cimi34mdvpte2dpsk8tg@sha256:65883ff8118672bf57cff2e4cb021441e1b851d29036ecf4ea04629a895978cb
k3d-registry:39472/camel-k/camel-k-kit-cimi34mdvpte2dpsk8tg@sha256:af7495bf814c49f88fbd3dd8a67a68b8214f3a664f68dcd4181b07e6241bc031
k3d-registry:39472/camel-k/camel-k-kit-cimi34mdvpte2dpsk8tg@sha256:8d9fb10e38ed2bd4abf385ba71112a6875658d4323fcfb10c0bf672461b6a0f8
k3d-registry:39472/camel-k/camel-k-kit-cimi34mdvpte2dpsk8tg@sha256:5b0511adc66eee9927bffc1b020f6f2fff12878f9d6dcb43224b813311289284

Continue Y/N ?

Or a prune run:

kamel kit prune 

The following Integration Kits will be deleted:
kit-cimi34mdvpte2dpsk8tg in namespace: camel-k
b in namespace: camel-k
e in namespace: camel-k
f in namespace: camel-k
d in namespace: camel-k
h in namespace: camel-k
i in namespace: camel-k

The following Images will no longer be used by camel-k and can be deleted from the Image Registry:
k3d-registry:39472/camel-k/camel-k-kit-cimi34mdvpte2dpsk8tg@sha256:938a4ba8d320d381950de54a47d8529cc83d3936fd6b1181ec341a49756d304d
k3d-registry:39472/camel-k/camel-k-kit-cimi34mdvpte2dpsk8tg@sha256:af7495bf814c49f88fbd3dd8a67a68b8214f3a664f68dcd4181b07e6241bc031
k3d-registry:39472/camel-k/camel-k-kit-cimi34mdvpte2dpsk8tg@sha256:e733ef1dae77e09406dee5aaad6c75a28487907470aae2f0d800ba6f8f1461a8
k3d-registry:39472/camel-k/camel-k-kit-cimi34mdvpte2dpsk8tg@sha256:65883ff8118672bf57cff2e4cb021441e1b851d29036ecf4ea04629a895978cb
k3d-registry:39472/camel-k/camel-k-kit-cimi34mdvpte2dpsk8tg@sha256:9631f0cf35ceec77a7c21c36e7c0598d8aa17c4ca7386f7489d974cda5eb7889
k3d-registry:39472/camel-k/camel-k-kit-cimi34mdvpte2dpsk8tg@sha256:8d9fb10e38ed2bd4abf385ba71112a6875658d4323fcfb10c0bf672461b6a0f8
k3d-registry:39472/camel-k/camel-k-kit-cimi34mdvpte2dpsk8tg@sha256:5b0511adc66eee9927bffc1b020f6f2fff12878f9d6dcb43224b813311289284

Continue Y/N ?

In both cases, the user can then choose to continue or not.

This would eventually be moved to the Operator once we gather more feedback. The user would still be able to trigger a run by creating a CR which the Operator would watch...

Thanks !

@johnpoth
Copy link
Member Author

Thanks @johnpoth for understanding. Let me try to add some more information which I hope are good to put everybody on the same page. I agree we need to find some good solution to the problem of compacting the proliferation of containers, but it needs to be aligned with the design and the best practices we want to put in place.

Let's start from the design. My concern is that we're allowing a direct access from the CLI to the Container Registry which may lead to inconsistencies. As an analogy, it would be like having a database application and let the UI directly alter the data bypassing the controller logic. If we want to add some API to manipulate the registry, this has to be done on top of the operator which is our controller. Let's imagine the situation where you're deleting an IntegrationKit. This is okey, and while deleting the related container image, this is failing because some network problem. We'd leave this in inconsistent state.

So the long term goal would be to move this to the Operator

The other important point is the fact that you really don't know who is using your container image. You're making the assumption that it's just linked to an Integration. However the reality is that the image could be used in any other cluster. I cannot see how really consistent could be the operation against a shared container image. We need to think this in a company which may have separate departments that may not necessarily know what each others are doing.

This would be an admin task for sure

In my opinion, a possible strategy for this feature could be to limit the discovery of IntegrationKits and Builds which are no longer used and perform a prune of them (only the Kubernetes objects, not the images) warning the user that this could lead to inconsistencies anyway. Ideally we can output a list of images which are bound to those Kits as well, to be submitted for a registry cleaning. The cleaning of a container registry should be performed according to the policies of the companies and likely the departments in charge with their tools and scripts. Even if we provide this feature as in the PR I hardly doubt that any company would let the user perform operation in a production container registry so lightly.

Yeah most registries I've seen have the delete operation disabled

Finally a remark about the future of the CLI. It's true we are not dismissing it in the short term, but, the more work we add on the CLI now, the longer this term is getting. If we define and agree a roadmap is in order to provide a list of priorities on which we should focus. Planning and discussing take time, so, once we have some agreement for the year, we should follow it as much as we can. At this stage we have a lot of other priorities that would deserve developers attention IMO.

Not sure about removing the CLI... we'll see !

Hope it clarifies.

Thanks!

@oscerd
Copy link
Contributor

oscerd commented Jul 11, 2023

Having both the CLIs, kamel and camel-jbang, will require a lot of work for maintenance.. It would be better to focus on moving some of the stuff we have in jbang cli. It won't be easy: the more we add on kamel CLI, the more it will be hard to move.

@johnpoth
Copy link
Member Author

All in all I think the consensus for now is to rely on custom scripts to scrap unused IntegrationKits/Images. The Image Registry admin could then delete the Images and optimize the remaining ones.

I think we can close this PR for the time being and maybe add a comment on open the issues

Thanks for the feedback!

@squakez
Copy link
Contributor

squakez commented Jul 11, 2023

All in all I think the consensus for now is to rely on custom scripts to scrap unused IntegrationKits/Images. The Image Registry admin could then delete the Images and optimize the remaining ones.

I think we can close this PR for the time being and maybe add a comment on open the issues

Thanks for the feedback!

IMO we can still leverage the scraping part and providing an output with the affected images. Or as you suggest, it would be the equivalent of an external script taking care of the same.

@claudio4j
Copy link
Contributor

Looking closely the proposed code, I would say this operation should really be in the controller code and not in the client side.
There is the Garbage collection of unused containers and images from kubernetes doc that advises against any external cleanup, which Pasquale made an important consideration.
We can create a doc on how to prune unused resources.

@johnpoth
Copy link
Member Author

Sounds good! Closing this one for now..

Thanks for the feedback!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants