Proposal: Option for transformer annotations #4267
@natasha41575: GitHub didn't allow me to request PR reviews from the following users: yuwenma. Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.
/lgtm
@yuwenma: changing LGTM is restricted to collaborators.
Force-pushed: 90c93d2 to a2fe986
kustomize doesn't currently support comments.

We also considered making this a flag to `kustomize build` rather than additional field, but we would like to align
with the kustomize way of avoiding build-time side effects and have everything declared explicitly in the kustomization.
I agree that the solution you proposed here is more consistent with Kustomize than either alternative. If this truly is a debug-only feature, something else we could do is a separate command like `kustomize cfg explain . --resource=Deployment/default/foo` (would print meta, origin and transformer info instead of the resource bodies). That would probably require the same annotations under the hood. It could also be used to prove out the usefulness/completeness of the data before we add it to the build command.
Just to extend the motivations of this feature a little bit:
- This feature is an extension of the previous `originAnnotations`. It may make less sense to split the feature into a separate command and not provide an equivalent declarative approach.
- The `kustomize cfg` approach is harder to scale to a large set of resources, which is where the debugging helps most. For example, when it's hard to triage the origin and transformer of 500 deployments from 1000 resource files, users would be unwilling to run the `kustomize cfg` command 500 times, I guess?
- "annotations" is designed to attach metadata, with no size limitations. It is ideal for this kind of tracing info, and many other tools or libs use it the same way, e.g. enterprise level: Google Config Sync; OSS: tekton, skaffold.
- When thinking about this approach in the bigger picture, say working in a GitOps CI/CD approach, the git diff of different kustomize outputs will show not only the config change, but also the buildMetadata origin changes. It will greatly benefit those users who apply kustomize in their enterprise CI/CD workflow.
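To make the scale argument in the second point concrete, here is a minimal sketch of how a tool could triage many resources in a single pass once the annotations are present in the build output. It assumes the output has already been parsed into Python dicts, and it assumes the origin annotation key is `config.kubernetes.io/origin`; both the key and the sample values are illustrative and should be checked against the kustomize version in use.

```python
from collections import defaultdict

# Assumed annotation key for origin data; verify against your kustomize version.
ORIGIN_KEY = "config.kubernetes.io/origin"


def index_by_origin(resources):
    """Group resource identifiers by the raw value of their origin annotation."""
    index = defaultdict(list)
    for res in resources:
        meta = res.get("metadata", {})
        annotations = meta.get("annotations") or {}
        origin = annotations.get(ORIGIN_KEY, "<no origin annotation>")
        ident = "{}/{}/{}".format(
            res.get("kind", "?"),
            meta.get("namespace", "default"),
            meta.get("name", "?"),
        )
        index[origin].append(ident)
    return dict(index)


# Hypothetical, hand-written sample standing in for parsed build output.
sample = [
    {"kind": "Deployment",
     "metadata": {"name": "foo", "namespace": "default",
                  "annotations": {ORIGIN_KEY: "path: base/deployment.yaml\n"}}},
    {"kind": "Deployment",
     "metadata": {"name": "bar",
                  "annotations": {ORIGIN_KEY: "path: base/deployment.yaml\n"}}},
    {"kind": "ConfigMap", "metadata": {"name": "cfg"}},
]

for origin, idents in index_by_origin(sample).items():
    print(repr(origin), idents)
```

The annotation value is treated as an opaque string here, so the sketch does not depend on the exact format kustomize writes.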
+1 to all points Yuwen described above. While I think something like `kustomize cfg explain . --resource=Deployment/default/foo` could be useful in some scenarios, I particularly agree with Yuwen's points 2 and 3 above - scalability for a user needing to inspect a large set of resources, and that this metadata is useful for other tools and "annotations" is where they'd expect to find them.
While the proposed implementation clearly builds on originAnnotations, the primary use case set out sounds pretty different, so I think it's worth considering what the best way to achieve the end user goal in question would be rather than presupposing the implementation.
In case (2), what is the ideal user workflow in your opinion? Surely they aren't actually looking through all 500 deployments, but rather targeting one or two that have the problem they're investigating. If there's a use case where they really need to delve through all of the large dataset, let's add it.
Re (3), storage isn't actually free, no matter what API server validations will allow. I'm aware of cases of extremely verbose annotations causing significant scalability problems. So if we're designing this feature with the intent that end users will be committing and deploying the data we write, I think we should indeed be careful about how verbose we're being. Conversely, if the 99% use case is to debug something before commit, we should be super verbose and as helpful as possible.
Further to that point: if the purpose is debugging, it would be most useful to use absolute paths, not paths relative to the parent Kustomization or the current Kustomize root. If we're writing something to be committed, we obviously cannot do that, and in fact should aim to make the annotation content as stable as possible.
Can you give a more detailed example of why (4) is desirable? By embedding what are effectively implementation details of the generation pipeline in the output, we will make it inherently less stable--not generally a good thing for gitOps. For example, renaming a file will cause a diff even though it does not affect the resources' content. Back to my point above, I think optimizing for content that will be committed is in tension with optimizing for debuggability.
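The stability concern can be illustrated with a small sketch: two builds whose only difference is a renamed source file compare as different in raw form, but as identical once the build-metadata annotation is stripped. The annotation key `config.kubernetes.io/origin`, the file names, and the resources below are assumptions made for illustration; a key for transformer data would be added to the same set.

```python
import copy

# Assumed build-metadata annotation key(s); verify against your kustomize version.
BUILD_METADATA_KEYS = {"config.kubernetes.io/origin"}


def strip_build_metadata(resource):
    """Return a copy of the resource without build-metadata annotations."""
    cleaned = copy.deepcopy(resource)
    annotations = cleaned.get("metadata", {}).get("annotations")
    if annotations:
        for key in BUILD_METADATA_KEYS:
            annotations.pop(key, None)
        # Drop the annotations map entirely if nothing else was in it.
        if not annotations:
            del cleaned["metadata"]["annotations"]
    return cleaned


# Hypothetical build output before and after renaming the source file.
before = {
    "kind": "Deployment",
    "metadata": {"name": "foo", "annotations": {
        "config.kubernetes.io/origin": "path: deployment.yaml\n"}},
    "spec": {"replicas": 2},
}
after = copy.deepcopy(before)
after["metadata"]["annotations"]["config.kubernetes.io/origin"] = (
    "path: web-deployment.yaml\n"
)

raw_changed = before != after
content_changed = strip_build_metadata(before) != strip_build_metadata(after)
print("raw diff:", raw_changed, "content diff:", content_changed)
```

A diff tool that filters annotations this way trades traceability in the diff for stability, which is the tension described above.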
To add on to Yuwen's point 3, one of the use cases is Google's Config Sync, which takes a kustomization repo, automatically runs `kustomize build`, and syncs it to a cluster. For these users, it is useful to have these annotations available in the cluster for all deployed resources. If a user notices that something is wrong in the cluster, Config Sync can quickly and easily show the user how the resource was created. An open source tool with a very similar requirement is skaffold, which can run kustomize and deploy the output to a cluster via `skaffold deploy`. `kubectl apply -k` also falls under this category.
For a user using the kustomize CLI directly, a separate command to debug the output of kustomize makes sense. But for users using other tools that automatically run kustomize and deploy, it is better for them if the annotations persist to the cluster. Many of these users may not even have the kustomize CLI installed on their local machines, and so allowing other tools to add these annotations as part of `kustomize build` is a huge benefit.
Edited to add: I am also open to adding both an option to buildMetadata to annotate resources and a separate kustomize command, to give users some flexibility in how they use the feature. Since both would be driven by the same annotations under the hood, I think this is a maintainable option.
Thanks Natasha. Yeah, the actual use case is based on the fact that kustomize is integrated into some other automated tools and users do not directly interact with kustomize.
I have updated the proposal to include a user story describing our use cases, and added a separate kustomize command as a considered alternative. Let me know if you feel strongly about having both the command and the build option; otherwise I believe all of the feedback has been addressed and incorporated into the proposal.
Force-pushed: 2a02b4c to 07daa8b
Force-pushed: 07daa8b to 3e94848
Force-pushed: 3e94848 to ad29ef2
Force-pushed: ad29ef2 to 542b7c7
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: KnVerey, natasha41575
This is a proposal to extend the `buildMetadata.originAnnotations` feature that we recently introduced. We would like to add the annotations to generated resources and also have an option to capture information about which transformers have processed each resource.
/cc @KnVerey
/cc @yuwenma
/cc @monopole
/hold to allow everyone to get their reviews in before this is merged