Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend kube-state-metrics to support Custom Resource metrics #1644

Merged
merged 4 commits into from
Jan 4, 2022

Conversation

Garrybest
Copy link
Member

What this PR does / why we need it:
Now kube-state-metrics does not support Kubernetes Custom Resource metrics. According to #1640, I try to make kube-state-metrics to be more flexible to support Custom Resource. Users only need to:

  1. Depend this repo in their own project.
  2. Implement the new interface customresource.RegistryFactory of their Custom Resources as a plugin factory to tell kube-state-metrics how to display custom resource metrics.
  3. Register the factories as a parameter by calling RunKubeStateMetrics.

The inspiration is from kube-scheduler out-of-tree plugin registry. The well-know project scheduler-plugin is based on this cool function. It adds new scheduler plugins without modifying any source codes in kube-scheduler.

I have taken Custom Resource samplev1alpha1.Foo in samplecontroller as an example in the interface comments and unit test.

How does this change affect the cardinality of KSM: (increases, decreases or does not change cardinality)
It does not directly change or affect the cardinality of existing metrics.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #1640

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 29, 2021
@k8s-ci-robot
Copy link
Contributor

Welcome @Garrybest!

It looks like this is your first PR to kubernetes/kube-state-metrics 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/kube-state-metrics has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Nov 29, 2021
@Garrybest
Copy link
Member Author

Garrybest commented Nov 29, 2021

The PR has been divided into 4 parts.

  1. Design a Custom Resource RegistryFactory for out-of-tree implements. Anyone who wants to extend kube-state-metrics should implement this interface to tell us how to get and display their metrics.
  2. Reform Builder to adapt custom resource registry. Now the Builder could register custom resource metric stores.
  3. Reform main function to adapt custom resource registry. Users could not depend main pkg, so I move the codes into a new pkg app and call the custom resource register functions for Builder.
  4. Add a custom resource unit test based on foo object in samplecontroller. The obj samplev1alpha1.Foo could be a nice example so I add k8s.io/sample-controller in go.mod and make a best practice in server_test.go.

@Garrybest
Copy link
Member Author

/cc @fpetkovski

// WithGenerateCustomResourceStoresFunc configures a custom generate custom resource store function
func (b *Builder) WithGenerateCustomResourceStoresFunc(f ksmtypes.BuildCustomResourceStoresFunc, u bool) {
b.buildCustomResourceStoresFunc = f
b.useAPIServerCache = u
Copy link
Contributor

@Serializator Serializator Nov 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

b.useAPIServerCache is used by (store.*Builder).WithGenerateStoresFunc as well, what would happen in the following scenario?

store := ...
store.WithGenerateStoresFunc(..., false)
store.WithGenerateCustomResourceStoresFunc(..., true)

(store.*Builder).useAPIServerCache is implicitly overwritten by the invocation of (store.*Builder).useAPIServerCache though the expectation could be that they both keep a separate property / variable for this (e.g. (store.*Builder).useAPIServerCacheForCustomResources).

It could also be that (store.*Builder).useAPIServerCache should not be too tightly coupled with the function and thus needs to be configured separately by adding an extra method to do so.

Or this might be an indication the custom resources like this deserve their own builder (instance), this way you wouldn't need to add respective methods for custom resources to the existing builder but create two separate builders and combine the stores later on.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Serializator for reminding! I have not even noticed that. I prefer this solution.

It could also be that (store.*Builder).useAPIServerCache should not be too tightly coupled with the function and thus needs to be configured separately by adding an extra method to do so.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree when looking specifically at (store.*Builder).useAPIServerCache but I'd like to "widen" the "scope" of my initial feedback and look at the bigger picture in terms of extensibility of store.Builder itself.

I haven't looked through the issue or pull request completely so please correct me if I'm wrong! CR metrics are basically metrics like the built-in metrics by KSM but registrered a different way.

Why did you make the choice to extend the existing store.Builder rather than create a separate instance of it specifically for registering and working with CR metrics, and later on make some kind of "composite" store which combines the built-in and the CR metrics?

I'm asking in terms of future extensability. If there is ever a need to add an additional way of registering metrics then methods will again be added to the store.Builder to allow for it ("extending" the store.Builder) rather than "building upon it" and keepign the implementation of store.Builder light and clean.

I'd love to hear your point of view and opinion on this! I might be completely wrong 😉

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Serializator, it's really a good question.

I choose to extend store.Builder because IMO actually the Custom Resource is very similar to Kubernetes built-in resource. The only thing different is that we don't know their type, what their metrics look like, how to list and watch them(the client code is generated by code-generator). Once people tell us the necessary information, it behaves like a common built-in resource. It could be built and stored similarly, shared the same logic with store.Builder.

Rather than create a separate instance of it specifically for registering and working with CR metrics, and later on make some kind of "composite" store which combines the built-in and the CR metrics

I think this solution is more complicated. As I mentioned, the CR and built-in resource could share the same building logic once we have the information in registry_factory.go. Actually, what I'm thinking is all metrics could be registered by this interface including built-in resource. From the aspect of extensability, the inner resource could be added by this method. Actually, I suppose removing the built-in resource registry from store.Builder and keep the CustomResourceStores may be more light and clean?

Just some personal idea, lol 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that sense I understand and agree (not that you need my approval 😉) in terms of the extending the store.Builder and the why behind it.

(store.Builder).buildCustomResourceStores though is really similar to (store.Builder).buildStores, with the only difference being the "client" which is passed to the listWatchFunc. The crutch in this case is in the fact that (store.Builder).buildStores is aware of the client (k8s.io/client-go/kubernetes.Interface because (store.Builder).buildStores is responsible for calling the listWatchFunc function. Disappointing (😞 😆), maybe refactor in the future to abstract away the client (as this can differ now with the CR metrics).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @Serializator. You are right! We have the same opinions😃. That also distresses me!

These two functions are so similar but unfortunately we have no ideas about what type the customResourceClient is. If we merge them together, all types of listWatchFunc in internal/store directory must be revised. This revision may be a little bit huge and does not focus on custom resource extension. So I decided to make a optimization in the future. That would be better for reviewing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added a TODO comment. We could optimize it later😉.

@Garrybest Garrybest force-pushed the pr_crd branch 2 times, most recently from 2ea2300 to 12822db Compare November 29, 2021 15:22
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 1, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 2, 2021
Copy link
Contributor

@fpetkovski fpetkovski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great start, thanks for taking the initiative 👍

pkg/app/server.go Outdated Show resolved Hide resolved
Copy link
Contributor

@fpetkovski fpetkovski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good stuff, just one small nit from my side

main.go Outdated Show resolved Hide resolved
@Garrybest Garrybest force-pushed the pr_crd branch 2 times, most recently from 073a1d2 to 6f0d3e7 Compare December 8, 2021 09:09
@fpetkovski
Copy link
Contributor

Great work, /lgtm

/hold for others to take a look as well

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 8, 2021
@chrischdi
Copy link
Member

chrischdi commented Dec 9, 2021

First of all: thank you for working on this 👍 With this interface it is also possible to use the controller-runtime client :-)

I did take a look at it and have some feedback, but I think this could get fixed in future Pull Requests:

The flag --resources does not list the resource names of the CRs. Filtering for the CRs does work so its just about the help text.
Also in regards to that: maybe it would be handy if it is possible to disable the standard kube-state-metrics metric families by default (e.g. the main.go of kube-state-metrics could pass the core metrics to RunKubeStateMetrics the same way we can do for Custom Resources).

Edit: addition: it is possible to (copy and) re-define the flags and set --resources to custom values.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 9, 2021
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 10, 2021
@Garrybest
Copy link
Member Author

Hi @chrischdi, I really agree with you.

Now the flag --resources only list all built-in resources. I don't make a big revision here because I try to focus on CR registery and interface definition in this PR. So I just add all CRs into enabledResources but do not print them by default.

It is a good idea for us to fix it in the future PR. Thanks @chrischdi for your kind reminds. 😄

@chrischdi
Copy link
Member

Hi @chrischdi, I really agree with you.

Now the flag --resources only list all built-in resources. I don't make a big revision here because I try to focus on CR registery and interface definition in this PR. So I just add all CRs into enabledResources but do not print them by default.

It is a good idea for us to fix it in the future PR. Thanks @chrischdi for your kind reminds. smile

👍 yes I think that's totally fine.

I already found a workaround which I think is more than good enough:
https://github.com/chrischdi/cluster-api-state-metrics/blob/chrischdi/ksm-pr-adopt/main.go#L42

@Garrybest
Copy link
Member Author

I already found a workaround which I think is more than good enough:
https://github.com/chrischdi/cluster-api-state-metrics/blob/chrischdi/ksm-pr-adopt/main.go#L42

Great👍, that's what I'm thinking before! See #1644 (comment). The built-in resource could also implement the registry interface and be passed to RunKubeStateMetrics as a parameter like what you did. And the problem we mentioned could also be solved.

@Garrybest
Copy link
Member Author

Hi @brancz @dgrisonnet, would you please take a look? My appreciation. 😄

// WithCustomResourceStoreFactories returns configures a custom resource stores factory
func (b *Builder) WithCustomResourceStoreFactories(fs ...customresource.RegistryFactory) {
for i := range fs {
f := fs[i]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @chrischdi, I made a mistake here in the previous version. The f must be copied because the value of availableStores is a func. Otherwise, it will be registed the last ones when custom resources are more than one. I have fixed here, please pay attention. My apology.

@fpetkovski
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 16, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: fpetkovski, Garrybest

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 16, 2021
@Garrybest
Copy link
Member Author

Can we move forward?

@fpetkovski
Copy link
Contributor

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 4, 2022
@k8s-ci-robot k8s-ci-robot merged commit 081cc3c into kubernetes:master Jan 4, 2022
@Garrybest Garrybest deleted the pr_crd branch January 4, 2022 10:21
chrischdi added a commit to chrischdi/cluster-api-state-metrics that referenced this pull request Jan 4, 2022
Makes use of implementation at
github.com/kubernetes/kube-state-metrics/pull/1644.

Adds features of kube-state-metrics like sharding.

Signed-off-by: Schlotter, Christian <[email protected]>
chrischdi added a commit to chrischdi/cluster-api-state-metrics that referenced this pull request Jan 4, 2022
Makes use of implementation at
github.com/kubernetes/kube-state-metrics/pull/1644.

Adds features of kube-state-metrics like sharding.

Signed-off-by: Schlotter, Christian <[email protected]>
chrischdi added a commit to mercedes-benz/cluster-api-state-metrics that referenced this pull request Jan 4, 2022
Makes use of implementation at
github.com/kubernetes/kube-state-metrics/pull/1644.

Adds features of kube-state-metrics like sharding.

Signed-off-by: Schlotter, Christian <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extend kube-state-metrics to support Custom Resource metrics
5 participants