
Modify autodiscovery for KSM considering new Arch #217

Closed · paologallinaharbur opened this issue Oct 4, 2021 · 12 comments
Labels: feature request

@paologallinaharbur (Member)

Taking into account that only one component needs to scrape KSM and that the process needs to be long-running, we could greatly simplify autodiscovery.

@invidian (Contributor) commented Oct 8, 2021

So KSM currently does not support HA, as stated in kubernetes/kube-state-metrics#611. Only sharding.

This means that for the new discovery, we should watch and scrape all endpoints for a given service (or just via a label selector). Here is an example headless service for KSM: https://github.com/kubernetes/kube-state-metrics/blob/master/examples/autosharding/service.yaml

We should be able to do that simply using https://pkg.go.dev/k8s.io/client-go/informers/core/v1#NewFilteredEndpointsInformer, fetching the list of endpoints from https://pkg.go.dev/k8s.io/client-go/listers/core/v1#EndpointsLister on every scrape iteration.

Initially, without #222, startup will be slower and more expensive, as we will have to wait for the caches to sync, but with #222 we get proper caching for free.
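
For illustration, a rough sketch of that approach using client-go's filtered Endpoints informer and lister. The namespace and label selector values below are assumed defaults, not the final configuration:

    package main

    import (
        "fmt"
        "time"

        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/labels"
        informersv1 "k8s.io/client-go/informers/core/v1"
        "k8s.io/client-go/kubernetes"
        listersv1 "k8s.io/client-go/listers/core/v1"
        "k8s.io/client-go/rest"
        "k8s.io/client-go/tools/cache"
    )

    func main() {
        const ksmNamespace = "kube-system"                              // assumption
        const ksmSelector = "app.kubernetes.io/name=kube-state-metrics" // assumption

        // In-cluster configuration; a kubeconfig-based one works the same way.
        config, err := rest.InClusterConfig()
        if err != nil {
            panic(err)
        }
        client := kubernetes.NewForConfigOrDie(config)

        // Informer watching only Endpoints that match the KSM label selector.
        informer := informersv1.NewFilteredEndpointsInformer(
            client,
            ksmNamespace,
            time.Minute, // resync period
            cache.Indexers{cache.NamespaceIndex: cache.MetaNamespaceIndexFunc},
            func(options *metav1.ListOptions) { options.LabelSelector = ksmSelector },
        )

        stopCh := make(chan struct{})
        defer close(stopCh)
        go informer.Run(stopCh)
        cache.WaitForCacheSync(stopCh, informer.HasSynced)

        // On every scrape iteration, list the endpoints from the informer's local cache.
        lister := listersv1.NewEndpointsLister(informer.GetIndexer())
        endpoints, err := lister.Endpoints(ksmNamespace).List(labels.Everything())
        if err != nil {
            panic(err)
        }
        for _, ep := range endpoints {
            for _, subset := range ep.Subsets {
                for _, addr := range subset.Addresses {
                    fmt.Println("KSM endpoint:", addr.IP)
                }
            }
        }
    }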

@invidian (Contributor) commented Oct 13, 2021

From the user's perspective, the following configuration is currently available for KSM discovery:

  • User may specify a single hardcoded KSM URL
  • User may specify a single KSM pod label, which will be used to find a single KSM Pod
    • In this mode, the user may also specify the KSM port, scheme, and namespace
  • User may specify a single KSM pod label, which will be used to find multiple KSM Pods
    • In this mode, the user may specify only the KSM namespace (port and scheme are hardcoded)
  • By default the discovery:
    • Tries fetching the _http-metrics._tcp.kube-state-metrics.kube-system.svc.cluster.local SRV record (see the sketch after this list).
    • For each of the label names "app.kubernetes.io/name", "k8s-app", and "app" with label value kube-state-metrics,
      it tries to fetch the service using that label, then picks the first service which has a cluster IP and a port
      named http-metrics, otherwise picks the first TCP port of the service.
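
For reference, the SRV lookup in that default path boils down to something like this minimal sketch (the hostname is the hardcoded value mentioned above; the fallback message is illustrative):

    package main

    import (
        "fmt"
        "net"
    )

    func main() {
        // Resolves _http-metrics._tcp.kube-state-metrics.kube-system.svc.cluster.local.
        _, records, err := net.LookupSRV("http-metrics", "tcp",
            "kube-state-metrics.kube-system.svc.cluster.local")
        if err != nil {
            fmt.Println("SRV lookup failed, falling back to label-based discovery:", err)
            return
        }
        for _, srv := range records {
            fmt.Printf("KSM target: %s:%d\n", srv.Target, srv.Port)
        }
    }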

I believe the above logic produces several inconsistencies:

  • Why do the port and scheme settings not apply to multiple KSMs?
  • Why does the KSM namespace not apply to DNS discovery?
  • Why can't you specify multiple KSM URLs?
  • Why does specifying a pod label not support multiple labels or a standard string selector?

BTW, since KSM 2.1, TLS is supported, but it doesn't seem to be widely adopted (e.g. the Helm chart does not support it).

We should probably decide which existing or extra features we want to support in the new discovery mechanism.

@invidian (Contributor)

For new KSM discovery:

  • It should support multiple KSMs by default.
  • Discovery should be done by watching Endpoints objects, which eliminates the need to configure ports.
  • It should support configuration via a string label selector.
  • What about the port name in the endpoint? We can keep the existing strategy.
  • TLS support seems optional.
  • The namespace setting can be easily ported.
  • We should support specifying multiple URLs for static targets; however, that should be part of a separate, "static" discoverer.

Possible challenges:

  • If we keep iterating over different labels, we need to do it before setting up the informers OR set up multiple informers, one for each label. As KSM endpoints could have both labels we iterate on (e.g. app=kube-state-metrics and app.kubernetes.io/name=kube-state-metrics), we should possibly list all of them and then deduplicate the results (see the sketch below).
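
A small sketch of that deduplication step, assuming we list Endpoints once per label selector; the helper name is hypothetical:

    package discovery

    import (
        v1 "k8s.io/api/core/v1"
    )

    // dedupeEndpoints merges the Endpoints lists obtained for each label selector
    // (e.g. app=kube-state-metrics and app.kubernetes.io/name=kube-state-metrics),
    // keeping each object only once, keyed by namespace/name.
    func dedupeEndpoints(lists ...[]*v1.Endpoints) []*v1.Endpoints {
        seen := map[string]struct{}{}
        var merged []*v1.Endpoints
        for _, list := range lists {
            for _, ep := range list {
                key := ep.Namespace + "/" + ep.Name
                if _, ok := seen[key]; ok {
                    continue
                }
                seen[key] = struct{}{}
                merged = append(merged, ep)
            }
        }
        return merged
    }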

@roobre any thoughts on all of the above (including the previous comments)?

@roobre (Contributor) commented Oct 14, 2021

Thanks for the recap @invidian!
Some thoughts I have for KSM discovery use cases:

As an nri-kubernetes user, I want to:

  • Get KSM discovered out of the box when it's installed together with the nri-bundle helm chart
  • Get KSM discovered if it's not bundled, so I can reuse an existing KSM installation
  • Manually specify a KSM endpoint URL, which would override all discovery mechanisms
  • Make the autodiscovery cheap so it doesn't bring my cluster down if a pod is restarted often (e.g. in CrashLoopBackOff)

On the technical side:

  • We want to have full support for sharded KSM in the future, but I wouldn't say it's required now. However, what we do know is that users often have two or more KSM instances in their cluster, either by mistake or to have multiple KSM versions. We must make sure that "automatic sharding support" does not conflict with this far more common use case.
  • Right now we discover pods and then hit their podIPs, but I'm not sure if we want to keep doing that. I find it more appealing to try to get services using a labelSelector instead, or is this not good for some other reason I'm not seeing?
    • Bonus: If we hit the service, we get KSM HA essentially for free, right?
  • It would be nice if the details of how the service/pod is discovered, such as the labelSelector or port number/name, were configurable, defaulting to the values the KSM authors set in its Helm chart.
  • Limiting the discovery to a namespace was done to avoid hammering the API server with expensive requests. I believe this should be less pressing now, since there will be few agents asking for KSM, but we can keep the option in case it helps any corner case. I don't think it's hard to honor.

@roobre (Contributor) commented Oct 14, 2021

If we keep iterating over different labels, we need to do it before setting up the informers OR set up multiple informers, one for each label. As KSM endpoints could have both labels we iterate on (e.g. app=kube-state-metrics and app.kubernetes.io/name=kube-state-metrics), we should possibly list all of them and then deduplicate the results.

I believe those were set in an attempt to capture as many KSM installations as possible. We could keep the discovery tied to a list of labels, but perhaps it would make more sense to limit it to one (defaulting to what KSM itself sets) and make it configurable. Some users might need to tweak that, but I don't see that as a huge problem.

Why do the port and scheme settings not apply to multiple KSMs?

Multiple KSM support was a bit of a PoC, so perhaps it never got fully integrated with previous or following features.

Why does the KSM namespace not apply to DNS discovery?

I believe DNS was an attempt to create a "fast path" for discovery, but in the end the default (hardcoded) value became outdated, and nowadays I don't think it resolves for anybody out there.

Why can't you specify multiple KSM URLs?

Same as above: that feature was not finished and probably wasn't integrated with the others. I'm not sure we want to keep it, though: for sharded KSM scenarios we probably want to autodiscover it, right?

Why does specifying a pod label not support multiple labels or a standard string selector?

I don't really know, but we should definitely support specifying a labelSelector.

@invidian (Contributor)

We want to have full support for sharded KSM in the future, but I wouldn't say it's required now. However, what we do know is that users often have two or more KSM instances in their cluster, either by mistake or to have multiple KSM versions. We must make sure that "automatic sharding support" does not conflict with this far more common use case.

Good point. They would be deployed under different label selectors then, I assume? If not, do we still want to scrape only one instance by default and conditionally enable support for multiple?

Right now we discover pods and then hit their podIPs, but I'm not sure if we want to keep doing that. I find it more appealing to try to get services using a labelSelector instead, or is this not good for some other reason I'm not seeing?

Yes, we should be using service endpoints instead (which are effectively pod IPs as well, but obtained through the service's Endpoints object).

Bonus: If we hit the service, we get KSM HA essentially for free, right?

KSM does not support HA, only sharding, so we shouldn't hit the service. Also, the KSM manifests only deploy headless services, so it's not even possible.

@roobre (Contributor) commented Oct 14, 2021

Also, the KSM manifests only deploy headless services, so it's not even possible.

Interesting, I didn't know about that!

Yes, we should be using service endpoints instead (which are effectively pod IPs as well, but obtained through the service's Endpoints object).

LGTM

Good point. They would be deployed under different label selectors then, I assume? If not, do we still want to scrape only one instance by default and conditionally enable support for multiple?

How does the KSM chart deploy the instances when sharding is enabled? Does it create one service for each shard?

If this is the case, then we would be facing a trade-off, because assuming that all KSM services matching a label (or labels) are actually shards by default might be a risky move. If so, and if the KSM chart does not provide another way to identify services which are shards (e.g. an annotation?), I'd suggest adding a boolean flag (-enable-sharded-ksm) and documenting that we will then scrape everything we find, as opposed to the first match.

@invidian (Contributor)

I believe those were set in an attempt to capture as many KSM installations as possible. We could keep the discovery tied to a list of labels, but perhaps it would make more sense to limit it to one (defaulting to what KSM itself sets) and make it configurable. Some users might need to tweak that, but I don't see that as a huge problem.

I think that makes a lot of sense and will reasonably simplify things.

How does the KSM chart deploy the instances when sharding is enabled? Does it create one service for each shard?

No, it creates a StatefulSet instead of a Deployment, so each pod is a separate shard. The headless service then has all shards as its endpoints.

If this is the case, then we would be facing a trade-off, because assuming that all KSM services matching a label (or labels) are actually shards by default might be a risky move. If so, and if the KSM chart does not provide another way to identify services which are shards (e.g. an annotation?), I'd suggest adding a boolean flag (-enable-sharded-ksm) and documenting that we will then scrape everything we find, as opposed to the first match.

Good idea; however, right now the upstream manifests look exactly the same for standalone and sharded deployments.

The upstream Helm chart also does not have any insight into the deployed mode: https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-state-metrics/templates/service.yaml

There is also experimental autosharding, which we should consider.

@invidian (Contributor)

So the new discoverer:

  • ✔️ Accepts a label selector for the KSM headless service endpoints.
  • ✔️ Uses the app.kubernetes.io/name=kube-state-metrics selector by default.
  • ✔️ Returns a slice of HTTP clients.
  • ✔️ By default returns only one client to avoid scraping duplicate data, unless the sharded KSM flag is set.
  • ✔️ From the listed Endpoints objects:
    • ❓ Pick the first Endpoints object OR merge and de-duplicate all listed IP addresses?
    • ❓ Merge ports and deduplicate, error on a name conflict, try picking the port named http-metrics by default, otherwise pick the first TCP one?
    • ❓ We could also iterate over everything until we find an endpoint that works.
  • ✔️ Accepts a sharded KSM flag to explicitly allow scraping multiple endpoints (the user must then make sure the endpoints do not provide duplicate data).
  • ✔️ Takes a factory function as an argument, to balance between passing too many dependencies and putting too much responsibility into the discoverer, and to be able to filter by namespace (see the sketch after this list):
    type EndpointsLister interface {
      List(selector labels.Selector) (ret []*v1.Endpoints, err error)
    }
    func(namespace string, labelSelector string) EndpointsLister
  • ✔️ Accepts the KSM namespace as an optimization.
  • ✔️ (Nice to have) Accepts a URL scheme (defaults to http, hardcoded in the MVP).
  • ✔️ (Nice to have) Supports EndpointSlices instead of just Endpoints.
  • ⚠️ Does not support static URLs; this will be part of a different discoverer.
  • ❌ Does not support DNS discovery.
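
A minimal sketch of how such a discoverer could look under the points above; the names (Discoverer, ListerFactory, Discover, metricsPort) are illustrative, and for brevity it returns host:port targets rather than HTTP clients:

    package ksm

    import (
        "fmt"

        v1 "k8s.io/api/core/v1"
        "k8s.io/apimachinery/pkg/labels"
    )

    // EndpointsLister is the narrow interface the discoverer depends on.
    type EndpointsLister interface {
        List(selector labels.Selector) (ret []*v1.Endpoints, err error)
    }

    // ListerFactory builds a lister scoped to a namespace and label selector.
    type ListerFactory func(namespace, labelSelector string) (EndpointsLister, error)

    type Discoverer struct {
        Namespace     string // KSM namespace, used as an optimization.
        LabelSelector string // Defaults to app.kubernetes.io/name=kube-state-metrics.
        Sharded       bool   // If set, return every endpoint instead of just one.
        NewLister     ListerFactory
    }

    // Discover returns the KSM endpoints to scrape as host:port strings.
    func (d Discoverer) Discover() ([]string, error) {
        lister, err := d.NewLister(d.Namespace, d.LabelSelector)
        if err != nil {
            return nil, err
        }
        endpoints, err := lister.List(labels.Everything())
        if err != nil {
            return nil, err
        }

        var targets []string
        for _, ep := range endpoints {
            for _, subset := range ep.Subsets {
                port, ok := metricsPort(subset)
                if !ok {
                    continue
                }
                for _, addr := range subset.Addresses {
                    targets = append(targets, fmt.Sprintf("%s:%d", addr.IP, port))
                    if !d.Sharded {
                        return targets, nil // Non-sharded mode: one endpoint is enough.
                    }
                }
            }
        }
        if len(targets) == 0 {
            return nil, fmt.Errorf("no kube-state-metrics endpoints found")
        }
        return targets, nil
    }

    // metricsPort picks the port named http-metrics, falling back to the first TCP port.
    func metricsPort(subset v1.EndpointSubset) (int32, bool) {
        for _, p := range subset.Ports {
            if p.Name == "http-metrics" {
                return p.Port, true
            }
        }
        for _, p := range subset.Ports {
            if p.Protocol == v1.ProtocolTCP || p.Protocol == "" {
                return p.Port, true
            }
        }
        return 0, false
    }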

@roobre (Contributor) commented Oct 15, 2021

By default returns only one client to avoid scraping duplicate data, unless the sharded KSM flag is set.

And

We could also iterate over everything until we find an endpoint that works.

Perhaps something we could do is make the discoverer return a slice of endpoints for the chosen port, and act on that flag in the client. The flag could switch between "scrape all" and "scrape in order until one works" behavior. How does that sound?
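
For illustration, a rough sketch of that switch (the function and parameter names are hypothetical):

    package discovery

    import (
        "errors"
        "fmt"
    )

    // scrapeTargets sketches the proposed behaviour: with sharded set it scrapes
    // every endpoint; otherwise it tries endpoints in order and stops at the
    // first one that works. scrapeOne stands in for the real HTTP scrape.
    func scrapeTargets(endpoints []string, sharded bool, scrapeOne func(string) ([]byte, error)) ([][]byte, error) {
        var results [][]byte
        var errs []error
        for _, ep := range endpoints {
            body, err := scrapeOne(ep)
            if err != nil {
                errs = append(errs, err)
                continue
            }
            results = append(results, body)
            if !sharded {
                return results, nil // First working endpoint is enough.
            }
        }
        if len(results) == 0 {
            return nil, fmt.Errorf("all endpoints failed: %w", errors.Join(errs...))
        }
        return results, nil
    }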

try picking the port named http-metrics by default, otherwise pick the first TCP one?

That seems reasonable to me. I'd put the port-picking logic in the discoverer so we don't try to access weird ports with the fallback mechanism.

Does not support static URLs; this will be part of a different discoverer.

Yes, I think it's better to treat a static URL as an override and skip the discovery process entirely.

Does not support DNS discovery.

Now that discovery will be cheap, I don't see a reason to keep it.

@invidian (Contributor)

Also, my notes about the existing data flow:

  1. Constructing HTTPDoer.Do implementations creates round trippers for handling authentication.

  2. Group() calls: // MISSING ABSTRACTION for getting data in a unified format into the grouper.

    • prometheus.Do() for metrics in Prometheus format.
    • Other fetch functions to get metrics using HTTPDoer.Do().
  3. prometheus.Do() calls HTTPDoer.Do().

  4. The HTTPDoer.Do() implementation calls prometheus.NewRequest().

  5. HTTPDoer.Do() builds the target URL. // MISSING ABSTRACTION for the discovery process obtaining the host with port.

  6. HTTPDoer.Do() calls http.Client.Do(), performing the actual request.

  7. prometheus.Do() parses the http.Response and filters results.
    Other fetch functions also parse responses.

@invidian (Contributor)

After looking at #219 and #218 with @paologallinaharbur and @roobre, we figured that the discovery strategy for each of the components is likely to be different, so the discoverers and clients may not necessarily return data in the same testable format. Right now discovery returns an HTTPClient, which is generic enough, but unfortunately too generic to be black-box tested.

@invidian invidian removed their assignment Oct 29, 2021
@paologallinaharbur paologallinaharbur self-assigned this Nov 8, 2021