Telemetry module status as metric input to enable dashboarding and alerting on it #728
Comments
A simple test using kube-state-metrics (KSM) proved that you can emit metrics in a consistent way across all modules:

customResourceState:
  enabled: true
  config:
    kind: CustomResourceStateMetrics
    spec:
      resources:
        - groupVersionKind:
            group: "operator.kyma-project.io"
            kind: "Kyma"
            version: "v1beta2"
          labelsFromPath:
            name: [metadata, name]
            namespace: [metadata, namespace]
          metrics:
            - name: kyma_status_state
              help: "current state of kyma"
              each:
                type: StateSet
                stateSet:
                  labelName: state
                  path: [status,state]
                  list: [Error, Processing, Ready, Deleting, Warning]
            - name: kyma_status_modules_state
              help: "current module states"
              each:
                type: StateSet
                stateSet:
                  labelName: state
                  valueFrom: [state]
                  path: [status, modules]
                  labelsFromPath:
                    module: [name]
                  list: [Error, Processing, Ready, Deleting, Warning]
        - groupVersionKind:
            group: "operator.kyma-project.io"
            kind: "*"
            version: "*"
          labelsFromPath:
            name: [metadata, name]
            namespace: [metadata, namespace]
          metrics:
            - name: module_status_conditions
              help: "conditions of Module CR"
              each:
                type: Gauge
                gauge:
                  path: [status, conditions]
                  labelsFromPath:
                    type: [type]
                    reason: [reason]
                  valueFrom: [status]

Running KSM with that config exposed the following metrics:

# HELP kube_customresource_module_status_conditions conditions of Module CR
# TYPE kube_customresource_module_status_conditions gauge
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="ApplicationConnector",customresource_version="v1alpha1",name="applicationconnector-sample",namespace="kyma-system",reason="Verified",type="Installed"} 1
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="BtpOperator",customresource_version="v1alpha1",name="btpoperator",namespace="kyma-system",reason="ReconcileSucceeded",type="Ready"} 1
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="Eventing",customresource_version="v1alpha1",name="eventing",namespace="kyma-system",reason="Available",type="NATSAvailable"} 1
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="Eventing",customresource_version="v1alpha1",name="eventing",namespace="kyma-system",reason="Deployed",type="PublisherProxyReady"} 1
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="Eventing",customresource_version="v1alpha1",name="eventing",namespace="kyma-system",reason="Ready",type="WebhookReady"} 1
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="Keda",customresource_version="v1alpha1",name="default",namespace="kyma-system",reason="Verified",type="Installed"} 1
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="NATS",customresource_version="v1alpha1",name="eventing-nats",namespace="kyma-system",reason="Available",type="StatefulSet"} 1
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="NATS",customresource_version="v1alpha1",name="eventing-nats",namespace="kyma-system",reason="Deployed",type="Available"} 1
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="Serverless",customresource_version="v1alpha1",name="default",namespace="kyma-system",reason="Configured",type="Configured"} 1
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="Serverless",customresource_version="v1alpha1",name="default",namespace="kyma-system",reason="Installed",type="Installed"} 1
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="Telemetry",customresource_version="v1alpha1",name="default",namespace="kyma-system",reason="FluentBitDaemonSetReady",type="LogComponentsHealthy"} 1
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="Telemetry",customresource_version="v1alpha1",name="default",namespace="kyma-system",reason="MetricPipelineReferencedSecretMissing",type="MetricComponentsHealthy"} 0
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="Telemetry",customresource_version="v1alpha1",name="default",namespace="kyma-system",reason="TraceGatewayDeploymentReady",type="TraceComponentsHealthy"} 1
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta1",name="default",namespace="kyma-system",reason="Ready",type="ModuleCatalog"} 1
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta1",name="default",namespace="kyma-system",reason="Ready",type="Modules"} 0
kube_customresource_module_status_conditions{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta1",name="default",namespace="kyma-system",reason="Ready",type="SKRWebhook"} 1
# HELP kube_customresource_kyma_status_state current state of kyma
# TYPE kube_customresource_kyma_status_state stateset
kube_customresource_kyma_status_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",name="default",namespace="kyma-system",state="Deleting"} 0
kube_customresource_kyma_status_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",name="default",namespace="kyma-system",state="Error"} 0
kube_customresource_kyma_status_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",name="default",namespace="kyma-system",state="Processing"} 0
kube_customresource_kyma_status_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",name="default",namespace="kyma-system",state="Ready"} 0
kube_customresource_kyma_status_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",name="default",namespace="kyma-system",state="Warning"} 1
# HELP kube_customresource_kyma_status_modules_state current module states
# TYPE kube_customresource_kyma_status_modules_state stateset
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="api-gateway",name="default",namespace="kyma-system",state="Deleting"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="api-gateway",name="default",namespace="kyma-system",state="Error"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="api-gateway",name="default",namespace="kyma-system",state="Processing"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="api-gateway",name="default",namespace="kyma-system",state="Ready"} 1
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="api-gateway",name="default",namespace="kyma-system",state="Warning"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="application-connector",name="default",namespace="kyma-system",state="Deleting"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="application-connector",name="default",namespace="kyma-system",state="Error"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="application-connector",name="default",namespace="kyma-system",state="Processing"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="application-connector",name="default",namespace="kyma-system",state="Ready"} 1
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="application-connector",name="default",namespace="kyma-system",state="Warning"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="btp-operator",name="default",namespace="kyma-system",state="Deleting"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="btp-operator",name="default",namespace="kyma-system",state="Error"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="btp-operator",name="default",namespace="kyma-system",state="Processing"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="btp-operator",name="default",namespace="kyma-system",state="Ready"} 1
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="btp-operator",name="default",namespace="kyma-system",state="Warning"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="eventing",name="default",namespace="kyma-system",state="Deleting"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="eventing",name="default",namespace="kyma-system",state="Error"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="eventing",name="default",namespace="kyma-system",state="Processing"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="eventing",name="default",namespace="kyma-system",state="Ready"} 1
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="eventing",name="default",namespace="kyma-system",state="Warning"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="istio",name="default",namespace="kyma-system",state="Deleting"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="istio",name="default",namespace="kyma-system",state="Error"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="istio",name="default",namespace="kyma-system",state="Processing"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="istio",name="default",namespace="kyma-system",state="Ready"} 1
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="istio",name="default",namespace="kyma-system",state="Warning"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="keda",name="default",namespace="kyma-system",state="Deleting"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="keda",name="default",namespace="kyma-system",state="Error"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="keda",name="default",namespace="kyma-system",state="Processing"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="keda",name="default",namespace="kyma-system",state="Ready"} 1
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="keda",name="default",namespace="kyma-system",state="Warning"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="nats",name="default",namespace="kyma-system",state="Deleting"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="nats",name="default",namespace="kyma-system",state="Error"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="nats",name="default",namespace="kyma-system",state="Processing"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="nats",name="default",namespace="kyma-system",state="Ready"} 1
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="nats",name="default",namespace="kyma-system",state="Warning"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="serverless",name="default",namespace="kyma-system",state="Deleting"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="serverless",name="default",namespace="kyma-system",state="Error"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="serverless",name="default",namespace="kyma-system",state="Processing"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="serverless",name="default",namespace="kyma-system",state="Ready"} 1
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="serverless",name="default",namespace="kyma-system",state="Warning"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="telemetry",name="default",namespace="kyma-system",state="Deleting"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="telemetry",name="default",namespace="kyma-system",state="Error"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="telemetry",name="default",namespace="kyma-system",state="Processing"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="telemetry",name="default",namespace="kyma-system",state="Ready"} 0
kube_customresource_kyma_status_modules_state{customresource_group="operator.kyma-project.io",customresource_kind="Kyma",customresource_version="v1beta2",module="telemetry",name="default",namespace="kyma-system",state="Warning"} 1

Here, we could also use a gauge instead of a stateset, so that the individual states are not differentiated and only an aggregated "error" or "no error" is reported.

A very simple dashboard in Cloud Logging was built on the basis of this data.
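As a rough illustration of the aggregation idea (a single "error" / "no error" value per module instead of a state set), the following sketch parses stateset samples like the ones listed above and collapses them into 0 or 1. The function name `module_health`, the choice that only `Ready` counts as healthy, and the shortened label sets in the sample input are assumptions made for this example; this is not how KSM or the Telemetry module does it.

```python
import re

# One sample line of the text exposition format: name{labels} value
SAMPLE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)$')
LABEL = re.compile(r'(\w+)="([^"]*)"')

# Assumption for this sketch: only "Ready" is healthy, every other state is an error.
HEALTHY_STATES = {"Ready"}
METRIC = "kube_customresource_kyma_status_modules_state"


def module_health(exposition):
    """Return {module: 1 if healthy else 0} derived from stateset samples."""
    health = {}
    for line in exposition.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE.match(line)
        if match is None or match.group("name") != METRIC:
            continue
        if float(match.group("value")) != 1:
            continue  # in a state set, only the active state has the value 1
        labels = dict(LABEL.findall(match.group("labels")))
        health[labels["module"]] = 1 if labels["state"] in HEALTHY_STATES else 0
    return health


# Label sets are shortened here for readability.
sample = """
kube_customresource_kyma_status_modules_state{module="keda",name="default",state="Ready"} 1
kube_customresource_kyma_status_modules_state{module="keda",name="default",state="Warning"} 0
kube_customresource_kyma_status_modules_state{module="telemetry",name="default",state="Ready"} 0
kube_customresource_kyma_status_modules_state{module="telemetry",name="default",state="Warning"} 1
"""
print(module_health(sample))
```

With such a 0/1 value, an alert only needs to check whether the value for a module is 0, without knowing the list of possible states.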
In the otel-collector community, the analogous receiver for KSM is the k8sclusterreceiver, which already has good coverage of metrics. However, there is no general solution yet to scrape CRD-specific metrics comparable to KSM.
The following extension for the MetricPipeline:

apiVersion: telemetry.kyma-project.io/v1alpha1
kind: MetricPipeline
metadata:
  name: sample
spec:
  input:
    kyma:
      enabled: true
      modules:
        - telemetry

Enabling the input should produce the following metrics:
The conceptual phase is finished and we will start working on the topic. Target is Q3/24.
One problem that turned up while putting the final pieces together is the RBAC settings. To access all modules dynamically, the manager will require "list" permissions on all resources (those originating from CRDs, not standard K8S types) with ClusterRole scope. Furthermore, it is currently not clear what the future of the module status is, where to retrieve the information on available modules, and where to find the status.
We agreed on the following points:
With that, the following items need to be done to finish that epic:
Rolled out with 1.25.0
Problem
Every module in Kyma must report a status in some way that users can inspect. A module can already expose custom metrics on its components and mark them as scrapable with the prometheus.io/scrape annotation, so that users have a chance to get insights. With that approach, modules can expose advanced metrics, but users need to know these metrics and be able to define thresholds in order to define alerts. For the less advanced scenario, it will be helpful to have metrics available that are harmonized across all modules and have a very simple threshold like "error" or "no error". That simple metric should be available even if a module does not yet care about metric exposure. Users need a way to collect these metrics so that they can have a unified dashboard and alert rules defined in their backend.
Criteria
Idea
Every module currently must reflect its current state in the status of the module CR by having a "state". It is also recommended to have more advanced "conditions" with reasons available in the status, as for example in telemetry:
The state of the module is also reflected in the Kyma CR itself, together with the overall Kyma state, as shown in the shortened example:
Reflecting that status information via custom module metrics would require additional effort and a harmonized approach (metric syntax and semantics) across all modules, which will be very hard to achieve.
Instead, we could offer a dedicated input to a MetricPipeline that provides metrics for the Kyma state itself and the state of all modules, based on the Kyma CR, plus metrics representing the individual module conditions. The metrics will be gauges with simple values of 0 or 1 for easy alerting. The relation to the module CRs in use is already available via the Kyma status.
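To make the "gauges with simple values of 0 or 1" idea more concrete, here is a minimal sketch of how the conditions of a module CR could be mapped to gauge samples. The function name `condition_gauges`, the label names, and the rule that any status other than "True" maps to 0 are assumptions for illustration only; the condition types and reasons are taken from the Telemetry samples in the KSM test output.

```python
def condition_gauges(kind, name, namespace, conditions):
    """Map status.conditions of a module CR to (labels, value) gauge samples.

    Assumption: a condition with status "True" becomes 1, anything else
    ("False", "Unknown", missing) becomes 0.
    """
    samples = []
    for cond in conditions:
        labels = {
            "kind": kind,
            "name": name,
            "namespace": namespace,
            "type": cond.get("type", ""),
            "reason": cond.get("reason", ""),
        }
        value = 1 if cond.get("status") == "True" else 0
        samples.append((labels, value))
    return samples


# Conditions as they could appear in the status of the Telemetry module CR.
telemetry_conditions = [
    {"type": "LogComponentsHealthy", "status": "True",
     "reason": "FluentBitDaemonSetReady"},
    {"type": "MetricComponentsHealthy", "status": "False",
     "reason": "MetricPipelineReferencedSecretMissing"},
]

for labels, value in condition_gauges("Telemetry", "default", "kyma-system",
                                      telemetry_conditions):
    print(labels["type"], labels["reason"], value)
```

Keeping the reason as a label means an alert can fire on the value 0 alone, while the reason is still available for display in a dashboard.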
An example pipeline can look like this:
Example metrics can look like this:
Items: