fix(operator): Use safe bearer token authentication to scrape operator metrics #12164
From a code POV, LGTM (I didn't get to test it on a cluster). This is mainly for use cases where we deploy the Loki Operator in non-`openshift-` namespaces, correct?
No this when we install on |
cb5ae36 to 4a065a4
For me this needed a change to work (see comment).
operator/config/overlays/openshift/prometheus_service_monitor_patch.yaml
operator/config/overlays/community-openshift/prometheus_service_monitor_patch.yaml
operator/config/overlays/openshift/manager_related_image_patch.yaml
[release-5.6] Backport PR grafana#12164 and grafana#12216
[release-5.8] Backport PR grafana#12164 and grafana#12216
[release-5.7] Backport PR grafana#12164 and grafana#12216
What this PR does / why we need it:
In OpenShift clusters, operator metrics can be scraped either via cluster-monitoring (the default case) or via user-workload-monitoring (managed clusters, where users track operator metrics themselves). Until now, the service monitor for scraping operator metrics was only compatible with cluster-monitoring, which allows using `bearerTokenFile` and `tlsConfig.caFile`. Neither is allowed when scraping with user-workload-monitoring: the Prometheus Operator in user-workload-monitoring is configured with `ArbitraryFSAccessThroughSMsConfig.Deny: true`, which in turn prevents the prometheus binary from accessing its own serviceaccount token to scrape metrics.
Which issue(s) this PR fixes:
Fixes LOG-5165, Replaces #11680
Special notes for your reviewer:
The changeset below introduces a set of new manifests to make an explicit distinction between the serviceaccount used by the Loki Operator itself and the one used by Prometheus to access metrics only, i.e.
- `loki-operator-controller-manager` is introduced to be used only by the Loki Operator `manager` container. This account is bound to the RBAC listed in each supported bundle `ClusterServiceVersion`.
- `loki-operator-controller-manager-metrics-reader` is introduced along with a secret that holds a long-lived API token and the service CA certificate. The token is referenced in the `ServiceMonitor` in `authorization.credentials`, replacing `bearerTokenFile`. The certificate is referenced in the `ServiceMonitor` in `tlsConfig.ca`, replacing `tlsConfig.caFile`. This serviceaccount is also used by Prometheus to scrape metrics from the Loki Operator `manager` container only through the `kube-rbac-proxy` sidecar. It is assigned in a `ClusterRoleBinding`, namely `loki-operator-controller-manager-read-metrics`, to get access to the non-resource URL `get` `/metrics`.
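For reviewers who want a picture of how these pieces fit together, the following is a minimal sketch and not the manifests from this changeset. Only the serviceaccount name `loki-operator-controller-manager-metrics-reader` and the `ClusterRoleBinding` name `loki-operator-controller-manager-read-metrics` come from the description above. The namespace, secret name, `ServiceMonitor` name, `ClusterRole` name, selector labels, port name, and the secret key `service-ca.crt` are assumptions made for illustration.

```yaml
# Long-lived API token secret for the metrics-reader serviceaccount.
# Secret name and namespace are hypothetical.
apiVersion: v1
kind: Secret
metadata:
  name: example-metrics-reader-token
  namespace: example-namespace
  annotations:
    kubernetes.io/service-account.name: loki-operator-controller-manager-metrics-reader
type: kubernetes.io/service-account-token
---
# ServiceMonitor that reads the token and CA from the secret instead of
# from files on the Prometheus pod (no bearerTokenFile, no tlsConfig.caFile).
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: example-operator-metrics-monitor   # hypothetical
  namespace: example-namespace             # hypothetical
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: example-operator   # hypothetical label
  endpoints:
    - port: metrics        # hypothetical port name on the metrics Service
      path: /metrics
      scheme: https
      authorization:
        type: Bearer
        credentials:
          name: example-metrics-reader-token
          key: token
      tlsConfig:
        ca:
          secret:
            name: example-metrics-reader-token
            key: service-ca.crt   # assumed key holding the service CA
        # hypothetical DNS name of the metrics Service
        serverName: example-metrics-service.example-namespace.svc
---
# Grants "get" on the non-resource URL /metrics. ClusterRole name is hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: example-metrics-reader
rules:
  - nonResourceURLs: ["/metrics"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: loki-operator-controller-manager-read-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: example-metrics-reader
subjects:
  - kind: ServiceAccount
    name: loki-operator-controller-manager-metrics-reader
    namespace: example-namespace
```

Because the credentials are resolved from a secret in the `ServiceMonitor`'s own namespace, this shape should also be accepted when the Prometheus Operator denies arbitrary filesystem access through service monitors.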
Checklist
- `CONTRIBUTING.md` guide (required)
- `CHANGELOG.md` updated
- `add-to-release-notes` label
- `docs/sources/setup/upgrade/_index.md`
- `production/helm/loki/Chart.yaml` and update `production/helm/loki/CHANGELOG.md` and `production/helm/loki/README.md`. Example PR
- `deprecated-config.yaml` and `deleted-config.yaml` files respectively in the `tools/deprecated-config-checker` directory. Example PR