Enhance Grafana dashboards for kube-apiserver #3502

Merged (2 commits) on Feb 15, 2021

Member

@rfranzke rfranzke commented Feb 5, 2021

Co-Authored-By: @timuthy

How to categorize this PR?

/area monitoring
/kind enhancement
/priority normal

What this PR does / why we need it:
This PR improves the Grafana dashboards for the kube-apiserver. A few screenshots:

[Screenshots: g1, g6, g7, g8, g9, g10, g2, g3, g4, g5]

Which issue(s) this PR fixes:
Similar to #2815

Release note:

The Grafana dashboards for the `kube-apiserver` have been enhanced and now provide more information for the various metrics.

@rfranzke rfranzke requested a review from a team as a code owner February 5, 2021 04:51
@gardener-robot gardener-robot added area/monitoring Monitoring (including availability monitoring and alerting) related kind/enhancement Enhancement, improvement, extension priority/normal size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Feb 5, 2021
@timebertt
Member

/assign

Member

@timebertt timebertt left a comment

Very nice job, @rfranzke and @timuthy!
I only have some nit comments.

@rfranzke
Member Author

/squash

Member

@istvanballok istvanballok left a comment

Kudos! :)

Contributor

@wyb1 wyb1 left a comment

/lgtm but let's iterate over the dashboard again once this is rolled out and we have more experience with the data :)

@rfranzke
Member Author

Yes, sure @wyb1, let's definitely do this! Appreciate it!

Member

@timebertt timebertt left a comment

/lgtm

@rfranzke rfranzke merged commit 7bf01fe into gardener:master Feb 15, 2021
@rfranzke rfranzke deleted the enh/kapi-dashboard branch February 15, 2021 07:36
ezeeyahoo pushed a commit to ezeeyahoo/gardener that referenced this pull request Feb 17, 2021
* Enhance Grafana dashboards for kube-apiserver

Co-Authored-By: Tim Usner <[email protected]>

* Incorporate PR review feedback of @timebertt  and @wyb1

Co-authored-by: Tim Usner <[email protected]>
@gardener-robot gardener-robot added priority/3 Priority (lower number equals higher priority) and removed priority/3 Priority (lower number equals higher priority) labels Mar 8, 2021
rickardsjp added a commit to rickardsjp/gardener that referenced this pull request Mar 13, 2023
Gardener currently supports Kubernetes v1.20 to v1.25. Therefore, even
though some metrics are deprecated, we can't remove them.

In Kubernetes v1.24 `apiserver_dropped_requests_total` is deprecated in
favor of `apiserver_request_total`. It is hidden in v1.25. This commit
allowlists the deprecated metric, because it is used in a Grafana
dashboard. It was dropped from the allowlist by mistake in gardener#3502.

In Kubernetes v1.23, `apiserver_registered_watchers` is deprecated in favor
of `apiserver_longrunning_requests`. It is hidden in v1.24.

Co-authored-by: Istvan Zoltan Ballok <[email protected]>
Co-authored-by: Jeremy Rickards <[email protected]>
istvanballok added a commit to rickardsjp/gardener that referenced this pull request Mar 14, 2023
This panel was not showing any data across Kubernetes versions for two
reasons:

- apiserver_dropped_requests_total was mistakenly removed from the allowlist, see gardener#3502.
  As a result, this panel showed no data in any Kubernetes version.

- apiserver_dropped_requests_total is deprecated in 1.24 and removed in 1.25. So
  in Kubernetes clusters >= 1.25, this panel would have been empty for this
  reason as well.

In a previous commit we allowlisted apiserver_dropped_requests_total for older
clusters, and in this commit we add a semantically similar query with the metric
apiserver_request_terminations_total.

Co-authored-by: Istvan Zoltan Ballok <[email protected]>
Co-authored-by: Jeremy Rickards <[email protected]>
rickardsjp added a commit to rickardsjp/gardener that referenced this pull request Mar 17, 2023
Gardener currently supports Kubernetes v1.20 to v1.25. Therefore, even
though some metrics are deprecated, we can't remove them.

In Kubernetes v1.24 `apiserver_dropped_requests_total` is deprecated in
favor of `apiserver_request_total`. It is hidden in v1.25. This commit
allowlists the deprecated metric, because it is used in a Grafana
dashboard. It was dropped from the allowlist by mistake in gardener#3502.
The new metric, `apiserver_request_total`, is already allowlisted.

In Kubernetes v1.23, `apiserver_registered_watchers` is deprecated in favor
of `apiserver_longrunning_requests`. It is hidden in v1.24. This commit
adds the new metric, `apiserver_longrunning_requests`, to the allowlist.

Co-authored-by: Istvan Zoltan Ballok <[email protected]>
Co-authored-by: Jeremy Rickards <[email protected]>
rickardsjp added a commit to rickardsjp/gardener that referenced this pull request Mar 17, 2023
This panel was not showing any data in different Kubernetes versions for 2
reasons:

- apiserver_dropped_requests_total was removed from the allowlist, see
  gardener#3502. This means that this panel was not showing any data in any
  Kubernetes version.

- apiserver_dropped_requests_total is deprecated in 1.24 and removed in
  1.25. So in Kubernetes clusters >= 1.25, this panel would have been empty
  for this reason as well.

The replacement metric, `apiserver_request_terminations_total`, is already
allowlisted and has been available since Kubernetes v1.17, so we can simply
use it for a semantically similar query.

Co-authored-by: Istvan Zoltan Ballok <[email protected]>
Co-authored-by: Jeremy Rickards <[email protected]>
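
The effect described in the commit message above (a metric that is missing from the allowlist leaves its panel empty) can be illustrated with a minimal Python sketch. This is not Gardener's actual allowlist code: the function names `apply_allowlist` and `panel_value`, the allowlist contents, and the sample values are all hypothetical and only model the idea of filtering scraped metrics by name.

```python
from typing import Dict, Optional, Set


def apply_allowlist(scraped: Dict[str, float], allowlist: Set[str]) -> Dict[str, float]:
    """Keep only the scraped metrics whose name is on the allowlist (hypothetical model)."""
    return {name: value for name, value in scraped.items() if name in allowlist}


def panel_value(stored: Dict[str, float], metric: str) -> Optional[float]:
    """Return the stored value for a panel's metric, or None if it was never stored."""
    return stored.get(metric)


# Hypothetical scrape result; the values are made up for illustration.
scraped = {
    "apiserver_request_total": 100.0,
    "apiserver_dropped_requests_total": 2.0,
    "apiserver_request_terminations_total": 3.0,
}

# Assumed allowlist that lacks the deprecated metric.
allowlist = {"apiserver_request_total", "apiserver_request_terminations_total"}

stored = apply_allowlist(scraped, allowlist)

# The old query finds nothing, so the panel stays empty.
old_panel = panel_value(stored, "apiserver_dropped_requests_total")
# The replacement metric is allowlisted, so the panel has data.
new_panel = panel_value(stored, "apiserver_request_terminations_total")
```

In this model the old panel is empty regardless of the Kubernetes version, because the metric is filtered out before it is ever stored, which matches the first reason given in the commit message.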
gardener-prow bot pushed a commit that referenced this pull request Mar 20, 2023
* Add API server metrics to allowlist

Gardener currently supports Kubernetes v1.20 to v1.25.

In Kubernetes v1.23, `apiserver_registered_watchers` is deprecated in favor
of `apiserver_longrunning_requests`. It is hidden in v1.24. This commit
adds the new metric, `apiserver_longrunning_requests`, to the allowlist.

Co-authored-by: Istvan Zoltan Ballok <[email protected]>
Co-authored-by: Jeremy Rickards <[email protected]>

* Adjust the promql query to support all the K8s versions

The promql expression:

    sum by (group, version, kind) (apiserver_registered_watchers)
  + on () group_left ()
    absent(apiserver_longrunning_requests) * 0
or
  sum by (group, version, resource) (apiserver_longrunning_requests)

returns the result of the newer metric `apiserver_longrunning_requests` (>=1.23)
if present, otherwise it will return the `apiserver_registered_watchers` (<1.23).

Note that the "total" query used the "count" aggregation which was semantically
not meaningful. This aspect is also fixed in this commit: the registered
watchers / long running requests need to be added up to get the total value.

Co-authored-by: Istvan Zoltan Ballok <[email protected]>
Co-authored-by: Jeremy Rickards <[email protected]>

* Fix the "Dropped Requests" panel of Kubernetes API Server Details

This panel was not showing any data in different Kubernetes versions for 2
reasons:

- apiserver_dropped_requests_total was removed from the allowlist, see
  #3502. This means that this panel was not showing any data in any
  Kubernetes version.

- apiserver_dropped_requests_total is deprecated in 1.24 and removed in
  1.25. So in Kubernetes clusters >= 1.25, this panel would have been empty
  for this reason as well.

The replacement metric, `apiserver_request_terminations_total`, is already
allowlisted and has been available since Kubernetes v1.17, so we can simply
use it for a semantically similar query.

Co-authored-by: Istvan Zoltan Ballok <[email protected]>
Co-authored-by: Jeremy Rickards <[email protected]>

---------

Co-authored-by: Istvan Zoltan Ballok <[email protected]>
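
The fallback behaviour of the promql expression in the "Adjust the promql query" commit above, and the difference between summing and counting series for the "total" query, can be sketched in plain Python. This is only a rough model under stated assumptions: series are represented as `(key, value)` tuples, the helper names `sum_by`, `watchers_or_longrunning`, and `total` are hypothetical, and the sample values are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Key = Tuple[str, str, str]  # (group, version, kind or resource)
Sample = Tuple[Key, float]


def sum_by(samples: Iterable[Sample]) -> Dict[Key, float]:
    """Rough analogue of `sum by (...) (metric)`: add up samples sharing a key."""
    totals: Dict[Key, float] = defaultdict(float)
    for key, value in samples:
        totals[key] += value
    return dict(totals)


def watchers_or_longrunning(
    registered: Iterable[Sample], longrunning: Iterable[Sample]
) -> Dict[Key, float]:
    """Model of `(sum(old) + absent(new) * 0) or sum(new)`."""
    new = sum_by(longrunning)
    # absent(new) yields a sample only when the new metric has no series at all,
    # so the left operand is empty as soon as the new metric exists.
    old = sum_by(registered) if not new else {}
    # `or` keeps the left side and adds right-side series whose key is not on the left.
    result = dict(old)
    for key, value in new.items():
        result.setdefault(key, value)
    return result


def total(series: Dict[Key, float]) -> float:
    """The total is the sum of the values, not the number of series."""
    return sum(series.values())


# Invented samples: two API server instances report watchers for Pods.
old_samples = [
    (("", "v1", "Pod"), 5.0),
    (("", "v1", "Pod"), 7.0),
    (("apps", "v1", "Deployment"), 3.0),
]

result = watchers_or_longrunning(old_samples, [])
watcher_total = total(result)  # 15.0: the values added up
series_count = len(result)     # 2: what a "count" aggregation would report instead
```

With these invented samples, counting reports 2 while the actual number of registered watchers is 15, which is the kind of mismatch the commit message refers to when it calls the "count" aggregation not meaningful.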