Add istio control plane dashboard and alert rule #478

Merged: 12 commits, Aug 28, 2024


@rgildein rgildein commented Jul 22, 2024

Dashboards

The source of this dashboard is [1].

I dropped two panels:

  • "Disk", due to missing container_fs_usage_bytes metrics
  • "Sidecar Injection" due to missing sidecar_injection_success_total and sidecar_injection_failure_total metrics

and made these changes:

  • changed app="istiod" to juju-charm="istio-pilot"
  • changed the interval from 1m to 5m
  • used irate instead of rate for the "Configuration Validation" panel
  • used the galley_validation_config_updates and galley_validation_config_update_error metrics in the "Configuration Validation" panel
  • grouped the pilot version by model and tag in the "Pilot Versions" panel, e.g. sum(istio_build{component="pilot"}) by (tag, juju_model)
  • replaced grouping by pod with grouping by instance, e.g. by (instance)
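
For illustration, the grouping changes in the list above give queries of roughly the following shape. The first expression is the one quoted in the list. The second is a hypothetical example, not copied from the dashboard: the choice of pilot_xds_pushes and the use of rate are assumptions made only to show grouping by instance over a 5m window.

```promql
# "Pilot Versions" panel: pilot builds grouped by tag and Juju model (as quoted in the list above).
sum(istio_build{component="pilot"}) by (tag, juju_model)

# Hypothetical panel query: grouped by instance instead of pod, using a 5m range.
# The metric choice is an assumption for illustration only.
sum(rate(pilot_xds_pushes[5m])) by (instance)
```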

blocked by: #477

To test this, you need cos and grafana-agent deployed, with grafana-agent integrated with istio-pilot via grafana-dashboard. After that, you should see the dashboard:
Screenshot from 2024-07-22 14-23-26

And the dashboard should look like this:

Screenshot from 2024-07-22 14-23-50

Alert rules

I found some alert rules in source [2]; however, after testing, only IstioPilotDuplicateEntry worked, because the other rules depend on metrics that are missing. Maybe we can enable more metrics, e.g. controller queue metrics via the ISTIO_ENABLE_CONTROLLER_QUEUE_METRICS env variable [3], but that is out of scope for this PR.
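
As a rough sketch, the expression behind an IstioPilotDuplicateEntry rule might look like the following. This is an assumption about the rule's shape, not a copy of what this PR adds: pilot_duplicate_envoy_clusters is a metric listed in the pilot-discovery metrics reference, while the 5m window and the zero threshold are illustrative.

```promql
# True when the per-second rate of pilot_duplicate_envoy_clusters over the last 5m,
# summed across all series, is above zero (window and threshold are assumed values).
sum(rate(pilot_duplicate_envoy_clusters[5m])) > 0
```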


@rgildein rgildein added the enhancement New feature or request label Jul 22, 2024
@rgildein rgildein self-assigned this Jul 22, 2024
@rgildein rgildein requested a review from a team as a code owner July 22, 2024 12:25
rgildein added 2 commits July 24, 2024 23:29
After reading some article I think that rate was used here for good reason.

@orfeas-k orfeas-k left a comment

Left a comment. Will test once #477 has been merged

@rgildein rgildein changed the title Add istio control plane dashboard Add istio control plane dashboard and alert rule Aug 6, 2024

@misohu misohu left a comment

I tested and it's working ... we should always provide the steps to test locally for the reviewer :)

@rgildein rgildein merged commit fbac2e3 into main Aug 28, 2024
19 checks passed
@rgildein rgildein deleted the chore/KF-6033/dashboard branch August 28, 2024 07:25
rgildein added a commit that referenced this pull request Aug 28, 2024
* Add istio control plane dashboard

The source of this dashboard is [1].

---
[1]: https://grafana.com/grafana/dashboards/7645-istio-control-plane-dashboard/

* Add alert rule from source [1] based on metrics [2]

Found some interesting alert rules in [1], however based on istio
doc [2], only the last one will work, so I added it here.

---
[1]: https://samber.github.io/awesome-prometheus-alerts/rules#istio
[2]: https://istio.io/latest/docs/reference/commands/pilot-discovery/#metrics