Add documentation for deploying Telemetry services on Kubernetes #51

Merged 4 commits on Dec 18, 2024

aucahuasi
Contributor

This PR enhances the documentation for deploying OpenTelemetry services for Graphistry in a Kubernetes environment. It includes detailed instructions for configuring Helm values, explains key environment variables, and provides an overview of the observability stack setup, making it easier to understand and implement the deployment process.

- **`telemetryEnv.grafana.GF_SERVER_ROOT_URL`** defines the root URL for accessing Grafana (e.g., `/grafana`).
- **`telemetryEnv.grafana.GF_SERVER_SERVE_FROM_SUB_PATH`** should be set to `true` if Grafana is accessed from a sub-path (e.g., `/grafana`) behind a reverse proxy or ingress.
7. **`telemetryEnv.dcgmExporter.DCGM_EXPORTER_CLOCK_EVENTS_COUNT_WINDOW_SIZE`**: This environment variable is used when `OTEL_CLOUD_MODE` is set to `true`, and the `dcgm-exporter` is deployed to export GPU metrics to Prometheus. It controls the frequency of GPU sampling to gather metrics. The value `1000` represents the window size for counting clock events on the GPU.
8. **`telemetryEnv.*.image`**: These values let you change the image versions of the observability tools.
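
As a rough illustration of the values described above, a Helm values fragment might look like the sketch below. The nesting is an assumption inferred from the dotted key names (`telemetryEnv.grafana.*`, `telemetryEnv.dcgmExporter.*`), the values are the examples given in the text, and the `image` entry is a placeholder rather than a real tag. Check the chart's own `values.yaml` for the actual structure.

```yaml
# Hypothetical values fragment: key nesting is inferred from the dotted names
# in the documentation, not copied from the chart itself.
telemetryEnv:
  grafana:
    # Root URL for accessing Grafana (example value from the docs).
    GF_SERVER_ROOT_URL: "/grafana"
    # Set to "true" when Grafana is served from a sub-path behind a
    # reverse proxy or ingress.
    GF_SERVER_SERVE_FROM_SUB_PATH: "true"
    # Placeholder only; substitute the image reference you actually use.
    # image: "<registry>/<grafana-image>:<tag>"
  dcgmExporter:
    # Only relevant when OTEL_CLOUD_MODE is true and dcgm-exporter is deployed.
    # Window size for counting GPU clock events (example value from the docs).
    DCGM_EXPORTER_CLOCK_EVENTS_COUNT_WINDOW_SIZE: "1000"
```

Environment variable values are quoted here so that Helm passes them through as strings; whether the chart requires that is also an assumption.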
Contributor

  1. I think we should add a section on how to reconfigure Caddy once the cluster is already up, as discussed in the Slack thread.

  2. I think the doc could be organized better, or broken into multiple md files, as it's getting pretty long. If we keep it in one file, maybe add a TOC at the top with links to the different sections.

Besides that, LGTM, but until we know how to configure a new Caddyfile, I was only able to get Grafana working out of the box. I will review again once that's figured out, but overall the docs look fine to me.

I have some feedback on the GKE instructions that I will send in Slack.

Contributor Author

Thanks @DataBoyTX! Done!

@aucahuasi aucahuasi requested a review from DataBoyTX December 18, 2024 01:42
Contributor

@DataBoyTX DataBoyTX left a comment

LGTM! Thank you Percy!

@aucahuasi aucahuasi merged commit d74c88a into master Dec 18, 2024
5 checks passed