Merge branch 'main' into spring-boot-default-protocol
svrnm authored Apr 15, 2024
2 parents bcf27f0 + c83c483 commit 5d671f8
Showing 19 changed files with 325 additions and 24 deletions.
2 changes: 1 addition & 1 deletion .gitmodules
@@ -7,7 +7,7 @@
  [submodule "content-modules/opentelemetry-specification"]
  path = content-modules/opentelemetry-specification
  url = https://github.com/open-telemetry/opentelemetry-specification.git
- spec-pin = v1.31.0
+ spec-pin = v1.32.0
  [submodule "content-modules/community"]
  path = content-modules/community
  url = https://github.com/open-telemetry/community
222 changes: 222 additions & 0 deletions content/en/blog/2024/scaling-collectors.md
@@ -0,0 +1,222 @@
---
title: Manage OpenTelemetry Collectors at scale with Ansible
linkTitle: Collectors at scale with Ansible
date: 2024-04-15
author: '[Ishan Jain](https://github.com/ishanjainn) (Grafana)'
cSpell:ignore: ansible associated Ishan ishanjainn Jain
---

You can scale the deployment of the
[OpenTelemetry Collector](/docs/collector/deployment/) across multiple Linux
hosts with [Ansible](https://www.ansible.com/), so that Collectors function both
as [gateways](/docs/collector/deployment/gateway/) and
[agents](/docs/collector/deployment/agent/) within your observability
architecture. Using the OpenTelemetry Collector in this dual capacity enables
robust collection and forwarding of metrics, traces, and logs to analysis and
visualization platforms.

We outline a strategy for deploying and managing OpenTelemetry Collector
instances at scale throughout your infrastructure using Ansible. In the
following example, we'll send metrics to Prometheus and use
[Grafana](https://grafana.com/) to visualize them.

## Prerequisites

Before we begin, make sure you meet the following requirements:

- Ansible installed on your control node (the machine you run Ansible from)
- SSH access to two or more Linux hosts
- A Prometheus endpoint that accepts remote write, to receive your metrics

## Install the Grafana Ansible collection

The
[OpenTelemetry Collector role](https://github.com/grafana/grafana-ansible-collection/tree/main/roles/opentelemetry_collector)
is provided through the
[Grafana Ansible collection](https://docs.ansible.com/ansible/latest/collections/grafana/grafana/)
as of release 4.0.

To install the Grafana Ansible collection, run this command:

```sh
ansible-galaxy collection install grafana.grafana
```

## Create an Ansible inventory file

Next, gather the IP addresses of your Linux hosts so that you can list them in
an inventory file.

1. Create an Ansible inventory file.

An Ansible inventory, which resides in a file named `inventory`, lists each
host IP on a separate line, like this (8 hosts shown):

```properties
10.0.0.1 # hostname = ubuntu-01
10.0.0.2 # hostname = ubuntu-02
10.0.0.3 # hostname = centos-01
10.0.0.4 # hostname = centos-02
10.0.0.5 # hostname = debian-01
10.0.0.6 # hostname = debian-02
10.0.0.7 # hostname = fedora-01
10.0.0.8 # hostname = fedora-02
```

2. Create an `ansible.cfg` file within the same directory as `inventory`, with
the following values:

```ini
[defaults]
# Path to the inventory file
inventory = inventory
# Path to the private SSH key
private_key_file = ~/.ssh/id_rsa
remote_user = root
```
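For a larger fleet, you may prefer to generate the inventory rather than type it
by hand. The following is a minimal Python sketch, assuming you already have a
mapping of hostnames to IP addresses. The `render_inventory` helper and the
sample hosts are hypothetical and are not part of Ansible or the Grafana
collection. It produces the same one-host-per-line layout shown in step 1.

```python
def render_inventory(hosts: dict[str, str]) -> str:
    """Return inventory text with one 'IP # hostname = NAME' line per host."""
    lines = [f"{ip} # hostname = {name}" for name, ip in hosts.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical hosts; replace them with your own addresses.
    hosts = {"ubuntu-01": "10.0.0.1", "ubuntu-02": "10.0.0.2"}
    print(render_inventory(hosts), end="")
```

Redirect the output to a file named `inventory` next to `ansible.cfg` to use it
with the rest of this walkthrough.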

## Use the OpenTelemetry Collector Ansible role

Next, define an Ansible playbook that applies the OpenTelemetry Collector role
to your hosts.

Create a file named `deploy-opentelemetry.yml` in the same directory as your
`ansible.cfg` and `inventory` files:

```yaml
- name: Install OpenTelemetry Collector
  hosts: all
  become: true

  tasks:
    - name: Install OpenTelemetry Collector
      ansible.builtin.include_role:
        name: grafana.grafana.opentelemetry_collector
      vars:
        otel_collector_receivers:
          hostmetrics:
            collection_interval: 60s
            scrapers:
              cpu: {}
              disk: {}
              load: {}
              filesystem: {}
              memory: {}
              network: {}
              paging: {}
              process:
                mute_process_name_error: true
                mute_process_exe_error: true
                mute_process_io_error: true
              processes: {}

        otel_collector_processors:
          batch:
          resourcedetection:
            detectors: [env, system]
            timeout: 2s
            system:
              hostname_sources: [os]
          transform/add_resource_attributes_as_metric_attributes:
            error_mode: ignore
            metric_statements:
              - context: datapoint
                statements:
                  - set(attributes["deployment.environment"],
                    resource.attributes["deployment.environment"])
                  - set(attributes["service.version"],
                    resource.attributes["service.version"])

        otel_collector_exporters:
          prometheusremotewrite:
            endpoint: https://<prometheus-url>/api/prom/push
            headers:
              Authorization: 'Basic <base64-encoded-username:password>'

        otel_collector_service:
          pipelines:
            metrics:
              receivers: [hostmetrics]
              processors:
                [
                  resourcedetection,
                  transform/add_resource_attributes_as_metric_attributes,
                  batch,
                ]
              exporters: [prometheusremotewrite]
```
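The `Authorization` header in the exporter expects the Base64 encoding of
`username:password`. As a rough illustration, the Python sketch below builds
that value with the standard library. The `basic_auth_header` helper and the
placeholder credentials are assumptions made for this example; substitute the
credentials for your own Prometheus endpoint, and avoid committing the resulting
value to version control.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build an HTTP Basic Authorization header value."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


if __name__ == "__main__":
    # Placeholder credentials for illustration only.
    print(basic_auth_header("user", "pass"))
```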

{{% alert title="Note" %}}
Adjust the configuration to match the specific telemetry you intend to collect
as well as where you plan to forward it to. This configuration snippet is a
basic example designed for collecting host metrics that get forwarded to
Prometheus.
{{% /alert %}}

The preceding configuration provisions the OpenTelemetry Collector to collect
host metrics from each Linux host.

## Run the Ansible playbook

Deploy the OpenTelemetry Collector across your hosts by running the following
command:

```sh
ansible-playbook deploy-opentelemetry.yml
```
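Before rolling out to every host, it can help to try the playbook against a
single host first. The Python sketch below only assembles the command line; the
`playbook_command` helper is hypothetical, while `--limit` and `--check` are
standard `ansible-playbook` options. Note that check mode is a dry run and not
every task in a role necessarily supports it.

```python
import shlex


def playbook_command(playbook, limit=None, check=False):
    """Assemble an ansible-playbook argument list."""
    cmd = ["ansible-playbook", playbook]
    if limit:
        # Restrict the run to a host or host pattern from the inventory.
        cmd += ["--limit", limit]
    if check:
        # Dry run: report what would change without applying it.
        cmd.append("--check")
    return cmd


if __name__ == "__main__":
    cmd = playbook_command("deploy-opentelemetry.yml", limit="10.0.0.1", check=True)
    print(shlex.join(cmd))
```

To execute the command from Python instead of printing it, you could pass the
list to `subprocess.run(cmd, check=True)`.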

## Check your metrics in the backend

After your OpenTelemetry Collectors start sending metrics to Prometheus, follow
these steps to visualize them in Grafana:

### Set up Grafana

1. **Install Docker**: Make sure Docker is installed on your system.

2. **Run Grafana Docker Container**: Start a Grafana server with the following
command, which fetches the latest Grafana image:

```sh
docker run -d -p 3000:3000 --name=grafana grafana/grafana
```

3. **Access Grafana**: Open <http://localhost:3000> in your web browser. The
default login username and password are both `admin`.

4. **Change the password** when prompted on first login. Pick a secure one!

For other installation methods and more detailed instructions, refer to the
[official Grafana documentation](https://grafana.com/docs/grafana/latest/#installing-grafana).

### Add Prometheus as a data source

1. In Grafana, navigate to **Connections** > **Data Sources**.
2. Click **Add data source** and select **Prometheus**.
3. In the settings, enter your Prometheus URL, for example,
`http://<your_prometheus_host>`, along with any other necessary details.
4. Select **Save & Test**.

### Explore your metrics

1. Go to the **Explore** page.
2. In the query editor, select your data source and enter the following query:

```PromQL
100 - (avg by (cpu) (irate(system_cpu_time{state="idle"}[5m])) * 100)
```

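This query calculates the percentage of CPU time not spent in the "idle" state
for each CPU core. `irate` derives a per-second rate from the two most recent
samples within the 5-minute window, and `avg by (cpu)` averages that rate across
all series that share the same `cpu` label.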

3. Explore other metrics and create dashboards to gain insights into your
system's performance.
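
To make the arithmetic behind the query concrete, here is a small Python sketch
that mirrors it for a single host. It is a simplification under stated
assumptions: the `non_idle_percent` helper and the sample values are made up, it
uses only the last two samples per CPU (as `irate` does), and it ignores counter
resets and the `avg by (cpu)` aggregation across multiple hosts.

```python
def non_idle_percent(samples):
    """Per-CPU non-idle percentage from the last two idle-time samples.

    `samples` maps a CPU label to a time-ordered list of
    (timestamp_seconds, idle_seconds_total) pairs.
    """
    result = {}
    for cpu, points in samples.items():
        (t0, v0), (t1, v1) = points[-2], points[-1]
        idle_rate = (v1 - v0) / (t1 - t0)  # idle seconds per second
        result[cpu] = 100 - idle_rate * 100
    return result


if __name__ == "__main__":
    # Made-up samples taken 60 seconds apart.
    samples = {
        "cpu0": [(0, 1000.0), (60, 1054.0)],
        "cpu1": [(0, 2000.0), (60, 2030.0)],
    }
    print(non_idle_percent(samples))
```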

This blog post illustrated how you can configure and deploy multiple
OpenTelemetry Collectors across various Linux hosts with the help of Ansible, as
well as visualize collected telemetry in Grafana. If you found this useful, see
the GitHub repository for the
[OpenTelemetry Collector role](https://github.com/grafana/grafana-ansible-collection/tree/main/roles/opentelemetry_collector)
for detailed configuration options. If you have questions, you can connect with
me using the contact details on my GitHub profile
[@ishanjainn](https://github.com/ishanjainn).
2 changes: 1 addition & 1 deletion content/en/docs/collector/_index.md
@@ -3,7 +3,7 @@ title: Collector
  description: Vendor-agnostic way to receive, process and export telemetry data.
  aliases: [collector/about]
  cascade:
-   vers: 0.97.0
+   vers: 0.98.0
  weight: 10
  ---
12 changes: 6 additions & 6 deletions content/en/docs/collector/deployment/agent.md
@@ -42,8 +42,8 @@ like so:
  receivers:
    otlp: # the OTLP receiver the app is sending traces to
      protocols:
-       grpc:
-         endpoint: 0.0.0.0:4317
+       http:
+         endpoint: 0.0.0.0:4318

  processors:
    batch:
@@ -66,8 +66,8 @@ service:
  receivers:
    otlp: # the OTLP receiver the app is sending metrics to
      protocols:
-       grpc:
-         endpoint: 0.0.0.0:4317
+       http:
+         endpoint: 0.0.0.0:4318

  processors:
    batch:
@@ -90,8 +90,8 @@ service:
  receivers:
    otlp: # the OTLP receiver the app is sending logs to
      protocols:
-       grpc:
-         endpoint: 0.0.0.0:4317
+       http:
+         endpoint: 0.0.0.0:4318

  processors:
    batch:
1 change: 1 addition & 0 deletions content/en/docs/demo/_index.md
@@ -52,6 +52,7 @@ found here:
  - [Quote Service](services/quote/)
  - [Recommendation Service](services/recommendation/)
  - [Shipping Service](services/shipping/)
+ - [Image Provider Service](services/imageprovider/)

  ## Scenarios
2 changes: 2 additions & 0 deletions content/en/docs/demo/architecture.md
@@ -22,6 +22,7 @@ emailservice(Email Service):::ruby
  frauddetectionservice(Fraud Detection Service):::kotlin
  frontend(Frontend):::typescript
  frontendproxy(Frontend Proxy <br/>&#40Envoy&#41):::cpp
+ imageprovider(Image Provider <br/>&#40nginx&#41):::cpp
  loadgenerator([Load Generator]):::python
  paymentservice(Payment Service):::javascript
  productcatalogservice(Product Catalog Service):::golang
@@ -33,6 +34,7 @@ queue[(queue<br/>&#40Kafka&#41)]
  Internet -->|HTTP| frontendproxy
  frontendproxy -->|HTTP| frontend
  loadgenerator -->|HTTP| frontendproxy
+ frontendproxy -->|HTTP| imageprovider
  queue -->|TCP| accountingservice
  queue -->|TCP| frauddetectionservice
4 changes: 2 additions & 2 deletions content/en/docs/demo/feature-flags.md
@@ -10,8 +10,8 @@ cSpell:ignore: flagd loadgenerator OLJCESPC7Z
  The demo provides several feature flags that you can use to simulate different
  scenarios. These flags are managed by [`flagd`](https://flagd.dev), a simple
  feature flag service that supports [OpenFeature](https://openfeature.dev). Flag
- values are stored in the `demo.flagd.json` file. To enable a flag, change the
- `defaultVariant` value in the config file for a given flag to "on".
+ values are stored in the `src/flagd/demo.flagd.json` file. To enable a flag,
+ change the `defaultVariant` value in the config file for a given flag to "on".

  | Feature Flag | Service(s) | Description |
  | ----------------------------------- | ---------------- | --------------------------------------------------------------------------------------------------------- |
10 changes: 10 additions & 0 deletions content/en/docs/demo/services/imageprovider.md
@@ -0,0 +1,10 @@
---
title: Image Provider Service
linkTitle: Image Provider
---

This service provides the images that are used in the frontend. The images are
statically hosted on an NGINX instance. The NGINX server is instrumented with
the [nginx-otel module](https://github.com/nginxinc/nginx-otel/tree/main).

[Image Provider service source](https://github.com/open-telemetry/opentelemetry-demo/blob/main/src/imageprovider/)
30 changes: 28 additions & 2 deletions content/en/docs/demo/services/quote.md
@@ -92,8 +92,34 @@ $span->addEvent('Quote processed, response sent back', [

  ## Metrics

- TBD
+ In this demo, metrics are emitted by the batch trace and logs processors. The
+ metrics describe the internal state of the processor, such as the number of
+ exported spans or logs, the queue limit, and queue usage.
+
+ You can enable metrics by setting the environment variable
+ `OTEL_PHP_INTERNAL_METRICS_ENABLED` to `true`.
+
+ A manual metric is also emitted, which counts the number of quotes generated,
+ including an attribute for the number of items.
+
+ A counter is created from the globally configured Meter Provider, and is
+ incremented each time a quote is generated:
+
+ ```php
+ static $counter;
+ $counter ??= Globals::meterProvider()
+     ->getMeter('quotes')
+     ->createCounter('quotes', 'quotes', 'number of quotes calculated');
+ $counter->add(1, ['number_of_items' => $numberOfItems]);
+ ```
+
+ Metrics accumulate and are exported periodically based on the value configured
+ in `OTEL_METRIC_EXPORT_INTERVAL`.

  ## Logs

- TBD
+ The quote service emits a log message after a quote is calculated. The Monolog
+ logging package is configured with a
+ [Logs Bridge](/docs/concepts/signals/logs/#log-appender--bridge) which converts
+ Monolog logs into the OpenTelemetry format. Logs sent to this logger will be
+ exported via the globally configured OpenTelemetry logger.
2 changes: 1 addition & 1 deletion content/en/docs/demo/telemetry-features/trace-coverage.md
@@ -12,7 +12,7 @@ aliases: [trace_service_features, trace-features, ../trace-features]
  | Checkout | Go |||| 🔕 | 🔕 | 🔕 ||
  | Currency | C++ | 🔕 |||| 🔕 | 🔕 | 🚧 |
  | Email | Ruby |||| 🔕 | 🔕 | 🔕 | 🚧 |
- | Fraud Detection | Kotlin || 🚧 | 🚧 | 🚧 | 🚧 | 🚧 | 🚧 |
+ | Fraud Detection | Kotlin || 🚧 | 🚧 | 🚧 | | 🚧 | 🚧 |
  | Frontend | JavaScript |||| 🔕 ||||
  | Payment | JavaScript |||| 🔕 | 🔕 |||
  | Product Catalog | Go || 🔕 || 🔕 | 🔕 | 🔕 | 🚧 |
2 changes: 1 addition & 1 deletion data/registry/instrumentation-dotnet-aspnetcore.yml
@@ -18,4 +18,4 @@ createdAt: 2022-11-07
  package:
    registry: nuget
    name: OpenTelemetry.Instrumentation.AspNetCore
-   version: 1.8.0
+   version: 1.8.1
2 changes: 1 addition & 1 deletion data/registry/instrumentation-dotnet-aws.yml
@@ -19,4 +19,4 @@ createdAt: 2022-10-28
  package:
    registry: nuget
    name: OpenTelemetry.Instrumentation.AWS
-   version: 1.1.0-beta.3
+   version: 1.1.0-beta.4
2 changes: 1 addition & 1 deletion data/registry/instrumentation-dotnet-http.yml
@@ -18,4 +18,4 @@ createdAt: 2022-11-07
  package:
    registry: nuget
    name: OpenTelemetry.Instrumentation.Http
-   version: 1.8.0
+   version: 1.8.1
@@ -15,4 +15,4 @@ createdAt: 2022-10-28
  package:
    registry: nuget
    name: OpenTelemetry.Instrumentation.AWS
-   version: 1.1.0-beta.3
+   version: 1.1.0-beta.4
2 changes: 1 addition & 1 deletion data/registry/instrumentation-dotnet-kafkaflow.yml
@@ -20,4 +20,4 @@ isFirstParty: true
  package:
    name: KafkaFlow.OpenTelemetry
    registry: nuget
-   version: 3.0.6
+   version: 3.0.7