feat: OpenTelemetry module integration #9062

Merged
merged 31 commits into from
Mar 22, 2023
edf88a1
OpenTelemetry module integration
esigo Sep 18, 2022
23f64ab
e2e test
esigo Sep 18, 2022
a9a3914
e2e test fix
esigo Sep 23, 2022
2526e49
Merge branch 'kubernetes:main' into otel2
esigo Sep 23, 2022
c8ec79a
default OpentelemetryConfig
esigo Sep 24, 2022
f03a819
e2e values
esigo Sep 24, 2022
840f005
mount otel module for otel test only
esigo Sep 25, 2022
3a9bca3
Merge branch 'kubernetes:main' into otel2
esigo Sep 28, 2022
dc319e7
propagate IS_CHROOT
esigo Sep 29, 2022
2d67d1c
propagate IS_CHROOT e2e test
esigo Sep 29, 2022
4e63238
code doc
esigo Sep 30, 2022
4eb38b8
comments
esigo Oct 1, 2022
5420d9a
golint
esigo Oct 5, 2022
ea3f215
opentelemetry doc
esigo Oct 11, 2022
ac09fcc
zipkin
esigo Oct 16, 2022
d9df16c
zipkin
esigo Oct 16, 2022
2ce5273
typo
esigo Nov 7, 2022
bce3358
Merge remote-tracking branch 'upstream/main' into otel2
esigo Dec 31, 2022
dbc5e4d
update e2e test OpenTelemetry value
esigo Dec 31, 2022
98551fa
Merge remote-tracking branch 'upstream/main' into otel2
esigo Jan 11, 2023
f6d0492
Merge remote-tracking branch 'upstream/main' into otel-9016-3
esigo Jan 15, 2023
4e71867
use opentelemetry value
esigo Jan 15, 2023
f5f3edb
Merge remote-tracking branch 'upstream/main' into otel2
esigo Feb 5, 2023
f3e609a
Merge remote-tracking branch 'upstream/main' into otel-9016-3
esigo Feb 5, 2023
dfb26e9
revert merge conflict
esigo Feb 5, 2023
bfdff0b
fix
esigo Feb 5, 2023
a1d9aab
format
esigo Feb 5, 2023
6b8f0c6
review comments
esigo Mar 21, 2023
8d4a2a1
Merge Documentation
esigo Mar 21, 2023
692bad9
Merge remote-tracking branch 'upstream/main' into otel2
esigo Mar 21, 2023
28372b0
clean
esigo Mar 21, 2023
Binary file added docs/images/otel-grafana-demo.png
Binary file added docs/images/otel-jaeger-demo.png
Binary file added docs/images/otel-zipkin-demo.png
1 change: 1 addition & 0 deletions docs/kubectl-plugin.md
@@ -208,6 +208,7 @@ modsecurity
modules
nginx.conf
opentracing.json
opentelemetry.toml
owasp-modsecurity-crs
template
```
20 changes: 20 additions & 0 deletions docs/user-guide/nginx-configuration/annotations.md
@@ -121,6 +121,8 @@ You can add these Kubernetes annotations to specific Ingress objects to customiz
|[nginx.ingress.kubernetes.io/enable-access-log](#enable-access-log)|"true" or "false"|
|[nginx.ingress.kubernetes.io/enable-opentracing](#enable-opentracing)|"true" or "false"|
|[nginx.ingress.kubernetes.io/opentracing-trust-incoming-span](#opentracing-trust-incoming-span)|"true" or "false"|
|[nginx.ingress.kubernetes.io/enable-opentelemetry](#enable-opentelemetry)|"true" or "false"|
|[nginx.ingress.kubernetes.io/opentelemetry-trust-incoming-span](#opentelemetry-trust-incoming-span)|"true" or "false"|
|[nginx.ingress.kubernetes.io/enable-influxdb](#influxdb)|"true" or "false"|
|[nginx.ingress.kubernetes.io/influxdb-measurement](#influxdb)|string|
|[nginx.ingress.kubernetes.io/influxdb-port](#influxdb)|string|
@@ -821,6 +823,24 @@ sometimes need to be overridden to enable it or disable it for a specific ingres
nginx.ingress.kubernetes.io/opentracing-trust-incoming-span: "true"
```

### Enable Opentelemetry

OpenTelemetry can be enabled or disabled globally through the ConfigMap, but this sometimes needs to be overridden
to enable or disable it for a specific ingress (e.g. to turn off telemetry of external health check endpoints).

```yaml
nginx.ingress.kubernetes.io/enable-opentelemetry: "true"
```
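
As a sketch of the health-check case mentioned above, the following Ingress turns telemetry off for a single path. The resource name, host, path, Service name and port are hypothetical placeholders, not taken from this project; only the annotation key comes from the documentation.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: healthz              # hypothetical name
  namespace: default         # hypothetical namespace
  annotations:
    # Overrides the global ConfigMap setting for this Ingress only
    nginx.ingress.kubernetes.io/enable-opentelemetry: "false"
spec:
  ingressClassName: nginx
  rules:
    - host: example.local    # hypothetical host
      http:
        paths:
          - path: /healthz   # hypothetical health check path
            pathType: Prefix
            backend:
              service:
                name: health-svc   # hypothetical Service
                port:
                  number: 8080
```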

### Opentelemetry Trust Incoming Span

The option to trust incoming trace spans can be enabled or disabled globally through the ConfigMap, but this
sometimes needs to be overridden to enable or disable it for a specific ingress (e.g. only enable it on a private endpoint).

```yaml
nginx.ingress.kubernetes.io/opentelemetry-trust-incoming-span: "true"
```

### X-Forwarded-Prefix Header
To add the non-standard `X-Forwarded-Prefix` header to the upstream request with a string value, the following annotation can be used:

53 changes: 53 additions & 0 deletions docs/user-guide/nginx-configuration/configmap.md
100755 → 100644
@@ -157,6 +157,19 @@ The following table shows a configuration option's name, type, and the default v
|[datadog-operation-name-override](#datadog-operation-name-override)|string|"nginx.handle"|
|[datadog-priority-sampling](#datadog-priority-sampling)|bool|"true"|
|[datadog-sample-rate](#datadog-sample-rate)|float|1.0|
|[enable-opentelemetry](#enable-opentelemetry)|bool|"false"|
|[opentelemetry-trust-incoming-span](#opentelemetry-trust-incoming-span)|bool|"true"|
|[opentelemetry-operation-name](#opentelemetry-operation-name)|string|""|
|[opentelemetry-config](#opentelemetry-config)|string|"/etc/nginx/opentelemetry.toml"|
|[otlp-collector-host](#otlp-collector-host)|string|""|
|[otlp-collector-port](#otlp-collector-port)|int|4317|
|[otel-max-queuesize](#otel-max-queuesize)|int||
|[otel-schedule-delay-millis](#otel-schedule-delay-millis)|int||
|[otel-max-export-batch-size](#otel-max-export-batch-size)|int||
|[otel-service-name](#otel-service-name)|string|"nginx"|
|[otel-sampler](#otel-sampler)|string|"AlwaysOff"|
|[otel-sampler-parent-based](#otel-sampler-parent-based)|bool|"false"|
|[otel-sampler-ratio](#otel-sampler-ratio)|float|0.01|
|[main-snippet](#main-snippet)|string|""|
|[http-snippet](#http-snippet)|string|""|
|[server-snippet](#server-snippet)|string|""|
@@ -1009,6 +1022,46 @@ If true disables client-side sampling (thus ignoring `sample_rate`) and enables
Specifies sample rate for any traces created.
This is effective only when `datadog-priority-sampling` is `false` _**default:**_ 1.0

## enable-opentelemetry

Enables the nginx OpenTelemetry extension. _**default:**_ is disabled

_References:_
[https://github.com/open-telemetry/opentelemetry-cpp-contrib](https://github.com/open-telemetry/opentelemetry-cpp-contrib/tree/main/instrumentation/nginx)

## opentelemetry-operation-name

Specifies a custom name for the server span. _**default:**_ is empty

For example, set to "HTTP $request_method $uri".

## otlp-collector-host

Specifies the host to use when uploading traces. It must be a valid URL.

## otlp-collector-port

Specifies the port to use when uploading traces. _**default:**_ 4317

## otel-service-name

Specifies the service name to use for any traces created. _**default:**_ nginx

## opentelemetry-trust-incoming-span

Enables or disables using spans from incoming requests as parent for created ones. _**default:**_ true

## otel-sampler-parent-based

Uses a parent-based sampler, which by default samples a request if its parent span is sampled. _**default:**_ false

## otel-sampler-ratio

Specifies sample rate for any traces created. _**default:**_ 0.01

## otel-sampler

Specifies the sampler to be used when sampling traces. The available samplers are: AlwaysOff, AlwaysOn, TraceIdRatioBased, remote. _**default:**_ AlwaysOff
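
As an illustration, the sampler options above might be combined in the controller ConfigMap as follows. This is a sketch, not a recommended setting: the 10% ratio is an arbitrary example value, and the ConfigMap name and namespace assume a default installation.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller   # assumed default ConfigMap name
  namespace: ingress-nginx         # assumed default namespace
data:
  enable-opentelemetry: "true"
  # Ratio-based sampling instead of the default AlwaysOff
  otel-sampler: "TraceIdRatioBased"
  # Example value only: sample roughly 10% of traces
  otel-sampler-ratio: "0.1"
  # Take the parent span's sampling decision into account when one is present
  otel-sampler-parent-based: "true"
```

ConfigMap values are strings, so the ratio and the boolean are quoted.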

## main-snippet

Adds custom configuration to the main section of the nginx configuration.
260 changes: 260 additions & 0 deletions docs/user-guide/third-party-addons/opentelemetry.md
@@ -0,0 +1,260 @@
# OpenTelemetry

Enables distributed telemetry for requests served by NGINX via the OpenTelemetry project.

Using the third-party module [opentelemetry-cpp-contrib/nginx](https://github.com/open-telemetry/opentelemetry-cpp-contrib/tree/main/instrumentation/nginx), the NGINX ingress controller can configure NGINX to enable [OpenTelemetry](http://opentelemetry.io) instrumentation.
By default this feature is disabled.

## Usage

To enable the instrumentation we must enable OpenTelemetry in the configuration ConfigMap:
```yaml
data:
  enable-opentelemetry: "true"
```

To enable or disable instrumentation for a single Ingress, use
the `enable-opentelemetry` annotation:
```yaml
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/enable-opentelemetry: "true"
```

We must also set the host to use when uploading traces:

```yaml
otlp-collector-host: "otel-coll-collector.otel.svc"
```
NOTE: Although the option is called `otlp-collector-host`, you can point it to any backend that receives OTLP over gRPC.

Next you will need to deploy a distributed telemetry system that supports OpenTelemetry.
[opentelemetry-collector](https://github.com/open-telemetry/opentelemetry-collector), [Jaeger](https://www.jaegertracing.io/),
[Tempo](https://github.com/grafana/tempo), and [zipkin](https://zipkin.io/)
have been tested.
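
For orientation, a minimal OpenTelemetry Collector configuration that accepts OTLP over gRPC on the default port might look like the sketch below. It is an assumption for illustration rather than the configuration used in the examples that follow; the `debug` exporter only prints received spans and is named `logging` in older collector releases.

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        # Matches the default otlp-collector-port (4317)
        endpoint: 0.0.0.0:4317
processors:
  batch: {}
exporters:
  # Prints spans to the collector log; replace with a real backend exporter
  debug: {}
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [debug]
```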

Other optional configuration options:
```yaml
# specifies the name to use for the server span
opentelemetry-operation-name

# sets whether or not to trust incoming telemetry spans
opentelemetry-trust-incoming-span

# specifies the port to use when uploading traces, Default: 4317
otlp-collector-port

# specifies the service name to use for any traces created, Default: nginx
otel-service-name

# The maximum queue size. After the size is reached data are dropped.
otel-max-queuesize

# The delay interval in milliseconds between two consecutive exports.
otel-schedule-delay-millis

# The maximum batch size of every export. It must be smaller than or equal to otel-max-queuesize.
otel-max-export-batch-size

# specifies sample rate for any traces created, Default: 0.01
otel-sampler-ratio

# specifies the sampler to be used when sampling traces.
# The available samplers are: AlwaysOn, AlwaysOff, TraceIdRatioBased, Default: AlwaysOff
otel-sampler

# Uses a parent-based sampler, which by default samples a request if its parent span is sampled, Default: false
otel-sampler-parent-based
```

Note that you can also set whether to trust incoming spans (global default is true) per-location using annotations like the following:
```yaml
kind: Ingress
metadata:
  annotations:
    nginx.ingress.kubernetes.io/opentelemetry-trust-incoming-span: "true"
```

## Examples

The following examples show how to deploy and test different distributed telemetry systems. These examples can be run using Docker Desktop.

The [esigo/nginx-example](https://github.com/esigo/nginx-example)
GitHub repository contains an example of a simple hello service:

```mermaid
graph TB
subgraph Browser
start["http://esigo.dev/hello/nginx"]
end

subgraph app
sa[service-a]
sb[service-b]
sa --> |name: nginx| sb
sb --> |hello nginx!| sa
end

subgraph otel
otc["Otel Collector"]
end

subgraph observability
tempo["Tempo"]
grafana["Grafana"]
backend["Jaeger"]
zipkin["Zipkin"]
end

subgraph ingress-nginx
ngx[nginx]
end

subgraph ngx[nginx]
ng[nginx]
om[OpenTelemetry module]
end

subgraph Node
app
otel
observability
ingress-nginx
om --> |otlp-gRPC| otc --> |jaeger| backend
otc --> |zipkin| zipkin
otc --> |otlp-gRPC| tempo --> grafana
sa --> |otlp-gRPC| otc
sb --> |otlp-gRPC| otc
start --> ng --> sa
end
```

To install the example and collectors run:

1. Enable Ingress addon with:

```yaml
opentelemetry:
  enabled: true
  image: registry.k8s.io/ingress-nginx/opentelemetry:v20230107-helm-chart-4.4.2-2-g96b3d2165@sha256:331b9bebd6acfcd2d3048abbdd86555f5be76b7e3d0b5af4300b04235c6056c9
  containerSecurityContext:
    allowPrivilegeEscalation: false
```

2. Enable OpenTelemetry and set the otlp-collector-host:

```bash
$ echo '
apiVersion: v1
kind: ConfigMap
data:
  enable-opentelemetry: "true"
  opentelemetry-config: "/etc/nginx/opentelemetry.toml"
  opentelemetry-operation-name: "HTTP $request_method $service_name $uri"
  opentelemetry-trust-incoming-span: "true"
  otlp-collector-host: "otel-coll-collector.otel.svc"
  otlp-collector-port: "4317"
  otel-max-queuesize: "2048"
  otel-schedule-delay-millis: "5000"
  otel-max-export-batch-size: "512"
  otel-service-name: "nginx-proxy" # Opentelemetry resource name
  otel-sampler: "AlwaysOn" # Also: AlwaysOff, TraceIdRatioBased
  otel-sampler-ratio: "1.0"
  otel-sampler-parent-based: "false"
metadata:
  name: ingress-nginx-controller
  namespace: ingress-nginx
' | kubectl replace -f -
```

3. Deploy otel-collector, grafana and Jaeger backend:

```bash
# add helm charts needed for grafana and OpenTelemetry collector
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
# deploy cert-manager needed for OpenTelemetry collector operator
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.9.1/cert-manager.yaml
# create observability namespace
kubectl apply -f https://raw.githubusercontent.com/esigo/nginx-example/main/observability/namespace.yaml
# install OpenTelemetry collector operator
helm upgrade --install otel-collector-operator -n otel --create-namespace open-telemetry/opentelemetry-operator
# deploy OpenTelemetry collector
kubectl apply -f https://raw.githubusercontent.com/esigo/nginx-example/main/observability/collector.yaml
# deploy Jaeger all-in-one
kubectl apply -f https://github.com/jaegertracing/jaeger-operator/releases/download/v1.37.0/jaeger-operator.yaml -n observability
kubectl apply -f https://raw.githubusercontent.com/esigo/nginx-example/main/observability/jaeger.yaml -n observability
# deploy zipkin
kubectl apply -f https://raw.githubusercontent.com/esigo/nginx-example/main/observability/zipkin.yaml -n observability
# deploy tempo and grafana
helm upgrade --install tempo grafana/tempo --create-namespace -n observability
helm upgrade -f https://raw.githubusercontent.com/esigo/nginx-example/main/observability/grafana/grafana-values.yaml --install grafana grafana/grafana --create-namespace -n observability
```

4. Build and deploy demo app:

```bash
# build images
make images

# deploy demo app:
make deploy-app
```

5. Make a few requests to the Service:

```bash
kubectl port-forward --namespace=ingress-nginx service/ingress-nginx-controller 8090:80
curl http://esigo.dev:8090/hello/nginx


StatusCode : 200
StatusDescription : OK
Content : {"v":"hello nginx!"}

RawContent : HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 21
Content-Type: text/plain; charset=utf-8
Date: Mon, 10 Oct 2022 17:43:33 GMT

{"v":"hello nginx!"}

Forms : {}
Headers : {[Connection, keep-alive], [Content-Length, 21], [Content-Type, text/plain; charset=utf-8], [Date,
Mon, 10 Oct 2022 17:43:33 GMT]}
Images : {}
InputFields : {}
Links : {}
ParsedHtml : System.__ComObject
RawContentLength : 21
```

6. View the Grafana UI:

```bash
kubectl port-forward --namespace=observability service/grafana 3000:80
```
In the Grafana interface we can see the details:
![grafana screenshot](../../images/otel-grafana-demo.png "grafana screenshot")

7. View the Jaeger UI:

```bash
kubectl port-forward --namespace=observability service/jaeger-all-in-one-query 16686:16686
```
In the Jaeger interface we can see the details:
![Jaeger screenshot](../../images/otel-jaeger-demo.png "Jaeger screenshot")

8. View the Zipkin UI:

```bash
kubectl port-forward --namespace=observability service/zipkin 9411:9411
```
In the Zipkin interface we can see the details:
![zipkin screenshot](../../images/otel-zipkin-demo.png "zipkin screenshot")
3 changes: 3 additions & 0 deletions internal/ingress/annotations/annotations.go
@@ -20,6 +20,7 @@ import (
"github.com/imdario/mergo"
"k8s.io/ingress-nginx/internal/ingress/annotations/canary"
"k8s.io/ingress-nginx/internal/ingress/annotations/modsecurity"
"k8s.io/ingress-nginx/internal/ingress/annotations/opentelemetry"
"k8s.io/ingress-nginx/internal/ingress/annotations/proxyssl"
"k8s.io/ingress-nginx/internal/ingress/annotations/sslcipher"
"k8s.io/ingress-nginx/internal/ingress/annotations/streamsnippet"
@@ -94,6 +95,7 @@ type Ingress struct {
EnableGlobalAuth bool
HTTP2PushPreload bool
Opentracing opentracing.Config
Opentelemetry opentelemetry.Config
Proxy proxy.Config
ProxySSL proxyssl.Config
RateLimit ratelimit.Config
@@ -145,6 +147,7 @@ func NewAnnotationExtractor(cfg resolver.Resolver) Extractor {
"EnableGlobalAuth": authreqglobal.NewParser(cfg),
"HTTP2PushPreload": http2pushpreload.NewParser(cfg),
"Opentracing": opentracing.NewParser(cfg),
"Opentelemetry": opentelemetry.NewParser(cfg),
"Proxy": proxy.NewParser(cfg),
"ProxySSL": proxyssl.NewParser(cfg),
"RateLimit": ratelimit.NewParser(cfg),