Add Collector security documentation #5209

Merged
merged 48 commits into from
Nov 2, 2024
Changes from 41 commits
Commits
631930d
Edit security index.md
tiffany76 Sep 17, 2024
2dc2e17
Copy content into config-best-practices.md
tiffany76 Sep 17, 2024
0d88c24
Merge branch 'main' into collector-security
tiffany76 Sep 19, 2024
135fc1d
Fix spelling issues
tiffany76 Sep 19, 2024
6063b77
Make linter fixes
tiffany76 Sep 19, 2024
2de00e2
Update links on index.md
tiffany76 Sep 19, 2024
7cbbcf6
Copy content into hosting-best-practices.md
tiffany76 Sep 19, 2024
b4987c3
Add TODO
tiffany76 Sep 19, 2024
79688ec
Edits to config and hosting best practices
tiffany76 Sep 19, 2024
be0d8ee
Make Prettier fix
tiffany76 Sep 19, 2024
3e7c3c9
Apply suggestions from Juraci
tiffany76 Sep 23, 2024
932a3d5
Merge branch 'main' into collector-security
tiffany76 Sep 23, 2024
a1427f2
Update index.md with PII no-no
tiffany76 Sep 23, 2024
cc131a8
Add headings for child pages to index.md
tiffany76 Sep 23, 2024
cbf67a9
Update config receivers and exporters section
tiffany76 Sep 24, 2024
02d0944
Make link and linter fixes
tiffany76 Sep 24, 2024
3cee030
Merge branch 'main' into collector-security
tiffany76 Sep 24, 2024
500d331
Update DOS safeguard section
tiffany76 Sep 24, 2024
0a05a95
Adjust info architecture of all pages
tiffany76 Sep 24, 2024
b53353a
Merge branch 'main' into collector-security
tiffany76 Sep 26, 2024
18283d5
Edit protocol section
tiffany76 Sep 27, 2024
bb8852f
Change headings and info arch
tiffany76 Sep 27, 2024
0532e02
Edit scrubbing sensitive data section
tiffany76 Sep 27, 2024
92eccee
Merge branch 'main' into collector-security
tiffany76 Oct 6, 2024
36d55c1
Create new top level section and rework content
tiffany76 Oct 6, 2024
c95ba0c
Create 'specific risk' section
tiffany76 Oct 6, 2024
dbfd1b6
Remove forwarding section
tiffany76 Oct 6, 2024
2a582e7
Merge branch 'main' into collector-security
tiffany76 Oct 14, 2024
8605abb
Edit resource utlization section
tiffany76 Oct 14, 2024
44ae838
Minor edits
tiffany76 Oct 15, 2024
2e606e2
Make cSpell happy
tiffany76 Oct 15, 2024
fb566d1
Apply suggestions from Pablo's review
tiffany76 Oct 18, 2024
6d9150f
Merge branch 'main' into collector-security
tiffany76 Oct 18, 2024
961ac47
Fix formatting and remove sections
tiffany76 Oct 18, 2024
af95741
Rewrite and move observers section
tiffany76 Oct 18, 2024
73f4c38
Rewrite hosting best practices
tiffany76 Oct 18, 2024
3f856e9
Apply suggestions from Pablo's second review
tiffany76 Oct 19, 2024
faf27e4
Merge branch 'main' into collector-security
tiffany76 Oct 28, 2024
f099457
Rewrite index page opening
tiffany76 Oct 28, 2024
3b86e29
Rewrite least privilege section
tiffany76 Oct 28, 2024
ea36f37
Results from /fix:refcache
opentelemetrybot Oct 28, 2024
a496603
Merge branch 'main' into collector-security
reyang Oct 29, 2024
3eceaba
Update content/en/docs/security/hosting-best-practices.md
tiffany76 Oct 29, 2024
cba2e7c
Merge branch 'main' into collector-security
tiffany76 Oct 31, 2024
5f9dff7
Merge branch 'main' into collector-security
tiffany76 Nov 1, 2024
ac9713f
Apply fixes from Patrice
tiffany76 Nov 2, 2024
74ce955
Results from /fix:format
opentelemetrybot Nov 2, 2024
c13f118
Merge branch 'main' into collector-security
tiffany76 Nov 2, 2024
38 changes: 38 additions & 0 deletions content/en/docs/security/_index.md
@@ -2,3 +2,41 @@
title: Security
weight: 970
---

In this section, learn how the OpenTelemetry project discloses vulnerabilities
and responds to incidents and discover what you can do to securely collect and
transmit your observability data.

## Common Vulnerabilities and Exposures (CVEs)

For CVEs across all repositories, see
[Common Vulnerabilities and Exposures](/docs/security/cve).

## Incident response

Learn how to report a vulnerability or find out how incident responses are
handled in
[Community incident response guidelines](/docs/security/security-response).

## Collector security

When setting up the OpenTelemetry Collector, consider implementing security best
practices in both your hosting infrastructure and your Collector configuration.
Running a secure Collector can help you:

- Protect telemetry that shouldn't but might contain sensitive information, such
as personally identifiable information (PII), application-specific data, or
network traffic patterns.
- Prevent data tampering that makes telemetry unreliable and disrupts incident
responses.
- Comply with data privacy and security regulations.
- Defend against denial of service (DoS) attacks.

See [Hosting best practices](/docs/security/hosting-best-practices) to learn how
to secure your Collector's infrastructure.

See [Configuration best practices](/docs/security/config-best-practices) to
learn how to securely configure your Collector.

For Collector component developers, see
[Security best practices](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md).
208 changes: 208 additions & 0 deletions content/en/docs/security/config-best-practices.md
@@ -0,0 +1,208 @@
---
title: Collector configuration best practices
linkTitle: Collector configuration
weight: 112
cSpell:ignore: exporterhelper
---

When configuring the OpenTelemetry (OTel) Collector, consider these best
practices to better secure your Collector instance.

## Create secure configurations

Follow these guidelines to secure your Collector's configuration and its
pipelines.

### Store your configuration securely

The Collector's configuration might contain sensitive information including:

- Authentication information such as API tokens.
- TLS certificates including private keys.

Store sensitive information securely, for example on an encrypted filesystem or
in a secret store. Because the Collector supports
[environment variable expansion](/docs/collector/configuration/#environment-variables),
you can use environment variables to supply both sensitive and non-sensitive
values instead of writing them directly into the configuration file.
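
As a minimal sketch of this approach, the following exporter fragment reads an
API token from the environment instead of hard-coding it. The endpoint, the
`api-key` header name, and the `API_TOKEN` variable name are illustrative
assumptions, not values the Collector requires:

```yaml
exporters:
  otlp:
    # Hypothetical backend endpoint; replace with your own.
    endpoint: backend.example.com:4317
    headers:
      # Hypothetical header name. The value is expanded from the
      # API_TOKEN environment variable when the Collector loads its config.
      api-key: ${env:API_TOKEN}
```

With this layout, the configuration file can be stored with less restrictive
access than the secret itself, which stays in whatever mechanism populates the
environment.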

### Use encryption and authentication

Your OTel Collector configuration should include encryption and authentication.

- For communication encryption, see
[Configuring certificates](/docs/collector/configuration/#setting-up-certificates).
- For authentication, use the OTel Collector's authentication mechanism, as
described in [Authentication](/docs/collector/configuration/#authentication).
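
The two recommendations above might be combined as in the following fragment,
which enables TLS and basic authentication on an OTLP gRPC receiver. This is a
sketch, assuming a Collector distribution that includes the `basicauth`
extension from the contrib repository. The certificate paths and the
environment variable names are hypothetical:

```yaml
extensions:
  basicauth/server:
    htpasswd:
      # Credentials are expanded from hypothetical environment variables.
      inline: |
        ${env:BASIC_AUTH_USERNAME}:${env:BASIC_AUTH_PASSWORD}

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: localhost:4317
        tls:
          # Hypothetical certificate and private key locations.
          cert_file: /etc/otelcol/tls/server.crt
          key_file: /etc/otelcol/tls/server.key
        auth:
          authenticator: basicauth/server

service:
  # The extension must also be enabled at the service level.
  extensions: [basicauth/server]
```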

### Minimize the number of components

We recommend limiting the set of components in your Collector configuration to
only those you need. Minimizing the number of components you use minimizes the
attack surface exposed.

- Use the
[OpenTelemetry Collector Builder (`ocb`)](/docs/collector/custom-collector) to
create a Collector distribution that uses only the components you need.
- Remove unused components from your configuration.
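
For illustration, a builder manifest for `ocb` that includes only an OTLP
receiver, a batch processor, and an OTLP exporter could look like the sketch
below. The distribution name, output path, and the `v0.112.0` module versions
are assumptions; use the versions that match your `ocb` release:

```yaml
dist:
  name: otelcol-minimal # hypothetical distribution name
  output_path: ./otelcol-minimal # hypothetical output directory

receivers:
  - gomod: go.opentelemetry.io/collector/receiver/otlpreceiver v0.112.0

processors:
  - gomod: go.opentelemetry.io/collector/processor/batchprocessor v0.112.0

exporters:
  - gomod: go.opentelemetry.io/collector/exporter/otlpexporter v0.112.0
```

Any component that is not listed in the manifest is not compiled into the
resulting binary, so it cannot be enabled by mistake in a configuration file.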

### Configure with care

Some components can increase the security risk of your Collector pipelines.

- Receivers, exporters, and other components should establish network
connections over a secure channel and, where possible, authenticate them.
- Receivers and exporters might expose buffer, queue, payload, and worker
settings using configuration parameters. If these settings are available, you
should proceed with caution before modifying the default configuration values.
Improperly setting these values might expose the OpenTelemetry Collector to
additional attack vectors.

## Set permissions carefully

Avoid running the Collector as a root user. Some components might require
special permissions, however. In those cases, follow the principle of least
privilege and make sure your components only have the access they need to do
their job.

### Observers

Observers are implemented as extensions. Extensions are a type of component that
adds capabilities on top of the primary functions of the Collector. Extensions
don't require direct access to telemetry and aren't part of pipelines, but they
can still pose security risks if they require special permissions.

An observer discovers networked endpoints such as a Kubernetes pod, Docker
container, or local listening port on behalf of the
[receiver creator](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/receivercreator/README.md).
In order to discover services, observers might require greater access. For
example, the `k8s_observer` requires
[role-based access control (RBAC) permissions](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/extension/observer/k8sobserver#setting-up-rbac-permissions)
in Kubernetes.
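
As a rough sketch of what least privilege can look like here, the following
`ClusterRole` grants read-only access to pods and nothing else. The role name
is hypothetical, and the exact resources an observer needs depend on which
endpoint types you configure it to watch, so treat the linked README as the
authoritative list:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: otel-collector-observer # hypothetical name
rules:
  # Read-only verbs only; the observer does not need to modify pods.
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch"]
```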

## Manage specific security risks

Configure your Collector to block these security threats.

### Protect against denial of service attacks

For server-like receivers and extensions, you can protect your Collector from
exposure to the public internet or to wider networks than necessary by binding
these components' endpoints to addresses that limit connections to authorized
users. Whenever possible, bind to a specific interface, such as a pod's IP or
`localhost`, instead of `0.0.0.0`. For more information, see
[CWE-1327: Binding to an Unrestricted IP Address](https://cwe.mitre.org/data/definitions/1327.html).

From Collector v0.110.0, the default host for all servers in Collector
components is `localhost`. For earlier versions of the Collector, change the
default endpoint from `0.0.0.0` to `localhost` in all components by enabling the
`component.UseLocalHostAsDefaultHost`
[feature gate](https://github.com/open-telemetry/opentelemetry-collector/tree/main/featuregate).

If `localhost` resolves to a different IP due to your DNS settings, then
explicitly use the loopback IP instead: `127.0.0.1` for IPv4 or `::1` for IPv6.
For example, here's an IPv4 configuration using a `gRPC` port:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 127.0.0.1:4317
```

In IPv6 setups, make sure your system supports both IPv4 and IPv6 loopback
addresses so the network functions properly in dual-stack environments and
applications, where both protocol versions are used.

If you are working in environments that have nonstandard networking setups, such
as Docker or Kubernetes, see the
[example configurations](https://github.com/open-telemetry/opentelemetry-collector/blob/main/docs/security-best-practices.md#safeguards-against-denial-of-service-attacks)
in our component developer documentation for ideas on how to bind your component
endpoints.

### Scrub sensitive data

[Processors](/docs/collector/configuration/#processors) are the Collector
components that sit between receivers and exporters. They are responsible for
processing telemetry before it's analyzed. You can use the OpenTelemetry
Collector's `redaction` processor to obfuscate or scrub sensitive data before
exporting it to a backend.

The
[`redaction` processor](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/redactionprocessor)
deletes span, log, and metric datapoint attributes that don't match a list of
allowed attributes. It also masks attribute values that match a blocked value
list. Attributes that aren't on the allowed list are removed before any value
checks are done.

For example, here is a configuration that masks values containing credit card
numbers:

```yaml
processors:
  redaction:
    allow_all_keys: false
    allowed_keys:
      - description
      - group
      - id
      - name
    ignored_keys:
      - safe_attribute
    blocked_values: # Regular expressions for blocking values of allowed span attributes
      - '4[0-9]{12}(?:[0-9]{3})?' # Visa credit card number
      - '(5[1-5][0-9]{14})' # MasterCard number
    summary: debug
```

See the
[documentation](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/processor/redactionprocessor)
to learn how to add the `redaction` processor to your Collector configuration.
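
As a brief sketch, a processor only takes effect once it is listed in a
pipeline. The fragment below assumes `otlp` receivers and exporters are defined
elsewhere in the same configuration:

```yaml
service:
  pipelines:
    traces:
      receivers: [otlp]
      # Redaction runs before export, so scrubbed values never leave the Collector.
      processors: [redaction]
      exporters: [otlp]
```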

### Safeguard resource utilization

After implementing safeguards for resource utilization in your
[hosting infrastructure](/docs/security/hosting-best-practices/), consider also
adding these safeguards to your OpenTelemetry Collector configuration.

Batching your telemetry and limiting the memory available to your Collector can
prevent out-of-memory errors and usage spikes. You can also handle traffic
spikes by adjusting queue sizes to manage memory usage while avoiding data loss.
For example, use the
[`exporterhelper`](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md)
to manage queue size for your `otlp` exporter:

```yaml
exporters:
  otlp:
    endpoint: <ENDPOINT>
    sending_queue:
      queue_size: 800
```
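
The batching and memory limits mentioned above could be sketched with the
`memory_limiter` and `batch` processors, as shown below. The numeric values are
illustrative assumptions and should be sized to your deployment. The pipeline
assumes `otlp` receivers and exporters are defined elsewhere:

```yaml
processors:
  memory_limiter:
    check_interval: 1s
    limit_mib: 4000 # illustrative hard limit
    spike_limit_mib: 800 # illustrative headroom for spikes
  batch:
    send_batch_size: 8192
    timeout: 200ms

service:
  pipelines:
    traces:
      receivers: [otlp]
      # memory_limiter goes first so it can refuse data before other
      # processors allocate memory for it.
      processors: [memory_limiter, batch]
      exporters: [otlp]
```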

Filtering unwanted telemetry is another way you can protect your Collector's
resources. Not only does filtering protect your Collector instance, but it also
reduces the load on your backend. You can use the
[`filter` processor](/docs/collector/transforming-telemetry/#basic-filtering) to
drop logs, metrics, and spans you don't need. For example, here's a
configuration that drops non-HTTP spans:

```yaml
processors:
  filter:
    error_mode: ignore
    traces:
      span:
        - attributes["http.request.method"] == nil
```

You can also configure your components with appropriate timeout and retry
limits. These limits should allow your Collector to handle failures without
accumulating too much data in memory. See the
[`exporterhelper` documentation](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md)
for more information.
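
For instance, a hedged sketch of timeout and retry limits on an `otlp` exporter
might look like this. The durations are illustrative assumptions, not
recommendations:

```yaml
exporters:
  otlp:
    endpoint: <ENDPOINT>
    timeout: 10s # give up on a single send attempt after 10 seconds
    retry_on_failure:
      enabled: true
      initial_interval: 5s # wait before the first retry
      max_interval: 30s # upper bound on the backoff between retries
      max_elapsed_time: 300s # stop retrying and drop the data after 5 minutes
```

Bounding `max_elapsed_time` keeps failed batches from being held in memory
indefinitely when a backend is unavailable.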

Finally, consider using compression with your exporters to reduce the send size
of your data and conserve network and CPU resources. By default, the
[`otlp` exporter](https://github.com/open-telemetry/opentelemetry-collector/tree/main/exporter/otlpexporter)
uses `gzip` compression.
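
As a small illustration, the compression setting can also be made explicit in
the exporter configuration:

```yaml
exporters:
  otlp:
    endpoint: <ENDPOINT>
    # Explicitly request gzip; setting this to `none` disables compression.
    compression: gzip
```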
2 changes: 1 addition & 1 deletion content/en/docs/security/cve.md
@@ -1,6 +1,6 @@
---
title: Common Vulnerabilities and Exposures
-weight: 102
+weight: 100
---

This is a list of reported Common Vulnerabilities and Exposures (CVEs) across
63 changes: 63 additions & 0 deletions content/en/docs/security/hosting-best-practices.md
@@ -0,0 +1,63 @@
---
title: Collector hosting best practices
linkTitle: Collector hosting
weight: 115
---

When setting up hosting for the OpenTelemetry (OTel) Collector, consider these
best practices to better secure your hosting instance.

## Store data securely

Your Collector configuration file might contain sensitive data, including
authentication tokens or TLS certificates. See the best practices for
[securing your configuration](/docs/security/config-best-practices/#create-secure-configurations).

If you are storing telemetry for processing, make sure to restrict access to
those directories to prevent tampering with raw data.

## Keep your secrets safe

Kubernetes [secrets](https://kubernetes.io/docs/concepts/configuration/secret/)
are objects that hold small amounts of confidential data, such as passwords,
tokens, or keys. Workloads use them to authenticate and authorize privileged
access. If you're using a Kubernetes deployment for your Collector, make sure to
follow these
[recommended practices](https://kubernetes.io/docs/concepts/security/secrets-good-practices/)
to improve security for your clusters.
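
As one possible sketch, a Collector container can receive a credential from a
Kubernetes secret through an environment variable rather than from a value
written into the manifest. The secret name, key, and variable name below are
hypothetical:

```yaml
# Fragment of a Pod spec for the Collector container.
containers:
  - name: otel-collector
    image: otel/opentelemetry-collector-contrib:<VERSION>
    env:
      - name: API_TOKEN # hypothetical variable name
        valueFrom:
          secretKeyRef:
            name: otel-collector-secrets # hypothetical Secret name
            key: api-token # hypothetical key within the Secret
```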

## Apply the principle of least privilege

The Collector should not require privileged access, except where the data it's
collecting is in a privileged location. For example, in a Kubernetes deployment,
system logs, application logs, and container runtime logs are often stored in a
node volume that requires special permission to access. If your Collector is
running as a daemonset on the node, make sure to grant only the specific volume
mount permissions it needs to access these logs and no more. You can configure
privileged access with role-based access control (RBAC). See
[RBAC good practices](https://kubernetes.io/docs/concepts/security/rbac-good-practices/)
for more information.
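
To illustrate the idea, the following DaemonSet fragment mounts only the pod
log directory, and mounts it read-only. This is a sketch under assumptions: the
`/var/log/pods` path is a common location but depends on your cluster, and the
volume name is hypothetical:

```yaml
# Fragment of a DaemonSet Pod template.
spec:
  containers:
    - name: otel-collector
      image: otel/opentelemetry-collector-contrib:<VERSION>
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
      volumeMounts:
        - name: varlogpods # hypothetical volume name
          mountPath: /var/log/pods
          readOnly: true # the Collector only needs to read logs
  volumes:
    - name: varlogpods
      hostPath:
        path: /var/log/pods
```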

## Control access to server-like components

Some Collector components, such as receivers and exporters, can function like
servers. To limit access to authorized users, you should:

- Enable authentication, for example by using the bearer token authentication
extension or the basic authentication extension.
- Restrict the IP addresses that your Collector listens on.
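
Both measures might be combined as in this sketch, which assumes a Collector
distribution that includes the `bearertokenauth` extension from the contrib
repository. The `OTLP_BEARER_TOKEN` variable name is hypothetical:

```yaml
extensions:
  bearertokenauth:
    scheme: Bearer
    # Token is expanded from a hypothetical environment variable.
    token: ${env:OTLP_BEARER_TOKEN}

receivers:
  otlp:
    protocols:
      http:
        # Listen on the loopback interface only, not on 0.0.0.0.
        endpoint: 127.0.0.1:4318
        auth:
          authenticator: bearertokenauth

service:
  extensions: [bearertokenauth]
```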

## Safeguard resource utilization

Use the Collector's own
[internal telemetry](/docs/collector/internal-telemetry/) to monitor its
performance. Collect metrics from the Collector about its CPU usage, memory
usage, and throughput, and set alerts for resource exhaustion.
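
As a minimal sketch, the verbosity of the Collector's own metrics and logs is
controlled under `service::telemetry`. The levels shown here are illustrative
choices:

```yaml
service:
  telemetry:
    metrics:
      level: detailed # emit the more granular set of internal metrics
    logs:
      level: info
```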

If resource limits are reached, consider horizontally
[scaling the Collector](/docs/collector/scaling/) by deploying multiple
instances in a load-balanced configuration. Scaling your Collector distributes
the resource demands and prevents bottlenecks.

Once you secure resource utilization in your deployment, make sure your
Collector instance also uses
[safeguards in its configuration](/docs/security/config-best-practices/#safeguard-resource-utilization).
2 changes: 1 addition & 1 deletion content/en/docs/security/security-response.md
@@ -1,5 +1,5 @@
---
-title: Community Incident Response Guidelines
+title: Community incident response guidelines
weight: 102
---

12 changes: 12 additions & 0 deletions static/refcache.json
@@ -8715,6 +8715,10 @@
"StatusCode": 206,
"LastSeen": "2024-08-09T10:44:30.895853-04:00"
},
"https://kubernetes.io/docs/concepts/configuration/secret/": {
"StatusCode": 206,
"LastSeen": "2024-10-17T20:41:39.419625448-07:00"
},
"https://kubernetes.io/docs/concepts/configuration/secret/#using-a-secret": {
"StatusCode": 206,
"LastSeen": "2024-04-25T00:01:05.630302-04:00"
@@ -8743,6 +8747,14 @@
"StatusCode": 206,
"LastSeen": "2024-08-09T10:45:22.265624-04:00"
},
"https://kubernetes.io/docs/concepts/security/rbac-good-practices/": {
"StatusCode": 206,
"LastSeen": "2024-10-28T23:48:13.923440181Z"
},
"https://kubernetes.io/docs/concepts/security/secrets-good-practices/": {
"StatusCode": 206,
"LastSeen": "2024-10-17T20:41:39.602462106-07:00"
},
"https://kubernetes.io/docs/concepts/services-networking/service/": {
"StatusCode": 206,
"LastSeen": "2024-01-30T06:06:10.439014-05:00"