Define instrumentation configuration API #4128

Merged
merged 8 commits into from
Aug 15, 2024
5 changes: 3 additions & 2 deletions spec-compliance-matrix.md
@@ -294,10 +294,11 @@ Note: Support for environment variables is optional.
| OTEL_EXPORTER_OTLP_METRICS_DEFAULT_HISTOGRAM_AGGREGATION | | + | | | | | | | | | |
| OTEL_EXPERIMENTAL_CONFIG_FILE | | | | | | | | | | | |

## File Configuration
## Declarative configuration

See [File Configuration](./specification/configuration/file-configuration.md)
See [declarative configuration](./specification/configuration/README.md#declarative-configuration)
for details.
Disclaimer: Declarative configuration is currently in Development status - work in progress.

| Feature | Go | Java | JS | Python | Ruby | Erlang | PHP | Rust | C++ | .NET | Swift |
|-------------------------------------------------------------------------------------------------------------------------|----|------|----|--------|------|--------|-----|------|-----|------|-------|
2 changes: 1 addition & 1 deletion specification/README.md
@@ -36,7 +36,7 @@ path_base_for_github_subdir:
- [Metrics](metrics/sdk.md)
- [Logs](logs/sdk.md)
- [Resource](resource/sdk.md)
- [Configuration](configuration/sdk-configuration.md)
- [Configuration](configuration/README.md)
- Data Specification
- [Semantic Conventions](overview.md#semantic-conventions)
- [Protocol](protocol/README.md)
57 changes: 56 additions & 1 deletion specification/configuration/README.md
@@ -4,4 +4,59 @@ path_base_for_github_subdir:
to: configuration/README.md
--->

# Configuration
# Overview

OpenTelemetry SDK components are highly configurable. This specification
outlines the mechanisms by which OpenTelemetry components can be configured. It
does not attempt to specify the details of what can be configured.

## Configuration Interfaces

### Programmatic

The SDK MUST provide a programmatic interface for all configuration.
This interface SHOULD be written in the language of the SDK itself.
All other configuration mechanisms SHOULD be built on top of this interface.

An example of this programmatic interface is accepting a well-defined
struct on an SDK builder class. From that, one could build a CLI that accepts a
file (YAML, JSON, TOML, ...) and then transforms it into that well-defined struct
consumable by the programmatic interface (
see [declarative configuration](#declarative-configuration)).
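
As a minimal sketch of that layering, the following Python shows a struct-like configuration object passed to a builder, plus a file-oriented helper that only produces that struct. The names `SdkConfig`, `SdkBuilder`, and `config_from_json`, and the fields they carry, are hypothetical and are not defined by this specification.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SdkConfig:
    """Hypothetical well-defined struct accepted by the programmatic interface."""
    service_name: str = "unknown_service"
    exporter_endpoint: str = "http://localhost:4318"
    resource_attributes: dict = field(default_factory=dict)


class SdkBuilder:
    """Hypothetical builder: every other mechanism ends up calling with_config()."""

    def __init__(self) -> None:
        self._config = SdkConfig()

    def with_config(self, config: SdkConfig) -> "SdkBuilder":
        self._config = config
        return self

    def build(self) -> dict:
        # A real SDK would construct providers and exporters here; a plain
        # dict stands in for the configured SDK in this sketch.
        return {
            "service_name": self._config.service_name,
            "endpoint": self._config.exporter_endpoint,
            "attributes": dict(self._config.resource_attributes),
        }


def config_from_json(text: str) -> SdkConfig:
    """File-based mechanism built on top of the struct, not beside it."""
    return SdkConfig(**json.loads(text))


sdk = SdkBuilder().with_config(
    config_from_json('{"service_name": "checkout", "exporter_endpoint": "http://collector:4318"}')
).build()
```

The file helper never touches SDK internals, so a YAML or TOML front end could be added the same way without changing the builder.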

### Environment variables

Environment variable configuration defines a set of language agnostic
environment variables for common configuration goals.

See [OpenTelemetry Environment Variable Specification](./sdk-environment-variables.md).

### Declarative configuration

Declarative configuration provides a mechanism for configuring OpenTelemetry
which is more expressive and full-featured than
the [environment variable](#environment-variables) based scheme, and language
agnostic in a way not possible with [programmatic configuration](#programmatic).
Notably, declarative configuration defines tooling allowing users to load
OpenTelemetry components according to a file-based representation of a
standardized configuration data model.

Declarative configuration consists of the following main components:

* [Data model](./data-model.md) defines data structures which allow users to
specify an intended configuration of OpenTelemetry SDK components and
instrumentation. The data model includes a file-based representation.
* [Instrumentation configuration API](./api.md) allows
instrumentation libraries to consume configuration by reading relevant
configuration options during initialization.
* [Configuration SDK](./sdk.md) defines SDK capabilities around file
configuration, including an In-Memory configuration model, support for
referencing custom extension plugin interfaces in configuration files, and
operations to parse configuration files and interpret the configuration data
model.

### Other Mechanisms

Additional configuration mechanisms SHOULD be provided in whatever
language/format/style is idiomatic for the language of the SDK. The
SDK can include as many configuration mechanisms as appropriate.
80 changes: 80 additions & 0 deletions specification/configuration/api.md
@@ -0,0 +1,80 @@
# Instrumentation Configuration API

**Status**: [Development](../document-status.md)

<!-- toc -->

- [Overview](#overview)
* [ConfigProvider](#configprovider)
+ [ConfigProvider operations](#configprovider-operations)
- [Get instrumentation config](#get-instrumentation-config)
* [ConfigProperties](#configproperties)

<!-- tocstop -->

## Overview

The instrumentation configuration API is part of
the [declarative configuration interface](./README.md#declarative-configuration).

The API allows [instrumentation libraries](../glossary.md#instrumentation-library)
to consume configuration by reading relevant configuration during
initialization. For example, an instrumentation library for an HTTP client can
read the set of HTTP request and response headers to capture.

It consists of the following main components:

* [ConfigProvider](#configprovider) is the entry point of the API.
* [ConfigProperties](#configproperties) is a programmatic representation of a
configuration mapping node.

### ConfigProvider

`ConfigProvider` provides access to configuration properties relevant to
instrumentation.

Instrumentation libraries access `ConfigProvider` during
initialization. `ConfigProvider` may be passed as an argument to the
instrumentation library, or the instrumentation library may access it from a
central place. Thus, the API SHOULD provide a way to access a global
default `ConfigProvider`, and set/register it.
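
One way the two access patterns could coexist is sketched below in Python. The function names `set_global_config_provider` and `get_global_config_provider`, and the `HttpClientInstrumentation` class, are assumptions made for illustration; the specification does not prescribe them.

```python
from typing import Optional


class ConfigProvider:
    """Hypothetical entry point; the method name is illustrative only."""

    def __init__(self, instrumentation: Optional[dict] = None) -> None:
        self._instrumentation = instrumentation

    def get_instrumentation_config(self) -> Optional[dict]:
        return self._instrumentation


# A default provider with no configuration, so a global always exists.
_global_provider = ConfigProvider()


def set_global_config_provider(provider: ConfigProvider) -> None:
    """Register the provider that instrumentation falls back to."""
    global _global_provider
    _global_provider = provider


def get_global_config_provider() -> ConfigProvider:
    return _global_provider


class HttpClientInstrumentation:
    """Hypothetical instrumentation library initialization."""

    def __init__(self, provider: Optional[ConfigProvider] = None) -> None:
        # Prefer an explicitly passed provider, else use the global default.
        self.provider = provider if provider is not None else get_global_config_provider()
```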

#### ConfigProvider operations

The `ConfigProvider` MUST provide the following functions:

* [Get instrumentation config](#get-instrumentation-config)

TODO: decide if additional operations are needed to improve API ergonomics

##### Get instrumentation config

Obtain configuration relevant to instrumentation libraries.

**Returns:** [`ConfigProperties`](#configproperties) representing
the [`.instrumentation`](https://github.com/open-telemetry/opentelemetry-configuration/blob/670901762dd5cce1eecee423b8660e69f71ef4be/examples/kitchen-sink.yaml#L438-L439)
configuration mapping node.

If the `.instrumentation` node is not set, get instrumentation config MUST
return nil, null, undefined or another language-specific idiomatic pattern
denoting empty.
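
The sketch below illustrates the "not set" case in Python, where `None` is the idiomatic empty value. A plain mapping stands in for `ConfigProperties`, and the keys `http_client` and `request_captured_headers` are hypothetical examples rather than names taken from the configuration schema.

```python
from typing import Any, Mapping, Optional


class ConfigProvider:
    """Hypothetical provider backed by an already parsed configuration model."""

    def __init__(self, config_model: Mapping[str, Any]) -> None:
        self._model = config_model

    def get_instrumentation_config(self) -> Optional[Mapping[str, Any]]:
        # Returns None when the .instrumentation node is not set.
        return self._model.get("instrumentation")


def captured_request_headers(provider: ConfigProvider) -> list:
    """Read the headers an HTTP client instrumentation should capture."""
    instrumentation = provider.get_instrumentation_config()
    if instrumentation is None:
        return []  # no configuration: keep the library's default behavior
    http_client = instrumentation.get("http_client", {})  # hypothetical key
    return list(http_client.get("request_captured_headers", []))  # hypothetical key
```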

### ConfigProperties

`ConfigProperties` is a programmatic representation of a configuration mapping
node (i.e. a YAML mapping node).

`ConfigProperties` MUST provide accessors for reading all properties from the
mapping node it represents, including:

* scalars (string, boolean, double precision floating point, 64-bit integer)
* mappings, which SHOULD be represented as `ConfigProperties`
* sequences of scalars
* sequences of mappings, which SHOULD be represented as `ConfigProperties`
* the set of property keys present

`ConfigProperties` SHOULD provide access to properties in a type safe manner,
based on what is idiomatic in the language.

`ConfigProperties` SHOULD allow a caller to determine if a property is present
with a null value, versus not set.
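
A possible shape for these accessors is sketched below in Python. All method names are hypothetical, and returning `None` on a type mismatch is only one design choice; an implementation could equally raise an error.

```python
from collections.abc import Mapping
from typing import Any, List, Optional, Set


class ConfigProperties:
    """Hypothetical read-only view over one configuration mapping node."""

    def __init__(self, node: Mapping) -> None:
        self._node = node

    def _typed(self, key: str, expected: type) -> Optional[Any]:
        value = self._node.get(key)
        # bool is a subclass of int in Python, so reject it for numeric reads.
        if isinstance(value, bool) and expected is not bool:
            return None
        return value if isinstance(value, expected) else None

    def get_string(self, key: str) -> Optional[str]:
        return self._typed(key, str)

    def get_boolean(self, key: str) -> Optional[bool]:
        return self._typed(key, bool)

    def get_int(self, key: str) -> Optional[int]:
        return self._typed(key, int)

    def get_double(self, key: str) -> Optional[float]:
        return self._typed(key, float)

    def get_structured(self, key: str) -> Optional["ConfigProperties"]:
        value = self._node.get(key)
        return ConfigProperties(value) if isinstance(value, Mapping) else None

    def get_scalar_list(self, key: str) -> Optional[List[Any]]:
        value = self._node.get(key)
        if isinstance(value, list) and not any(isinstance(v, (Mapping, list)) for v in value):
            return list(value)
        return None

    def get_structured_list(self, key: str) -> Optional[List["ConfigProperties"]]:
        value = self._node.get(key)
        if isinstance(value, list) and all(isinstance(v, Mapping) for v in value):
            return [ConfigProperties(v) for v in value]
        return None

    def property_keys(self) -> Set[str]:
        return set(self._node)

    def is_set(self, key: str) -> bool:
        # True even when the value is null, separating "null" from "not set".
        return key in self._node
```

With this shape, a caller distinguishes a property that is present with a null value (`is_set` is true and the getter returns `None`) from one that is absent (`is_set` is false).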
148 changes: 148 additions & 0 deletions specification/configuration/data-model.md
@@ -0,0 +1,148 @@
# Configuration Data Model

**Status**: [Development](../document-status.md)

<!-- toc -->

- [Overview](#overview)
* [Stability definition](#stability-definition)
* [File-based configuration model](#file-based-configuration-model)
+ [YAML file format](#yaml-file-format)
+ [Environment variable substitution](#environment-variable-substitution)

<!-- tocstop -->

## Overview

The OpenTelemetry configuration data model is part of
the [declarative configuration interface](./README.md#declarative-configuration).

The data model defines data structures which allow users to specify an intended
configuration of OpenTelemetry SDK components and instrumentation.

The data model is defined
in [opentelemetry-configuration](https://github.com/open-telemetry/opentelemetry-configuration)
using [JSON Schema](https://json-schema.org/).

The data model itself is an abstraction with multiple built-in representations:

* [File-based configuration model](#file-based-configuration-model)
* [SDK in-memory configuration model](./sdk.md#in-memory-configuration-model)

### Stability definition

TODO: define stability guarantees and backwards compatibility

### File-based configuration model

A configuration file is a serialized file-based representation of
the configuration data model.

Configuration files SHOULD use one of the following serialization formats:

* [YAML file format](#yaml-file-format)

#### YAML file format

[YAML](https://yaml.org/spec/1.2.2/) configuration files SHOULD follow YAML spec
revision >= 1.2.

YAML configuration files SHOULD be parsed using [v1.2 YAML core schema](https://yaml.org/spec/1.2.2/#103-core-schema).

YAML configuration files MUST use file extensions `.yaml` or `.yml`.
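
A loader might check the extension before attempting to parse, as in the Python sketch below. The helper name is hypothetical, and treating the comparison as case-sensitive is an assumption, since the text above does not address case.

```python
from pathlib import Path

# The two extensions named by the requirement above.
YAML_EXTENSIONS = {".yaml", ".yml"}


def is_yaml_config_file(path: str) -> bool:
    """Return True when the final extension of the path is .yaml or .yml."""
    return Path(path).suffix in YAML_EXTENSIONS
```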

#### Environment variable substitution

Configuration files support environment variable substitution for references
which match the following PCRE2 regular expression:

```regexp
\$\{(?:env:)?(?<ENV_NAME>[a-zA-Z_][a-zA-Z0-9_]*)(:-(?<DEFAULT_VALUE>[^\n]*))?\}
```

The `ENV_NAME` MUST start with an alphabetic or `_` character, and is followed
by 0 or more alphanumeric or `_` characters.

For example, `${API_KEY}` and `${env:API_KEY}` are valid, while `${1API_KEY}`
and `${API_$KEY}` are invalid.
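
The valid and invalid examples can be checked with a direct translation of the expression into Python. Python's `re` module is not PCRE2 and spells named groups `(?P<name>...)`, so the pattern below is an adaptation; the helper `env_name` is hypothetical.

```python
import re

# Same pattern as above, with (?<NAME>...) rewritten as (?P<NAME>...) for Python.
ENV_REFERENCE = re.compile(
    r"\$\{(?:env:)?(?P<ENV_NAME>[a-zA-Z_][a-zA-Z0-9_]*)(:-(?P<DEFAULT_VALUE>[^\n]*))?\}"
)


def env_name(reference: str):
    """Return ENV_NAME when the whole string is a valid reference, else None."""
    match = ENV_REFERENCE.fullmatch(reference)
    return match.group("ENV_NAME") if match else None
```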

Environment variable substitution MUST only apply to scalar values. Mapping keys
are not candidates for substitution.

The `DEFAULT_VALUE` is an optional fallback value which is substituted
if `ENV_NAME` is null, empty, or undefined. `DEFAULT_VALUE` consists of 0 or
more non line break characters (i.e. any character except `\n`). If a referenced
environment variable is not defined and does not have a `DEFAULT_VALUE`, it MUST
be replaced with an empty value.
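
The fallback rule can be sketched in Python as follows. The function `substitute` is hypothetical, the environment is passed in as a mapping so the example does not depend on the real process environment, and the pattern is the Python adaptation of the expression above.

```python
import re
from typing import Mapping

ENV_REFERENCE = re.compile(
    r"\$\{(?:env:)?(?P<ENV_NAME>[a-zA-Z_][a-zA-Z0-9_]*)(:-(?P<DEFAULT_VALUE>[^\n]*))?\}"
)


def substitute(value: str, env: Mapping[str, str]) -> str:
    """Replace every valid reference in a scalar value."""

    def replace(match: "re.Match") -> str:
        resolved = env.get(match.group("ENV_NAME"))
        if resolved:  # neither undefined (None) nor empty ("")
            return resolved
        # Fall back to DEFAULT_VALUE, or to an empty value when none is given.
        return match.group("DEFAULT_VALUE") or ""

    return ENV_REFERENCE.sub(replace, value)
```

Because `DEFAULT_VALUE` is matched greedily up to the last `}` on the line, a default followed by a second reference on the same line is captured as a single default; that follows from the expression itself rather than from this sketch.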

When parsing a configuration file that contains a reference not matching
the references regular expression but does match the following PCRE2
regular expression, the parser MUST return an empty result (no partial
results are allowed) and an error describing the parse failure to the user.

```regexp
\$\{(?<INVALID_IDENTIFIER>[^}]+)\}
```
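
One possible reading of this rule is sketched below in Python: every `${...}` candidate found by the second expression must also fully match the reference expression, otherwise parsing fails. Raising an exception is used here as the Python-idiomatic way to return no result together with an error; `InvalidReferenceError` and `check_references` are hypothetical names.

```python
import re

VALID_REFERENCE = re.compile(
    r"\$\{(?:env:)?(?P<ENV_NAME>[a-zA-Z_][a-zA-Z0-9_]*)(:-(?P<DEFAULT_VALUE>[^\n]*))?\}"
)
ANY_REFERENCE = re.compile(r"\$\{(?P<INVALID_IDENTIFIER>[^}]+)\}")


class InvalidReferenceError(ValueError):
    """Hypothetical error describing the parse failure to the user."""


def check_references(value: str) -> None:
    """Raise when the value contains a ${...} that is not a valid reference."""
    for candidate in ANY_REFERENCE.finditer(value):
        if VALID_REFERENCE.fullmatch(candidate.group(0)) is None:
            raise InvalidReferenceError(
                f"invalid environment variable reference: {candidate.group(0)}"
            )
```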

Node types MUST be interpreted after environment variable substitution takes
place. This ensures the environment string representation of boolean, integer,
or floating point fields can be properly converted to expected types.
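
To illustrate why the ordering matters, the Python sketch below resolves the type of an unquoted scalar only once it already holds the substituted text. It approximates the YAML 1.2 core schema tag resolution for null, bool, int, and float; the special floats (`.inf`, `.nan`) are left out for brevity, and the function name is hypothetical.

```python
import re

_INT = re.compile(r"[-+]?[0-9]+")
_OCT = re.compile(r"0o[0-7]+")
_HEX = re.compile(r"0x[0-9a-fA-F]+")
_FLOAT = re.compile(r"[-+]?(\.[0-9]+|[0-9]+(\.[0-9]*)?)([eE][-+]?[0-9]+)?")


def resolve_plain_scalar(text: str):
    """Resolve an unquoted, already substituted scalar to a typed value."""
    if text in ("", "~", "null", "Null", "NULL"):
        return None
    if text in ("true", "True", "TRUE"):
        return True
    if text in ("false", "False", "FALSE"):
        return False
    if _INT.fullmatch(text):
        return int(text)
    if _OCT.fullmatch(text):
        return int(text[2:], 8)
    if _HEX.fullmatch(text):
        return int(text[2:], 16)
    if _FLOAT.fullmatch(text):
        return float(text)
    return text  # anything else stays a string
```

If types were resolved before substitution, an unquoted `${BOOL_VALUE}` would already be fixed as a string and could never become a boolean.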

It MUST NOT be possible to inject YAML structures by environment variables. For
example, see references to `INVALID_MAP_VALUE` environment variable below.

It MUST NOT be possible to inject environment variable references by environment
variables; a substituted value is used verbatim and is not itself substituted.
For example, see references to `DO_NOT_REPLACE_ME` environment variable below.

For example, consider the following environment variables,
and [YAML](#yaml-file-format) configuration file:

```shell
export STRING_VALUE="value"
export BOOL_VALUE="true"
export INT_VALUE="1"
export FLOAT_VALUE="1.1"
export HEX_VALUE="0xdeadbeef" # A valid integer value written in hexadecimal
export INVALID_MAP_VALUE="value\nkey:value" # An invalid attempt to inject a map key into the YAML
export DO_NOT_REPLACE_ME="Never use this value" # An unused environment variable
export REPLACE_ME='${DO_NOT_REPLACE_ME}' # A valid replacement text, used verbatim, not replaced with "Never use this value"
```

```yaml
string_key: ${STRING_VALUE} # Valid reference to STRING_VALUE
env_string_key: ${env:STRING_VALUE} # Valid reference to STRING_VALUE
other_string_key: "${STRING_VALUE}" # Valid reference to STRING_VALUE inside double quotes
another_string_key: "${BOOL_VALUE}" # Valid reference to BOOL_VALUE inside double quotes
string_key_with_quoted_hex_value: "${HEX_VALUE}" # Valid reference to HEX_VALUE inside double quotes
yet_another_string_key: ${INVALID_MAP_VALUE} # Valid reference to INVALID_MAP_VALUE, but YAML structure from INVALID_MAP_VALUE MUST NOT be injected
bool_key: ${BOOL_VALUE} # Valid reference to BOOL_VALUE
int_key: ${INT_VALUE} # Valid reference to INT_VALUE
int_key_with_unquoted_hex_value: ${HEX_VALUE} # Valid reference to HEX_VALUE without quotes
float_key: ${FLOAT_VALUE} # Valid reference to FLOAT_VALUE
combo_string_key: foo ${STRING_VALUE} ${FLOAT_VALUE} # Valid reference to STRING_VALUE and FLOAT_VALUE
string_key_with_default: ${UNDEFINED_KEY:-fallback} # UNDEFINED_KEY is not defined but a default value is included
undefined_key: ${UNDEFINED_KEY} # Invalid reference, UNDEFINED_KEY is not defined and is replaced with ""
${STRING_VALUE}: value # Invalid reference, substitution is not valid in mapping keys and reference is ignored
recursive_key: ${REPLACE_ME} # Valid reference to REPLACE_ME
# invalid_identifier_key: ${STRING_VALUE:?error} # If uncommented, this is an invalid identifier, it would fail to parse
```

Review discussion on the `invalid_identifier_key` line:

> Member: @jack-berg why isn't `:?error` syntax supported? It's standard shell syntax, just like `:-default`. Was there a discussion?
>
> Bigger point: if we're using some convention, such as shell var syntax, it's very surprising when that convention is only partially supported. Context: https://github.com/open-telemetry/opentelemetry-collector/pull/10907/files#r1722093779

> Member Author: Let's open an issue to track. This syntax wasn't introduced in this PR, it was just moved around. At the time when env var substitution syntax was needed, we were solving a very targeted problem and used shell syntax prior art to avoid reinventing something. You make a good point that partially supporting shell syntax would be surprising to some users.

Environment variable substitution results in the following YAML:

```yaml
string_key: value # Interpreted as type string, tag URI tag:yaml.org,2002:str
env_string_key: value # Interpreted as type string, tag URI tag:yaml.org,2002:str
other_string_key: "value" # Interpreted as type string, tag URI tag:yaml.org,2002:str
another_string_key: "true" # Interpreted as type string, tag URI tag:yaml.org,2002:str
string_key_with_quoted_hex_value: "0xdeadbeef" # Interpreted as type string, tag URI tag:yaml.org,2002:str
yet_another_string_key: "value\nkey:value" # Interpreted as type string, tag URI tag:yaml.org,2002:str
bool_key: true # Interpreted as type bool, tag URI tag:yaml.org,2002:bool
int_key: 1 # Interpreted as type int, tag URI tag:yaml.org,2002:int
int_key_with_unquoted_hex_value: 3735928559 # Interpreted as type int, tag URI tag:yaml.org,2002:int
float_key: 1.1 # Interpreted as type float, tag URI tag:yaml.org,2002:float
combo_string_key: foo value 1.1 # Interpreted as type string, tag URI tag:yaml.org,2002:str
string_key_with_default: fallback # Interpreted as type string, tag URI tag:yaml.org,2002:str
undefined_key: # Interpreted as type null, tag URI tag:yaml.org,2002:null
${STRING_VALUE}: value # Interpreted as type string, tag URI tag:yaml.org,2002:str
recursive_key: ${DO_NOT_REPLACE_ME} # Interpreted as type string, tag URI tag:yaml.org,2002:str
```
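
Three of the rules in this example (mapping keys are never substituted, YAML structure cannot be injected, and substituted text is not substituted again) can be sketched in Python by applying substitution to an already parsed tree instead of to the raw file text. The functions below are hypothetical, and the sketch deliberately leaves out the type resolution of unquoted scalars that a real implementation would perform after substitution.

```python
import re
from typing import Any, Mapping

ENV_REFERENCE = re.compile(
    r"\$\{(?:env:)?(?P<ENV_NAME>[a-zA-Z_][a-zA-Z0-9_]*)(:-(?P<DEFAULT_VALUE>[^\n]*))?\}"
)


def substitute_scalar(value: str, env: Mapping[str, str]) -> str:
    # re.sub scans the input once; text produced by a replacement is not
    # rescanned, so a value such as '${DO_NOT_REPLACE_ME}' is kept verbatim.
    return ENV_REFERENCE.sub(
        lambda m: env.get(m.group("ENV_NAME")) or m.group("DEFAULT_VALUE") or "",
        value,
    )


def substitute_tree(node: Any, env: Mapping[str, str]) -> Any:
    if isinstance(node, dict):
        # Keys are copied untouched; only values are candidates.
        return {key: substitute_tree(child, env) for key, child in node.items()}
    if isinstance(node, list):
        return [substitute_tree(child, env) for child in node]
    if isinstance(node, str):
        # The result stays a single scalar, so no new keys can appear.
        return substitute_scalar(node, env)
    return node


example_env = {
    "STRING_VALUE": "value",
    "INVALID_MAP_VALUE": "value\nkey:value",
    "DO_NOT_REPLACE_ME": "Never use this value",
    "REPLACE_ME": "${DO_NOT_REPLACE_ME}",
}
example_tree = {
    "string_key": "${STRING_VALUE}",
    "yet_another_string_key": "${INVALID_MAP_VALUE}",
    "${STRING_VALUE}": "value",
    "recursive_key": "${REPLACE_ME}",
}
example_result = substitute_tree(example_tree, example_env)
```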