File Exporter

Status
Stability	alpha: traces, metrics, logs
Distributions	core, contrib, k8s
Issues
Code Owners	@atingchen

Writes telemetry data to files on disk.

Use the OTLP JSON File receiver to read the data back into the collector (as long as the data was exported using OTLP JSON format).

The exporter supports the following features:

- Writing pipeline data to a file.
- Rotation of telemetry files.
- Compressing the telemetry data before exporting.
- Writing into multiple files, where the file path is determined by a resource attribute.

Please note that there is no guarantee that exact field names will remain stable.

The official opentelemetry-collector-contrib container does not have a writable filesystem by default since it's built on the scratch layer. As such, you will need to create a writable directory for the path. You could do this by mounting a volume with flags such as rw or rwZ.

On Linux, given an otel-collector-config.yaml with a file exporter whose path is prefixed with /file-exporter:

# linux needs +x to list a directory.  You can use a+ instead of o+ for the mode if you want to ensure your user and group has access.
mkdir --mode o+rwx file-exporter
# z is an SELinux construct that is ignored on other systems
docker run -v "./file-exporter:/file-exporter:rwz" -v "./otel-collector-config.yaml:/etc/otelcol-contrib/config.yaml" otel/opentelemetry-collector-contrib:latest

Note this same syntax for volumes will work with docker-compose.

You could also modify the base image and build your own container with a writable directory, or change the user ID the container runs as if needed, but this is more involved.

Configuration options:

The following settings are required:

path [no default]: the file path to write telemetry data to.

The following settings are optional:

rotation: settings to rotate telemetry files.
- max_megabytes [default: 100]: the maximum size in megabytes of the telemetry file before it is rotated.
- max_days [no default (unlimited)]: the maximum number of days to retain telemetry files based on the timestamp encoded in their filename.
- max_backups [default: 100]: the maximum number of old telemetry files to retain.
- localtime [default: false (use UTC)]: whether the timestamps in backup files are formatted according to the host's local time.
format [default: json]: the data format of the encoded telemetry data. Can be set to proto instead.
encoding [default: none]: if specified, uses an encoding extension to encode telemetry data. Overrides format.
append [default: false]: whether to append to the file (true) or truncate it (false). When append: true is set, rotation and compression are currently not supported.
compression [no default]: the compression algorithm used when exporting telemetry data to file. Supported compression algorithms: zstd.
flush_interval [default: 1s]: the time.Duration interval between flushes. See time.ParseDuration for valid formats. NOTE: a value without a unit is interpreted as nanoseconds. If rotation is set, flush_interval is ignored and writes are not buffered.
group_by: enables writing to separate files based on a resource attribute.
- enabled [default: false]: enables group_by. When group_by is enabled, the rotation setting is ignored.
- resource_attribute [default: fileexporter.path_segment]: the name of the resource attribute that contains the path segment of the file to write to. The final path is the path config value with the * replaced by the value of this resource attribute.
- max_open_files [default: 100]: the maximum number of open file descriptors for the output files.

File Rotation

Telemetry data is exported to a single file by default. fileexporter enables file rotation only when the rotation: key is present in the config. When it is present, any rotation setting left unset takes its default value.

Telemetry is first written to a file that exactly matches the path setting. When the file size exceeds max_megabytes or age exceeds max_days, the file will be rotated.

When a file is rotated, it is renamed by inserting a timestamp of the current time into the name immediately before the file's extension (or at the end of the filename if there is no extension). A new telemetry file is then created at the original path.

For example, if your path is data.json and rotation is triggered, this file will be renamed to data-2022-09-14T05-02-14.173.json, and a new telemetry file will be created at data.json.
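
The renaming rule can be sketched in Python as follows. This is only an illustration: rotated_name is a hypothetical helper, not part of the exporter, and the timestamp layout (millisecond precision, dashes instead of colons) is inferred from the data-2022-09-14T05-02-14.173.json example above.

```python
import os
from datetime import datetime, timezone


def rotated_name(path: str, when: datetime) -> str:
    """Return the backup name for `path`, rotated at time `when`.

    Hypothetical helper: the timestamp goes immediately before the
    extension, or at the end of the name if there is no extension.
    """
    root, ext = os.path.splitext(path)
    # Layout assumed from the README example: 2022-09-14T05-02-14.173
    stamp = when.strftime("%Y-%m-%dT%H-%M-%S") + f".{when.microsecond // 1000:03d}"
    return f"{root}-{stamp}{ext}"


if __name__ == "__main__":
    ts = datetime(2022, 9, 14, 5, 2, 14, 173000, tzinfo=timezone.utc)
    print(rotated_name("data.json", ts))
```

A helper like this can be handy when a script needs to predict or match backup file names, for example to collect all rotated files that belong to one path.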

File Compression

Telemetry data is compressed according to the compression setting. fileexporter does not compress data by default.

Currently, fileexporter supports only the zstd compression algorithm. Support for more compression algorithms is planned for the future.

File Format

Telemetry data is encoded according to the format setting and then written to the file.

When format is json and no compression is set, telemetry data is written to the file in JSON format. Each line in the file is a JSON object.
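
Because each line is a standalone JSON object, such a file can be read back one line at a time. A minimal Python sketch, assuming an uncompressed file written with format: json (parse_json_lines is a hypothetical name, not part of the exporter):

```python
import json
from typing import Any, Dict, Iterable, Iterator


def parse_json_lines(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield one decoded JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if not line:
            # Skip blank lines, e.g. a trailing newline at end of file.
            continue
        yield json.loads(line)
```

An open text file is already an iterable of lines, so it can be passed directly: `with open("data.json", encoding="utf-8") as f: records = list(parse_json_lines(f))`.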

Otherwise, when using proto format or any kind of encoding, each encoded object is preceded by 4 bytes (an unsigned 32-bit integer) that give the number of bytes in the encoded object. To read the messages back in, read the size, then read that many bytes into a separate buffer, then parse from that buffer.
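
The read loop for this length-prefixed framing might look like the Python sketch below. Two assumptions are not stated in this document: the 4-byte size is treated as big-endian, and the file is assumed to be uncompressed. read_framed is a hypothetical helper, and decoding each payload (for example with the OTLP protobuf classes) is left out.

```python
import io
import struct
from typing import BinaryIO, Iterator


def read_framed(stream: BinaryIO) -> Iterator[bytes]:
    """Yield each encoded object from a size-prefixed binary stream."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of stream
        if len(header) < 4:
            raise ValueError("truncated length prefix")
        # ASSUMPTION: big-endian unsigned 32-bit size; verify against your files.
        (size,) = struct.unpack(">I", header)
        payload = stream.read(size)
        if len(payload) < size:
            raise ValueError("truncated message")
        yield payload


if __name__ == "__main__":
    # Build a tiny in-memory stream with two frames and read it back.
    demo = struct.pack(">I", 3) + b"abc" + struct.pack(">I", 2) + b"hi"
    for message in read_framed(io.BytesIO(demo)):
        print(len(message), message)
```

Raising on a short read, rather than silently stopping, makes a partially written last frame visible to the caller.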

Group by attribute

When group_by.resource_attribute is specified in the config, the exporter determines a file path for each telemetry record by substituting the value of the resource attribute into the path configuration value.

The final path is guaranteed to start with the prefix part of the path config value (the part before the * character). For example, if path is "/data/*.json" and the resource attribute value is "../etc/my_config", then the final path will be sanitized to "/data/etc/my_config.json".

The final path can contain path separators (/). The exporter will create missing directories recursively (similarly to mkdir -p).
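
The substitution and sanitization described above can be approximated in Python as follows. This is a sketch of the documented behavior, not the exporter's actual code: resolve_group_path is a hypothetical name, and the cleaning strategy (normalizing the segment as if it were rooted at /) is an assumption chosen to reproduce the "/data/etc/my_config.json" example.

```python
import posixpath


def resolve_group_path(path_pattern: str, attribute_value: str) -> str:
    """Replace the * in `path_pattern` with a sanitized attribute value."""
    if "*" not in path_pattern:
        # Choice made for this sketch: require the placeholder explicitly.
        raise ValueError("path pattern must contain a * placeholder")
    prefix, suffix = path_pattern.split("*", 1)
    # Normalize the segment as an absolute path so that ".." components
    # cannot climb above the root, then re-attach it under the prefix.
    cleaned = posixpath.normpath("/" + attribute_value.lstrip("/") + suffix)
    return prefix + cleaned.lstrip("/")


if __name__ == "__main__":
    print(resolve_group_path("/data/*.json", "../etc/my_config"))
```

Because the result is built by appending to the prefix, it always starts with the part of path before the *, which is the guarantee stated above. Creating the missing directories would then be a matter of calling os.makedirs on the parent directory with exist_ok=True.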

Grouping by attribute currently supports only a single resource attribute. If you would like to use multiple attributes, use the Transform processor to create a routing key. If you would like to use a non-resource-level attribute (e.g. Log/Metric/DataPoint), use the Group by Attributes processor first.

Example:

exporters:
  file/no_rotation:
    path: ./foo

  file/rotation_with_default_settings:
    path: ./foo
    rotation:

  file/rotation_with_custom_settings:
    path: ./foo
    rotation:
      max_megabytes: 10
      max_days: 3
      max_backups: 3
      localtime: true
    format: proto
    compression: zstd

  file/flush_every_5_seconds:
    path: ./foo
    flush_interval: 5s

Get Started in an existing cluster

We will follow the documentation to first install the operator in an existing cluster and then create an OpenTelemetry Collector (otelcol) instance, mounting an additional volume under /data under which the file exporter will write metrics.json:

kubectl apply -f - <<EOF
apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: fileexporter
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:

    exporters:
      debug:
      file:
        path: /data/metrics.json

    service:
      pipelines:
        metrics:
          receivers: [otlp]
          processors: []
          exporters: [debug,file]
  volumes:
    - name: file
      emptyDir: {}
  volumeMounts: 
    - name: file
      mountPath: /data
EOF
