This repository has been archived by the owner on Dec 6, 2024. It is now read-only.

Add Logs to OpenTelemetry vocabulary #91

Merged
49 changes: 49 additions & 0 deletions text/0091-logs-vocabulary.md
@@ -0,0 +1,49 @@
# Logs: Vocabulary

This document defines the vocabulary for logs to be used across the OpenTelemetry project.

## Motivation

We need a common language and common understanding of terms that we use to
avoid the chaos experienced by the builders of the Tower of Babel.

## Proposal

The OpenTelemetry specification already contains a [vocabulary](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/overview.md)
for Traces, Metrics and other relevant concepts.

This proposal is to add the following concepts to the vocabulary.

### Log Record

A recording of an event. Typically the record includes a timestamp indicating
when the event happened as well as other data that describes what happened,
where it happened, etc.

### Embedded Log

A log record embedded inside a [Span](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/api-tracing.md#span)
object, in the [Events](https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/api-tracing.md#add-events) list.

### Standalone Log

A log record that is not embedded inside a Span and is recorded elsewhere.
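
The distinction between embedded and standalone log records can be illustrated with a minimal sketch. The `LogRecord` and `Span` classes, their field names, and the `is_embedded` helper below are hypothetical and exist only for illustration; they are not OpenTelemetry API types.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List


@dataclass
class LogRecord:
    """A recording of an event (hypothetical shape, not an OpenTelemetry type)."""
    timestamp: datetime                       # when the event happened
    body: str                                 # what happened
    attributes: Dict[str, Any] = field(default_factory=dict)  # key/value pairs


@dataclass
class Span:
    """Toy stand-in for a tracing Span; only the Events list is modeled."""
    name: str
    events: List[LogRecord] = field(default_factory=list)


def is_embedded(record: LogRecord, spans: List[Span]) -> bool:
    """Return True if the record is held in some Span's Events list."""
    return any(record is event for span in spans for event in span.events)


now = datetime.now(timezone.utc)
span = Span(name="handle_request")

# Embedded Log: the record lives inside the Span's Events list.
embedded = LogRecord(now, "cache miss", {"key": "user:42"})
span.events.append(embedded)

# Standalone Log: the record is kept elsewhere, outside any Span.
standalone = LogRecord(now, "service started", {"port": 8080})
standalone_logs = [standalone]
```

In this sketch the same record type serves both cases; only where the record is stored differs.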

### Log Attributes

Key/value pairs contained in a Log Record.
Review comment (Member):

I'd also expect "log attributes" to be metadata about a whole log file/stream (e.g. producer, host or service information, time span, size) rather than attributes on a single log record/entry.

Review comment (@jaredcnance, Mar 25, 2020):

I think there are two classes of data which can be included in structured logs:

  • attributes: key/value pairs that are specific to a single log record and often have high cardinality values (e.g. timestamp)
  • metadata: key/value pairs that apply to one or more log records, generally constant for a given span (e.g. traceId, customerId, etc.) or may be constant for the lifetime of the process (e.g. host, containerId, etc.).

Review comment (Member, Author):

I agree with the distinction.

The constant part of "metadata" is stored in a Resource in OpenTelemetry. There is already a definition of a Resource.

The non-const parts are Attributes of a log record. The fact that they can be identical for a batch of log records is a data modeling topic and does not belong to the vocabulary (at least not yet, until we have the data model). We can have further clarification of types of attributes, but for now I would refrain from adding anything to the vocabulary. It is best to discuss this in a Data Model proposal (which I believe we inevitably need to have).

(BTW, "metadata" is not a well-defined term. IMO it is best to avoid using it in this discussion, since people have different understandings of what metadata means.)

Review comment (Member):

I agree with both of your comments.
My comment was about the term "log attributes" itself, which sounds like attributes that apply to the whole log file/stream (collection of logs) at once. These could therefore, for example, be expressed and stored in a Resource or preamble to a log file/stream rather than one single log record. I'd rather call the key/value pairs contained in a single log record "log record attributes" to avoid this ambiguity.


### Structured Logs

Logs that are recorded in a format with a well-defined structure that makes it
possible to differentiate between the elements of a Log Record (e.g. the Timestamp,
the Attributes, etc). For example, the [Syslog protocol, RFC5424](https://tools.ietf.org/html/rfc5424)
defines a `structured-data` format.
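
As an illustration of such a structure, the sketch below renders one RFC 5424 `structured-data` element of the form `[SD-ID name="value" ...]`. The helper names `escape_param_value` and `format_sd_element` are hypothetical. The sketch handles only the value escaping required by the RFC and does not validate SD-ID or parameter names.

```python
from typing import Mapping


def escape_param_value(value: str) -> str:
    """Escape a PARAM-VALUE: RFC 5424 requires a backslash before '"', '\\' and ']'."""
    # Escape backslashes first so the backslashes added below are not doubled.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("]", "\\]")


def format_sd_element(sd_id: str, params: Mapping[str, object]) -> str:
    """Render one SD-ELEMENT: '[' SD-ID, then space-separated name="value" pairs, then ']'."""
    pairs = "".join(
        f' {name}="{escape_param_value(str(value))}"' for name, value in params.items()
    )
    return f"[{sd_id}{pairs}]"


# Parameter names and values follow the example element given in RFC 5424.
element = format_sd_element(
    "exampleSDID@32473",
    {"iut": "3", "eventSource": "Application", "eventID": "1011"},
)
print(element)
```

Because each parameter is a named, delimited pair, a consumer can recover individual attributes without guessing at the layout of free-form text.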

### Flat File Logs

Logs recorded in text files, often one line per log record (although multiline
records are possible too). There is no common industry agreement on whether
logs written to text files in more structured formats (e.g. JSON files)
are considered Flat File Logs or not. Where this distinction is important, it is
recommended to call it out specifically.
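
To make the ambiguity concrete, the sketch below writes the same record once as a free-form text line and once as a JSON line. Both are one line per record in a text file. The field names (`timestamp`, `severity`, `message`, `port`) and the text layout are assumptions chosen for illustration, not a format defined by this proposal.

```python
import json

# A hypothetical log record; the field names are illustrative only.
record = {
    "timestamp": "2020-03-25T12:00:00Z",
    "severity": "INFO",
    "message": "service started",
    "port": 8080,
}

# Free-form text line: a reader must know the layout to split it back into fields.
plain_line = (
    f'{record["timestamp"]} {record["severity"]} '
    f'{record["message"]} port={record["port"]}'
)

# JSON line: still a single line of text, but the fields are self-describing.
json_line = json.dumps(record)

# The JSON line round-trips into the original fields without a custom parser.
parsed = json.loads(json_line)
```

Whether the JSON variant counts as a Flat File Log is exactly the question the definition leaves open.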