Logs Data Model

Status: Stable

This document defines a data model and semantic conventions for representing logs from various sources: application log files, machine-generated events, system logs, etc. Existing log formats can be unambiguously mapped to this data model. Reverse mapping from this data model is also possible to the extent that the target log format has equivalent capabilities.

The purpose of the data model is to have a common understanding of what a log record is, what data needs to be recorded, transferred, stored and interpreted by a logging system.

This proposal defines a data model for Standalone Logs.

Design Notes

Requirements

The Data Model was designed to satisfy the following requirements:

  • It should be possible to unambiguously map existing log formats to this Data Model. Translating log data from an arbitrary log format to this Data Model and back should ideally result in identical data.

  • Mappings of other log formats to this Data Model should be semantically meaningful. The Data Model must preserve the semantics of particular elements of existing log formats.

  • Translating log data from an arbitrary log format A to this Data Model and then translating from the Data Model to another log format B ideally must result in a meaningful translation of log data that is no worse than a reasonable direct translation from log format A to log format B.

  • It should be possible to efficiently represent the Data Model in concrete implementations that require the data to be stored or transmitted. We primarily care about 2 aspects of efficiency: CPU usage for serialization/deserialization and space requirements in serialized form. This is an indirect requirement that is affected by the specific representation of the Data Model rather than the Data Model itself, but is still useful to keep in mind.

The Data Model aims to successfully represent 3 sorts of logs and events:

  • System Formats. These are logs and events generated by the operating system and over which we have no control: we cannot change the format or affect what information is included (unless the data is generated by an application which we can modify). An example of a system format is Syslog.

  • Third-party Applications. These are logs and events generated by third-party applications. We may have some control over what information is included, e.g. by customizing the format. An example is the Apache log file.

  • First-party Applications. These are applications that we develop and we have some control over how the logs and events are generated and what information we include in the logs. We can likely modify the source code of the application if needed.

Definitions Used in this Document

In this document we refer to types any and map<string, any>, defined as follows.

Type any

A value of type any can be one of the following:

  • A scalar value: number, string or boolean,

  • A byte array,

  • An array (a list) of any values,

  • A map<string, any>.

Type map<string, any>

A value of type map<string, any> is a map of string keys to any values. The keys in the map are unique (duplicate keys are not allowed). The representation of the map is language-dependent.

Arbitrarily deep nesting of values in arrays and maps is allowed (this essentially makes it possible to represent the equivalent of a JSON object).
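As an illustration of the two definitions above, the following is a minimal Python sketch of a recursive check for the any type. The function name is_any_value and the choice of Python types (bytes for a byte array, list or tuple for an array, dict for a map) are assumptions made for this example and are not part of the data model.

```python
from typing import Any


def is_any_value(value: Any) -> bool:
    """Return True if value fits the `any` type as described above.

    Assumed Python representation (not mandated by the data model):
    scalars are bool/int/float/str, a byte array is bytes,
    an array is a list or tuple, and a map is a dict with str keys.
    """
    # Scalars and byte arrays are leaf values.
    if isinstance(value, (bool, int, float, str, bytes)):
        return True
    # Arrays may nest arbitrarily deep, so check every element recursively.
    if isinstance(value, (list, tuple)):
        return all(is_any_value(item) for item in value)
    # Maps require string keys; a dict already guarantees key uniqueness.
    if isinstance(value, dict):
        return all(
            isinstance(key, str) and is_any_value(item)
            for key, item in value.items()
        )
    return False
```

A nested value such as {"a": [1, "x", {"b": b"\x00"}]} passes this check, while a map with a non-string key does not.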

Field Kinds

This Data Model defines a logical model for a log record (irrespective of the physical format and encoding of the record). Each record contains 2 kinds of fields:

  • Named top-level fields of specific type and meaning.

  • Fields stored as map<string, any>, which can contain arbitrary values of different types. The keys and values for well-known fields follow semantic conventions for key names and possible values that allow all parties that work with the field to have the same interpretation of the data. See references to semantic conventions for Resource and Attributes fields and examples in Appendix A.

The reasons for having these 2 kinds of fields are:

  • Ability to efficiently represent named top-level fields, which are almost always present (e.g. when using encodings like Protocol Buffers where fields are enumerated but not named on the wire).

  • Ability to enforce types of named fields, which is very useful for compiled languages with type checks.

  • Flexibility to represent less frequent data as map<string, any>. This includes well-known data that has standardized semantics as well as arbitrary custom data that the application may want to include in the logs.

When designing this data model we used the following reasoning to decide when to use a top-level named field:

  • The field needs to be either mandatory for all records or be frequently present in well-known log and event formats (such as Timestamp) or is expected to be often present in log records in upcoming logging systems (such as TraceId).

  • The field’s semantics must be the same for all known log and event formats and can be mapped directly and unambiguously to this data model.

Both of the above conditions were required to give the field a place in the top-level structure of the record.

Log and Event Record Definition

Appendix A contains many examples that show how existing log formats map to the fields defined below. If there are questions about the meaning of a field, reviewing the examples may be helpful.

Here is the list of fields in a log record:

Field Name            Description
Timestamp             Time when the event occurred.
ObservedTimestamp     Time when the event was observed.
TraceId               Request trace id.
SpanId                Request span id.
TraceFlags            W3C trace flag.
SeverityText          The severity text (also known as log level).
SeverityNumber        Numerical value of the severity.
Body                  The body of the log record.
Resource              Describes the source of the log.
InstrumentationScope  Describes the scope that emitted the log.
Attributes            Additional information about the event.
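The field list above could be sketched in code roughly as follows. This is an illustrative Python dataclass only: the class name LogRecord, the snake_case attribute names, and the use of None to mean "field is missing" are assumptions of this example, not a representation required by the data model.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Tuple


@dataclass
class LogRecord:
    """Illustrative in-memory shape of a log record (names are hypothetical)."""

    # Named top-level fields; None means the optional field is missing.
    timestamp: Optional[int] = None           # nanoseconds since Unix epoch
    observed_timestamp: Optional[int] = None  # nanoseconds since Unix epoch
    trace_id: Optional[bytes] = None
    span_id: Optional[bytes] = None
    trace_flags: Optional[int] = None         # a single byte
    severity_text: Optional[str] = None
    severity_number: Optional[int] = None     # 1..24 when present
    body: Any = None                          # a value of type `any`
    # (Name, Version) tuple; Version may be absent.
    instrumentation_scope: Optional[Tuple[str, Optional[str]]] = None
    # Fields stored as map<string, any>; each record gets its own dict.
    resource: Dict[str, Any] = field(default_factory=dict)
    attributes: Dict[str, Any] = field(default_factory=dict)
```

For example, LogRecord(body="user logged in", severity_number=9) creates a record with only Body and SeverityNumber set.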

Below is the detailed description of each field.

Field: Timestamp

Type: Timestamp, uint64 nanoseconds since Unix epoch.

Description: Time when the event occurred, measured by the origin clock, i.e. the time at the source. This field is optional; it may be missing if the source timestamp is unknown.

Field: ObservedTimestamp

Type: Timestamp, uint64 nanoseconds since Unix epoch.

Description: Time when the event was observed by the collection system. For events that originate in OpenTelemetry (e.g. using OpenTelemetry Logging SDK) this timestamp is typically set at the generation time and is equal to Timestamp. For events originating externally and collected by OpenTelemetry (e.g. using Collector) this is the time when OpenTelemetry's code observed the event measured by the clock of the OpenTelemetry code. This field SHOULD be set once the event is observed by OpenTelemetry.

When converting OpenTelemetry log data to formats that support only one timestamp, or when OpenTelemetry log data is received by recipients that support only one timestamp internally, the following logic is recommended:

  • Use Timestamp if it is present, otherwise use ObservedTimestamp.
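The recommended fallback can be sketched as a small Python helper. The function name effective_timestamp and the use of None to represent a missing field are assumptions of this example.

```python
from typing import Optional


def effective_timestamp(
    timestamp: Optional[int], observed_timestamp: Optional[int]
) -> Optional[int]:
    """Pick the single timestamp to export, in nanoseconds since Unix epoch.

    Uses Timestamp if it is present, otherwise ObservedTimestamp.
    None stands for a missing field (an assumption of this sketch).
    """
    # Compare against None explicitly so that a present value is never
    # discarded just because it happens to be falsy.
    return timestamp if timestamp is not None else observed_timestamp
```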

Trace Context Fields

Field: TraceId

Type: byte sequence.

Description: Request trace id as defined in W3C Trace Context. Can be set for logs that are part of request processing and have an assigned trace id. This field is optional.

Field: SpanId

Type: byte sequence.

Description: Span id. Can be set for logs that are part of a particular processing span. If SpanId is present, TraceId SHOULD also be present. This field is optional.

Field: TraceFlags

Type: byte.

Description: Trace flag as defined in W3C Trace Context specification. At the time of writing the specification defines one flag - the SAMPLED flag. This field is optional.
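The trace context fields above can be checked with a sketch like the following. The helper names are hypothetical; the value 0x01 for the SAMPLED flag is the least significant bit of trace-flags in W3C Trace Context, and None is assumed here to represent a missing field.

```python
from typing import Optional

# W3C Trace Context: the sampled flag is the least significant bit.
SAMPLED_FLAG = 0x01


def is_sampled(trace_flags: Optional[int]) -> bool:
    """Return True if TraceFlags is present and the SAMPLED bit is set."""
    return trace_flags is not None and bool(trace_flags & SAMPLED_FLAG)


def span_id_without_trace_id(
    trace_id: Optional[bytes], span_id: Optional[bytes]
) -> bool:
    """Return True when SpanId is present but TraceId is missing.

    The data model says TraceId SHOULD be present in that case, so a
    caller might log a warning rather than reject the record.
    """
    return span_id is not None and trace_id is None
```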

Severity Fields

Field: SeverityText

Type: string.

Description: severity text (also known as log level). This is the original string representation of the severity as it is known at the source. If this field is missing and SeverityNumber is present then the short name that corresponds to the SeverityNumber may be used as a substitution. This field is optional.

Field: SeverityNumber

Type: number.

Description: numerical value of the severity, normalized to values described in this document. This field is optional.

SeverityNumber is an integer number. Smaller numerical values correspond to less severe events (such as debug events), larger numerical values correspond to more severe events (such as errors and critical events). The following table defines the meaning of SeverityNumber value:

SeverityNumber range  Range name  Meaning
1-4                   TRACE       A fine-grained debugging event. Typically disabled in default configurations.
5-8                   DEBUG       A debugging event.
9-12                  INFO        An informational event. Indicates that an event happened.
13-16                 WARN        A warning event. Not an error but is likely more important than an informational event.
17-20                 ERROR       An error event. Something went wrong.
21-24                 FATAL       A fatal error such as application or system crash.

Smaller numerical values in each range represent less important (less severe) events. Larger numerical values in each range represent more important (more severe) events. For example SeverityNumber=17 describes an error that is less critical than an error with SeverityNumber=20.
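Because each range spans exactly four consecutive values, the range name can be derived arithmetically. The following Python sketch shows one way to do this; the function name severity_range_name and the decision to raise ValueError for values outside 1..24 are assumptions of this example.

```python
RANGE_NAMES = ("TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL")


def severity_range_name(severity_number: int) -> str:
    """Return the range name for a SeverityNumber in 1..24."""
    if not 1 <= severity_number <= 24:
        # Rejecting out-of-range values is a choice made for this sketch.
        raise ValueError(f"SeverityNumber out of range: {severity_number}")
    # Values 1-4 map to index 0, 5-8 to index 1, and so on.
    return RANGE_NAMES[(severity_number - 1) // 4]
```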

Mapping of SeverityNumber

Mappings from existing logging systems and formats (or source format for short) must define how severity (or log level) of that particular format corresponds to SeverityNumber of this data model based on the meaning given for each range in the above table.

If the source format has more than one severity that matches a single range in this table then the severities of the source format must be assigned numerical values from that range according to how severe (important) the source severity is.

For example if the source format defines "Error" and "Critical" as error events and "Critical" is a more important and more severe situation then we can choose the following SeverityNumber values for the mapping: "Error"->17, "Critical"->18.

If the source format has only a single severity that matches the meaning of the range then it is recommended to assign that severity the smallest value of the range.

For example if the source format has an "Informational" log level and no other log levels with similar meaning then it is recommended to use SeverityNumber=9 for "Informational".
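The two mapping rules above can be sketched as follows. This example assigns consecutive values starting from the smallest value of the range, which matches the "Error"->17, "Critical"->18 example; other assignments within the range are equally valid as long as the ordering by severity is preserved. The function name and the limit of four levels per range enforced here are assumptions of this sketch.

```python
from typing import Dict, List

RANGE_START = {
    "TRACE": 1, "DEBUG": 5, "INFO": 9, "WARN": 13, "ERROR": 17, "FATAL": 21,
}


def assign_severity_numbers(
    range_name: str, source_levels: List[str]
) -> Dict[str, int]:
    """Map source severities that fall in one range to SeverityNumber values.

    source_levels must be ordered from least to most severe.
    """
    if len(source_levels) > 4:
        # A range holds only four values; how to handle more is not
        # covered by this sketch.
        raise ValueError("at most 4 source levels fit into one range")
    start = RANGE_START[range_name]
    # A single level gets the smallest value of the range; further levels
    # get increasing values in order of severity.
    return {level: start + i for i, level in enumerate(source_levels)}
```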

Source formats that do not define a concept of severity or log level MAY omit the SeverityNumber and SeverityText fields. Backends and UIs may represent log records with missing severity information distinctly, or may interpret log records with missing SeverityNumber and SeverityText fields as if the SeverityNumber were set to INFO (numeric value of 9).

Reverse Mapping

When performing a reverse mapping from SeverityNumber to a specific format and the SeverityNumber has no corresponding mapping entry for that format then it is recommended to choose the target severity that is in the same severity range and is closest numerically.

For example Zap has only one severity in the INFO range, called "Info". When doing reverse mapping all SeverityNumber values in INFO range (numeric 9-12) will be mapped to Zap’s "Info" level.
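A reverse mapping along these lines might look like the sketch below. The function name reverse_map, the shape of the target table (SeverityNumber to target level name), returning None when the target format has no level in the range, and preferring the lower value on a tie are all assumptions of this example; the data model does not specify those cases.

```python
from typing import Dict, Optional


def reverse_map(
    severity_number: int, target_levels: Dict[int, str]
) -> Optional[str]:
    """Pick the target format's level for a SeverityNumber.

    target_levels maps the SeverityNumber values that the target format
    defines to that format's level names.
    """
    # An exact mapping entry always wins.
    if severity_number in target_levels:
        return target_levels[severity_number]
    # Otherwise only consider entries in the same range (1-4, 5-8, ...).
    range_index = (severity_number - 1) // 4
    candidates = [n for n in target_levels if (n - 1) // 4 == range_index]
    if not candidates:
        return None  # not covered by the recommendation above
    # Closest numerically; ties resolve to the lower value (an assumption).
    closest = min(candidates, key=lambda n: (abs(n - severity_number), n))
    return target_levels[closest]
```

With an illustrative target table {9: "Info", 17: "Error", 18: "Critical"}, every value in 9-12 maps to "Info", and 20 maps to "Critical".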

Error Semantics

If SeverityNumber is present and has a value of ERROR (numeric 17) or higher then it is an indication that the log record represents an erroneous situation. It is up to the reader of this value to make a decision on how to use this fact (e.g. UIs may display such errors in a different color or have a feature to find all erroneous log records).

If the log record represents an erroneous event and the source format does not define a severity or log level concept then it is recommended to set SeverityNumber to ERROR (numeric 17) during the mapping process. If the log record represents a non-erroneous event the SeverityNumber field may be omitted or may be set to any numeric value less than ERROR (numeric 17). The recommended value in this case is INFO (numeric 9). See Appendix B for more mapping examples.
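The error semantics above reduce to a threshold check and a default choice, sketched here in Python. The function and constant names are hypothetical, and None stands for a missing SeverityNumber.

```python
from typing import Optional

SEVERITY_INFO = 9
SEVERITY_ERROR = 17


def is_erroneous(severity_number: Optional[int]) -> bool:
    """Return True if the record indicates an erroneous situation."""
    # A missing SeverityNumber carries no error indication.
    return severity_number is not None and severity_number >= SEVERITY_ERROR


def severity_for_format_without_levels(represents_error: bool) -> int:
    """Recommended SeverityNumber when the source format has no severity.

    Erroneous events get ERROR (17). For other events this sketch uses the
    recommended INFO (9), although omitting the field is also allowed.
    """
    return SEVERITY_ERROR if represents_error else SEVERITY_INFO
```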

Displaying Severity

The following table defines the recommended short name for each SeverityNumber value. The short name can be used for example for representing the SeverityNumber in the UI:

SeverityNumber Short Name
1 TRACE
2 TRACE2
3 TRACE3
4 TRACE4
5 DEBUG
6 DEBUG2
7 DEBUG3
8 DEBUG4
9 INFO
10 INFO2
11 INFO3
12 INFO4
13 WARN
14 WARN2
15 WARN3
16 WARN4
17 ERROR
18 ERROR2
19 ERROR3
20 ERROR4
21 FATAL
22 FATAL2
23 FATAL3
24 FATAL4

When an individual log record is displayed it is recommended to show both SeverityText and SeverityNumber values. A recommended combined string in this case begins with the short name followed by SeverityText in parentheses.

For example, an "Informational" Syslog record will be displayed as INFO (Informational). When the SeverityNumber is defined for a particular log record but the SeverityText is missing, it is recommended to show only the short name, e.g. INFO.
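The short names and the combined display string can be generated as in the following sketch. The function names are hypothetical, and the behavior when SeverityNumber is missing (falling back to SeverityText alone, or an empty string) is an assumption, since that case is not covered by the recommendation above.

```python
from typing import Optional

RANGE_NAMES = ("TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL")


def short_name(severity_number: int) -> str:
    """Return the short name for a SeverityNumber in 1..24."""
    if not 1 <= severity_number <= 24:
        raise ValueError(f"SeverityNumber out of range: {severity_number}")
    base = RANGE_NAMES[(severity_number - 1) // 4]
    position = (severity_number - 1) % 4 + 1  # 1..4 within the range
    # The first value of a range has no numeric suffix (e.g. INFO, INFO2).
    return base if position == 1 else f"{base}{position}"


def display_severity(
    severity_number: Optional[int], severity_text: Optional[str]
) -> str:
    """Build the string shown for a single log record."""
    if severity_number is None:
        # Not specified above; this sketch shows SeverityText alone.
        return severity_text or ""
    name = short_name(severity_number)
    # Short name first, then the original SeverityText in parentheses.
    return f"{name} ({severity_text})" if severity_text else name
```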

When drop down lists (or other UI elements that are intended to represent the possible set of values) are used for representing the severity it is preferable to display the short name in such UI elements.

For example, a dropdown list that allows filtering log records by severity is likely to be more usable if it contains the short names of SeverityNumber (and thus has a bounded number of elements) than if it lists all distinct SeverityText values known to the system (which can be a large number of elements, often differing only in capitalization or abbreviation, e.g. "Info" vs "Information").

Comparing Severity

In contexts where severity participates in less-than / greater-than comparisons, the SeverityNumber field should be used. SeverityNumber can be compared to another SeverityNumber or to numbers in the 1..24 range (or to the corresponding short names).
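A threshold comparison that accepts either a number or a short name could be sketched as follows. The names to_severity_number and at_least, and the case-insensitive handling of short names, are assumptions of this example.

```python
from typing import Union

_RANGE_NAMES = ("TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL")

# Lookup from short name to number: TRACE=1, TRACE2=2, ..., FATAL4=24.
SHORT_NAME_TO_NUMBER = {
    (name if step == 0 else f"{name}{step + 1}"): 4 * index + step + 1
    for index, name in enumerate(_RANGE_NAMES)
    for step in range(4)
}


def to_severity_number(value: Union[int, str]) -> int:
    """Normalize a number in 1..24 or a short name to a SeverityNumber."""
    if isinstance(value, str):
        # Raises KeyError for names that are not short names.
        return SHORT_NAME_TO_NUMBER[value.upper()]
    if not 1 <= value <= 24:
        raise ValueError(f"SeverityNumber out of range: {value}")
    return value


def at_least(severity_number: int, threshold: Union[int, str]) -> bool:
    """Return True if the record's severity is at or above the threshold."""
    return severity_number >= to_severity_number(threshold)
```

For instance, at_least(18, "ERROR") is true, while at_least(9, "WARN") is false.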

Field: Body

Type: any.

Description: A value containing the body of the log record (see the description of the any type above). It can be, for example, a human-readable string message (including multi-line) describing the event in free form, or it can be structured data composed of arrays and maps of other values. First-party Applications SHOULD use a string message. However, a structured body SHOULD be used to preserve the semantics of structured logs emitted by Third-party Applications. The body can vary for each occurrence of the event coming from the same source. This field is optional.

Field: Resource

Type: map<string, any>.

Description: Describes the source of the log, aka resource. Multiple occurrences of events coming from the same event source can happen across time and they all have the same value of Resource. Can contain for example information about the application that emits the record or about the infrastructure where the application runs. Data formats that represent this data model may be designed in a manner that allows the Resource field to be recorded only once per batch of log records that come from the same source. SHOULD follow OpenTelemetry semantic conventions for Resources. This field is optional.

Field: InstrumentationScope

Type: (Name,Version) tuple of strings.

Description: the instrumentation scope. Multiple occurrences of events coming from the same scope can happen across time and they all have the same value of InstrumentationScope. For log sources which define a logger name (e.g. Java Logger Name) the Logger Name SHOULD be recorded as the Instrumentation Scope name.

Version is optional. Name SHOULD be specified if version is specified, otherwise Name is optional.

Field: Attributes

Type: map<string, any>.

Description: Additional information about the specific event occurrence. Unlike the Resource field, which is fixed for a particular source, Attributes can vary for each occurrence of the event coming from the same source. Can contain information about the request context (other than TraceId/SpanId). SHOULD follow OpenTelemetry semantic conventions for Log Attributes or semantic conventions for Span Attributes. This field is optional.

Errors and Exceptions

Additional information about errors and/or exceptions that are associated with a log record MAY be included in the structured data in the Attributes section of the record. If included, they MUST follow the OpenTelemetry semantic conventions for exception-related attributes.
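As an illustration, the sketch below builds such attributes from a Python exception using the exception.type, exception.message and exception.stacktrace keys of the OpenTelemetry exception semantic conventions. The helper name exception_attributes is hypothetical, and the way the type name is formed (dropping the module for built-in exceptions) is a choice made for this example.

```python
import traceback
from typing import Any, Dict


def exception_attributes(exc: BaseException) -> Dict[str, Any]:
    """Build Attributes entries describing an exception."""
    exc_type = type(exc)
    module = exc_type.__module__
    qualname = exc_type.__qualname__
    # Use the qualified name, except for built-ins such as ValueError.
    if module in (None, "builtins"):
        type_name = qualname
    else:
        type_name = f"{module}.{qualname}"
    # format_exception returns a list of strings; join them into one.
    stacktrace = "".join(
        traceback.format_exception(exc_type, exc, exc.__traceback__)
    )
    return {
        "exception.type": type_name,
        "exception.message": str(exc),
        "exception.stacktrace": stacktrace,
    }
```

The returned map can be merged into the Attributes field of the log record that reports the failure.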

Example Log Records

For example log records see JSON File serialization.

Example Mappings

For example log format mappings, see the Data Model Appendix.

References