Note: This specification for the v0.2 OpenTelemetry milestone does not cover the Observer gauge instrument discussed in the overview. Observer instruments will be added in the v0.3 milestone.
Metric instruments are the entry point for application and framework developers to instrument their code using counters, gauges, and measures.
Metric instruments have names, which are how we refer to them in external systems. Metric names conform to the following syntax:
- They are non-empty strings
- They are case-insensitive
- The first character must be non-numeric, non-space, non-punctuation
- Subsequent characters must belong to the alphanumeric characters, '_', '.', and '-'.
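The syntax rules above can be checked mechanically. The following is a minimal sketch, not part of the API: the function names `validMetricName` and `canonicalName` are hypothetical, and restricting names to ASCII letters and digits is an assumption made here to keep the example short.

```go
package main

import (
    "fmt"
    "strings"
)

// isLetter reports whether c is an ASCII letter. Limiting names to ASCII
// is an assumption of this sketch, not a requirement stated in the text.
func isLetter(c byte) bool {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
}

// isDigit reports whether c is an ASCII digit.
func isDigit(c byte) bool { return c >= '0' && c <= '9' }

// validMetricName applies the four syntax rules: non-empty, a first
// character that is not a digit, space, or punctuation, and subsequent
// characters drawn from alphanumerics, '_', '.', and '-'.
func validMetricName(name string) bool {
    if name == "" || !isLetter(name[0]) {
        return false
    }
    for i := 1; i < len(name); i++ {
        c := name[i]
        if !isLetter(c) && !isDigit(c) && c != '_' && c != '.' && c != '-' {
            return false
        }
    }
    return true
}

// canonicalName folds case so that names differing only in case compare equal.
func canonicalName(name string) string { return strings.ToLower(name) }

func main() {
    for _, n := range []string{"requests.total", "2xx", "", "queue-depth", "bad name"} {
        fmt.Printf("%q valid=%v\n", n, validMetricName(n))
    }
    fmt.Println(canonicalName("Requests.Total") == canonicalName("requests.total"))
}
```

Folding case before lookup is one way to honor the case-insensitivity rule; an implementation could equally compare names with a case-insensitive comparison at registration time.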
Metric names belong to a namespace by virtue of a "Named" Meter instance. "Named" refers to the requirement that every Meter instance must have an associated component label, determined statically in the code. The component label value of the associated Meter serves as its namespace, allowing the same metric name to be used unambiguously in multiple libraries of code within the same application.
Metric instruments are defined using a Meter instance, using a variety of New methods specific to the kind of metric and type of input (integer or floating point). The Meter will return an error when a metric name is already registered with a different kind for the same component name. Metric systems are expected to automatically prefix exported metrics by the component namespace in a manner consistent with the target system. For example, a Prometheus exporter SHOULD use the component followed by _ as the application prefix.
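To make the namespace and registration rules concrete, here is a minimal sketch of a registry keyed by component and metric name. The types `registry` and `kind`, the error value `errKindMismatch`, and the helper `prometheusName` are all hypothetical names invented for this illustration, and the sketch is not safe for concurrent use.

```go
package main

import (
    "errors"
    "fmt"
)

// kind identifies the three kinds of instrument.
type kind int

const (
    counterKind kind = iota
    gaugeKind
    measureKind
)

// errKindMismatch is a hypothetical error value for this sketch.
var errKindMismatch = errors.New("metric name already registered with a different kind")

// key scopes a metric name by its component namespace.
type key struct{ component, name string }

type registry struct {
    kinds map[key]kind
}

func newRegistry() *registry { return &registry{kinds: map[key]kind{}} }

// register records the kind for (component, name) and fails when the same
// pair was previously registered with a different kind.
func (r *registry) register(component, name string, k kind) error {
    id := key{component, name}
    if prev, ok := r.kinds[id]; ok && prev != k {
        return errKindMismatch
    }
    r.kinds[id] = k
    return nil
}

// prometheusName prefixes the metric name with the component followed by "_".
func prometheusName(component, name string) string {
    return component + "_" + name
}

func main() {
    r := newRegistry()
    fmt.Println(r.register("http", "requests", counterKind))
    fmt.Println(r.register("grpc", "requests", gaugeKind)) // same name, other namespace
    fmt.Println(r.register("http", "requests", gaugeKind)) // conflicting kind
    fmt.Println(prometheusName("http", "requests"))
}
```

Because the component is part of the key, two libraries can register the same metric name with different kinds without conflict.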
Regardless of the instrument kind or method of input, metric events include the instrument, a numerical value, and an optional set of labels. The instrument, discussed in detail below, contains the metric name and various optional settings.
Labels are key:value pairs associated with events describing various dimensions or categories that describe the event. A "label key" refers to the key component while "label value" refers to the correlated value component of a label. Label refers to the pair of label key and value. Labels are passed in to the metric event in the form of a LabelSet argument, using several input methods discussed below.
Metric events always have an associated component name, the name passed when constructing the corresponding Meter. Metric events are associated with the current (implicit or explicit) OpenTelemetry context, including distributed correlation context and span context.
The Meter interface allows creating a registered metric instrument using methods specific to each kind of metric. There are six constructors representing the three kinds of instrument taking either floating point or integer inputs; see the detailed design below.
Binding instruments to a single Meter instance has two benefits:
- Instruments can be exported from the zero state, prior to first use, with no explicit Register call
- The component name provided by the named Meter satisfies a namespace requirement
The recommended practice is to define structures to contain the instruments in use and keep references only to the instruments that are specifically needed.
We recognize that many existing metric systems support allocating metric instruments statically and providing the Meter interface at the time of use. In this situation, typical of statsd clients, existing code may not be structured with a convenient place to store new metric instruments. Where this becomes a burden, it is recommended to use the global meter factory to construct a static named Meter and use it to construct metric instruments.
The situation is similar for users of Prometheus clients, where instruments are allocated statically and there is an implicit global. Such code may not have access to the appropriate Meter where instruments are defined. Where this becomes a burden, it is recommended to use the global meter factory to construct a static named Meter and use it to construct metric instruments.
Applications are expected to construct long-lived instruments. Instruments are considered permanent for the lifetime of an SDK; there is no method to delete them.
In this Go example, a struct holding four instruments is built using the provided, non-global Meter instance.
type instruments struct {
    counter1 metric.Int64Counter
    counter2 metric.Float64Counter
    gauge3   metric.Int64Gauge
    measure4 metric.Float64Measure
}

func newInstruments(meter metric.Meter) *instruments {
    return &instruments{
        counter1: meter.NewCounter("counter1", ...), // Optional parameters
        counter2: meter.NewCounter("counter2", ...), // are discussed below.
        gauge3:   meter.NewGauge("gauge3", ...),
        measure4: meter.NewMeasure("measure4", ...),
    }
}
Code will be structured to call newInstruments somewhere in a constructor and keep the instruments reference for use at runtime.
Here's an example of building a server with configured instruments and
a single metric operation.
type server struct {
    meter       metric.Meter
    instruments *instruments

    // ... other fields
}

func newServer(meter metric.Meter) *server {
    return &server{
        meter:       meter,
        instruments: newInstruments(meter),

        // ... other fields
    }
}

// ...

func (s *server) operate(ctx context.Context) {
    // ... other work

    s.instruments.counter1.Add(ctx, 1, s.meter.Labels(
        label1.String("..."),
        label2.String("...")))
}
This API is factored into three core concepts: instruments, handles, and label sets. In doing so, we provide several ways of capturing measurements that are semantically equivalent and generate equivalent metric events, but offer varying degrees of performance and convenience.
This section applies to calling conventions for counter, gauge, and measure instruments.
As described above, metric events consist of an instrument, a set of labels, and a numerical value, plus associated context. The performance of a metric API depends on the work done to enter a new measurement. One approach to reduce cost is to aggregate intermediate results in the SDK, so that subsequent events happening in the same collection period, for the same label set, combine into the same working memory.
In this document, the term "aggregation" is used to describe the process of coalescing metric events for a complete set of labels, whereas "grouping" is used to describe further coalescing aggregate metric data into a reduced number of key dimensions. SDKs may be designed to perform aggregation and/or grouping in the process, with various trade-offs in terms of complexity and performance.
This approach requires locating an entry for the instrument and label set in a table of some kind, finding the location where metric events are being aggregated. This lookup can be precomputed, giving rise to the Handle calling convention.
In situations where performance is a requirement and a metric is repeatedly used with the same set of labels, the developer may elect to use instrument handles as an optimization. Handles are a benefit only when a specific instrument is re-used with a specific label set. If an instrument will be used with the same label set more than once, obtaining an instrument handle corresponding to the label set ensures the highest performance available.
To obtain a handle given an instrument and label set, use the GetHandle() method to return an interface that supports the Add(), Set(), or Record() method of the instrument in question.
Instrument handles may consume SDK resources indefinitely.
func (s *server) processStream(ctx context.Context) {
    streamLabels := s.meter.Labels(
        labelA.String("..."),
        labelB.String("..."),
    )
    // Resolve the handle once, outside the loop.
    counter2Handle := s.instruments.counter2.GetHandle(streamLabels)

    for item := range s.channel {
        // ... other work

        // High-performance metric calling convention: use of handles.
        counter2Handle.Add(ctx, item.size())
    }
}
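To illustrate why a handle saves work, the following toy counter shows the table lookup that a direct call repeats on every event and that a handle performs only once. This is a sketch under simplifying assumptions: the `counter` and `handle` types are hypothetical, the label set is represented as an already-canonicalized string, and a real SDK would be considerably more involved.

```go
package main

import (
    "fmt"
    "sync"
    "sync/atomic"
)

// counter aggregates sums per label set. The label set is assumed to be
// canonicalized into a string already.
type counter struct {
    mu    sync.Mutex
    cells map[string]*int64
}

func newCounter() *counter { return &counter{cells: map[string]*int64{}} }

// lookup finds or creates the aggregation cell for a label set. This is
// the step that a handle precomputes.
func (c *counter) lookup(labelSet string) *int64 {
    c.mu.Lock()
    defer c.mu.Unlock()
    cell, ok := c.cells[labelSet]
    if !ok {
        cell = new(int64)
        c.cells[labelSet] = cell
    }
    return cell
}

// Add is the direct calling convention: one lookup per event.
func (c *counter) Add(value int64, labelSet string) {
    atomic.AddInt64(c.lookup(labelSet), value)
}

// handle holds the result of a lookup for later re-use.
type handle struct{ cell *int64 }

// GetHandle performs the lookup once and returns the bound cell.
func (c *counter) GetHandle(labelSet string) handle {
    return handle{cell: c.lookup(labelSet)}
}

// Add on a handle skips the lookup entirely.
func (h handle) Add(value int64) { atomic.AddInt64(h.cell, value) }

// value reads the current sum for a label set.
func (c *counter) value(labelSet string) int64 {
    return atomic.LoadInt64(c.lookup(labelSet))
}

func main() {
    c := newCounter()
    c.Add(1, "route=/a") // direct call: lookup, then add

    h := c.GetHandle("route=/a") // lookup once
    for i := 0; i < 3; i++ {
        h.Add(2) // no lookup
    }
    fmt.Println(c.value("route=/a")) // prints 7
}
```

Both conventions update the same cell, which is what makes them semantically equivalent. The sketch also shows why handles may consume resources indefinitely: the cell stays in the table for as long as the handle may be used.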
When convenience is more important than performance, or there is no re-use to potentially optimize with instrument handles, users may elect to operate directly on metric instruments, supplying a label set at the call site.
For example, to update a single counter:
func (s *server) method(ctx context.Context) {
    // ... other work

    s.instruments.counter1.Add(ctx, 1, s.meter.Labels(...))
}
This method offers the greatest convenience possible. If performance becomes a problem, one option is to use handles as described above. Another performance option, in some cases, is to re-use the labels. In the example here, meter.Labels(...) constructs a re-usable label set, which may be an important performance optimization.
A significant factor in the cost of metrics export is that labels, which arrive as an unordered list of keys and values, must be canonicalized in some way before they can be used for lookup. Canonicalizing labels can be an expensive operation as it may require sorting or de-duplicating by some other means, possibly even serializing, the set of labels to produce a valid map key.
The operation of converting an unordered set of labels into a canonicalized set of labels, useful for pre-aggregation, is expensive enough that we give it first-class treatment in the API. The meter.Labels(...) API canonicalizes labels, returning an opaque LabelSet object, another form of pre-computation available to the user.
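As a rough illustration of what canonicalization can involve, the sketch below de-duplicates an unordered list of labels, sorts it by key, and serializes it into a string usable as a map key. The `label` type and `canonicalize` function are hypothetical, and the "last value wins" rule for duplicate keys is an assumption of this example, not something the specification states.

```go
package main

import (
    "fmt"
    "sort"
    "strconv"
    "strings"
)

// label is a single key:value pair.
type label struct{ key, value string }

// canonicalize de-duplicates, sorts, and serializes labels so that any
// ordering of the same labels yields the same string.
func canonicalize(labels []label) string {
    // De-duplicate: the last value for a key wins (an assumption here).
    last := make(map[string]string, len(labels))
    for _, l := range labels {
        last[l.key] = l.value
    }

    // Sort keys to remove any dependence on input order.
    keys := make([]string, 0, len(last))
    for k := range last {
        keys = append(keys, k)
    }
    sort.Strings(keys)

    // Serialize with quoting so separators inside values cannot collide.
    var b strings.Builder
    for _, k := range keys {
        b.WriteString(strconv.Quote(k))
        b.WriteByte('=')
        b.WriteString(strconv.Quote(last[k]))
        b.WriteByte(',')
    }
    return b.String()
}

func main() {
    x := canonicalize([]label{{"b", "2"}, {"a", "1"}})
    y := canonicalize([]label{{"a", "1"}, {"b", "2"}})
    fmt.Println(x == y) // prints true
    fmt.Println(x)
}
```

The map, the sort, and the string building all cost time and allocations on each call, which is the motivation for computing a LabelSet once and re-using it.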
Re-usable LabelSet objects provide a potential optimization for scenarios where handles might not be effective. For example, if the label set will be re-used but only used once per metric, handles do not offer any optimization. It may be best to pre-compute a canonicalized LabelSet once and re-use it with the direct calling convention.
Constructing an instrument handle is considered the higher-performance option when the handle will be used more than once. Still, consider re-using the result of Meter.Labels(...) when constructing more than one instrument handle.
func (s *server) method(ctx context.Context) {
    // ... other work

    // Canonicalize the labels once and re-use the result.
    labelSet := s.meter.Labels(...)

    s.instruments.counter1.Add(ctx, 1, labelSet)

    // ... more work

    s.instruments.gauge3.Set(ctx, 10, labelSet)

    // ... more work

    s.instruments.measure4.Record(ctx, 100, labelSet)
}
When the SDK interprets a LabelSet in the context of grouping aggregated values for an exporter, and where there are keys that are missing, the SDK is required to consider these values explicitly unspecified, a distinct value type of the exported data model.
As a language-optional feature, the direct and handle calling convention APIs may support alternate convenience methods to pass raw labels at the call site. These may be offered as overloaded methods for Add(), Set(), and Record() (direct calling convention) or GetHandle() (handle calling convention), in both cases bypassing a call to meter.Labels(...). For example:
public void method() {
    // pass raw labels, no explicit `LabelSet`
    s.instruments.counter1.add(1, labelA.value(...), labelB.value(...));

    // ... or

    // pass raw labels, no explicit `LabelSet`
    var handle = s.instruments.gauge1.getHandle(labelA.value(...), labelB.value(...));
}
As a language-level decision, APIs may support ordered LabelSet construction, in which a pre-defined set of ordered label keys is defined such that values can be supplied in order. For example,
var rpcLabelKeys = meter.OrderedLabelKeys("a", "b", "c")

for _, input := range stream {
    labels := rpcLabelKeys.Values(1, 2, 3) // a=1, b=2, c=3

    // ...
}
This is specified as a language-optional feature because its safety, and therefore its value as an input for monitoring, depends on the availability of type-checking in the source language. Passing unordered labels (i.e., a list of bound keys and values) to the Meter.Labels(...) constructor is considered the safer alternative.
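The safety concern can be seen in a small sketch of ordered construction. The `orderedLabelKeys` type below is hypothetical and uses string values for brevity. A variadic `Values` method can only detect a wrong number of values at run time, and it cannot detect two values supplied in the wrong order at all.

```go
package main

import "fmt"

// orderedLabelKeys holds a pre-defined, ordered list of label keys.
type orderedLabelKeys struct{ keys []string }

func newOrderedLabelKeys(keys ...string) orderedLabelKeys {
    return orderedLabelKeys{keys: keys}
}

// Values pairs each value with the key at the same position. A length
// mismatch is reported as an error; a swapped pair of values is not
// detectable here.
func (o orderedLabelKeys) Values(values ...string) (map[string]string, error) {
    if len(values) != len(o.keys) {
        return nil, fmt.Errorf("got %d values for %d keys", len(values), len(o.keys))
    }
    labels := make(map[string]string, len(values))
    for i, k := range o.keys {
        labels[k] = values[i]
    }
    return labels, nil
}

func main() {
    keys := newOrderedLabelKeys("a", "b", "c")

    labels, err := keys.Values("1", "2", "3")
    fmt.Println(labels, err)

    _, err = keys.Values("1", "2") // too few values: caught only at run time
    fmt.Println(err)
}
```

A language with richer compile-time checking could reject the second call before the program runs, which is why the feature is left to each language.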
There is one final API for entering measurements, which is like the direct access calling convention but supports multiple simultaneous measurements. The use of a RecordBatch API supports entering multiple measurements, implying a semantically atomic update to several instruments.
The preceding example could be rewritten:
func (s *server) method(ctx context.Context) {
    // ... other work

    labelSet := s.meter.Labels(...)

    // ... more work

    // Enter all three measurements in a single call.
    s.meter.RecordBatch(ctx, labelSet,
        s.instruments.counter1.Measurement(1),
        s.instruments.gauge3.Measurement(10),
        s.instruments.measure4.Measurement(100),
    )
}
Using the RecordBatch calling convention is semantically identical to the sequence of direct calls in the preceding example, with the addition of atomicity. Because values are entered in a single call, the SDK is potentially able to implement an atomic update, from the exporter's point of view. Calls to RecordBatch may also reduce costs because the SDK can, for example, enqueue a single bulk update or take a lock only once.
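One way an SDK might obtain both the atomicity and the cost saving is to apply the whole batch under a single lock. The sketch below is an assumption-laden toy: the `store` and `measurement` types are hypothetical, and every instrument is treated as a running sum for simplicity.

```go
package main

import (
    "fmt"
    "sync"
)

// measurement pairs an instrument name with a value.
type measurement struct {
    instrument string
    value      float64
}

// store holds one running sum per instrument (a simplification).
type store struct {
    mu   sync.Mutex
    sums map[string]float64
}

func newStore() *store { return &store{sums: map[string]float64{}} }

// recordBatch applies every measurement while holding the lock once, so a
// concurrent snapshot observes either none or all of the batch.
func (s *store) recordBatch(ms ...measurement) {
    s.mu.Lock()
    defer s.mu.Unlock()
    for _, m := range ms {
        s.sums[m.instrument] += m.value
    }
}

// snapshot copies the current state, as an exporter might.
func (s *store) snapshot() map[string]float64 {
    s.mu.Lock()
    defer s.mu.Unlock()
    out := make(map[string]float64, len(s.sums))
    for k, v := range s.sums {
        out[k] = v
    }
    return out
}

func main() {
    s := newStore()
    s.recordBatch(
        measurement{"counter1", 1},
        measurement{"measure4", 100},
    )
    fmt.Println(s.snapshot())
}
```

With separate direct calls, a snapshot taken between the two updates would see the counter incremented but not the measure; the single critical section rules that out.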
See the SDK-facing Metrics API specification for an in-depth summary of each method in the Metrics API.
Instruments are constructed using the appropriate New method for the kind of instrument (Counter, Gauge, Measure) and for the type of input (integer or floating point).
Meter method | Kind of instrument |
---|---|
NewIntCounter(name, options...) | An integer counter |
NewFloatCounter(name, options...) | A floating point counter |
NewIntGauge(name, options...) | An integer gauge |
NewFloatGauge(name, options...) | A floating point gauge |
NewIntMeasure(name, options...) | An integer measure |
NewFloatMeasure(name, options...) | A floating point measure |
As in all OpenTelemetry specifications, these names are examples. Each language committee will decide on the appropriate names based on conventions in that language.
Instruments may be defined with a recommended set of label keys. This setting may be used by SDKs as a good default for grouping exported metrics, where used with pre-aggregation. The recommended label keys are usually selected by the developer for exhibiting low cardinality, importance for monitoring purposes, and an intention to provide these variables locally.
SDKs should consider grouping exported metric data by the recommended label keys of each instrument, unless superseded by another form of configuration. Recommended keys that are missing will be considered explicitly unspecified, as for missing LabelSet keys in general.
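Grouping by recommended keys, including the treatment of missing keys, might look like the following sketch. The `point` type, the `groupKey` and `group` functions, and the `<unspecified>` marker are hypothetical; the only property the sketch tries to preserve is that a missing key is distinguishable from any real value, including the empty string.

```go
package main

import (
    "fmt"
    "strconv"
    "strings"
)

// unspecified marks a missing key. Real values are always quoted below,
// so this marker cannot collide with any label value.
const unspecified = "<unspecified>"

// point is an aggregated sum for one complete label set.
type point struct {
    labels map[string]string
    sum    int64
}

// groupKey projects a label set onto the recommended keys, in order.
func groupKey(recommended []string, labels map[string]string) string {
    parts := make([]string, 0, len(recommended))
    for _, k := range recommended {
        if v, ok := labels[k]; ok {
            parts = append(parts, k+"="+strconv.Quote(v))
        } else {
            parts = append(parts, k+"="+unspecified)
        }
    }
    return strings.Join(parts, ",")
}

// group coalesces aggregated points into one sum per grouping key.
func group(recommended []string, points []point) map[string]int64 {
    out := map[string]int64{}
    for _, p := range points {
        out[groupKey(recommended, p.labels)] += p.sum
    }
    return out
}

func main() {
    recommended := []string{"host", "route"}
    points := []point{
        {map[string]string{"host": "a", "route": "/x", "user": "u1"}, 2},
        {map[string]string{"host": "a", "route": "/x", "user": "u2"}, 3},
        {map[string]string{"host": "a"}, 5}, // route is missing
    }
    for k, v := range group(recommended, points) {
        fmt.Println(k, v)
    }
}
```

The first two points differ only in the non-recommended `user` key and therefore fall into the same group, while the third forms its own group with an unspecified `route`.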
Instruments provide several optional settings, summarized here. The kind of instrument and input value type are implied by the constructor that is used, and the metric name is the only required field.
Option | Option name | Explanation |
---|---|---|
Description | WithDescription(string) | Descriptive text documenting the instrument. |
Unit | WithUnit(string) | Units specified according to the UCUM. |
Recommended label keys | WithRecommendedKeys(list) | Recommended grouping keys for this instrument. |
Monotonic | WithMonotonic(boolean) | Configure a counter or gauge that accepts only monotonic/non-monotonic updates. |
Absolute | WithAbsolute(boolean) | Configure a measure that does or does not accept negative updates. |
See the Metric API specification overview for more information about the kind-specific monotonic and absolute options.
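In Go, optional settings of this kind are commonly expressed with the functional options pattern. The sketch below shows one possible rendering of three of the options from the table; the `descriptor` and `option` types and the `newDescriptor` constructor are hypothetical, and the kind-specific Monotonic and Absolute options are deliberately left out because their defaults depend on the instrument kind.

```go
package main

import "fmt"

// descriptor collects the settings of an instrument. Only the name is
// required; everything else defaults to its zero value.
type descriptor struct {
    name            string
    description     string
    unit            string
    recommendedKeys []string
}

// option mutates a descriptor during construction.
type option func(*descriptor)

// WithDescription sets descriptive text documenting the instrument.
func WithDescription(text string) option {
    return func(d *descriptor) { d.description = text }
}

// WithUnit sets the unit string.
func WithUnit(unit string) option {
    return func(d *descriptor) { d.unit = unit }
}

// WithRecommendedKeys sets the recommended grouping keys.
func WithRecommendedKeys(keys ...string) option {
    return func(d *descriptor) { d.recommendedKeys = keys }
}

// newDescriptor applies the options in order on top of the defaults.
func newDescriptor(name string, opts ...option) descriptor {
    d := descriptor{name: name}
    for _, opt := range opts {
        opt(&d)
    }
    return d
}

func main() {
    d := newDescriptor("request.latency",
        WithDescription("Latency of handled requests"),
        WithUnit("ms"),
        WithRecommendedKeys("host", "route"),
    )
    fmt.Printf("%+v\n", d)
}
```

The pattern keeps the metric name as the only positional argument, matching the statement that the name is the only required field.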
Counter, gauge, and measure instruments each support allocating handles for the high-performance calling convention. The Instrument.GetHandle(LabelSet) method returns an interface which implements the Add(), Set(), or Record() method, respectively, for counter, gauge, and measure instruments.
Counter, gauge, and measure instruments support the appropriate Add(), Set(), and Record() method for submitting individual metric events.
The LabelSet type introduced above applies strictly to "local" labels, meaning those provided in a call to meter.Labels(...). The application explicitly declares these labels, whereas distributed correlation context labels are implicitly associated with the event.
There is a clear intention to pre-aggregate metrics within the SDK, using the contents of a LabelSet to derive grouping keys. There are two available options for users to apply distributed correlation context to the local grouping function used for metrics pre-aggregation:
- The distributed context, whether implicit or explicit, is associated with every metric event. The SDK could automatically project selected label keys from the distributed correlation context into the metric event. This would require some manner of dynamic mapping from LabelSet to grouping key during aggregation.
- The user can explicitly perform the same projection of distributed correlation context into a LabelSet by extracting from the correlation context and including it in the call to meter.Labels(...).
An example of an explicit projection follows.
import "go.opentelemetry.io/api/distributedcontext"

func (s *server) doThing(ctx context.Context) {
    doLabels := []core.KeyValue{
        key1.String("..."),
        key2.String("..."),
    }

    // Project a selected key from the correlation context, if present.
    correlations := distributedcontext.FromContext(ctx)
    if val, ok := correlations.Value(key3); ok {
        doLabels = append(doLabels, key3.Value(val))
    }
    labels := s.meter.Labels(doLabels...)

    // ...
}