This repository has been archived by the owner on Dec 6, 2024. It is now read-only.

Versioning and Stability for OpenTelemetry Clients #143

Merged on Dec 16, 2020 (60 commits)
f427d74
versioning and stability first draft
tedsuo Dec 4, 2020
50db1aa
whitespace
tedsuo Dec 4, 2020
a4d6432
update rfc number to match PR id
tedsuo Dec 4, 2020
a001ee6
Update text/0143-versioning-and-stability.md
tedsuo Dec 4, 2020
40d4dae
Update text/0143-versioning-and-stability.md
tedsuo Dec 4, 2020
a848e3a
Update text/0143-versioning-and-stability.md
tedsuo Dec 4, 2020
680d1ff
Update text/0143-versioning-and-stability.md
tedsuo Dec 5, 2020
f880b88
Update text/0143-versioning-and-stability.md
tedsuo Dec 7, 2020
70786fe
Update text/0143-versioning-and-stability.md
tedsuo Dec 7, 2020
b5b9255
Update text/0143-versioning-and-stability.md
tedsuo Dec 7, 2020
7a62f27
Update text/0143-versioning-and-stability.md
tedsuo Dec 8, 2020
7f24f1d
Update text/0143-versioning-and-stability.md
tedsuo Dec 8, 2020
bdb150a
remove vague design goal
tedsuo Dec 8, 2020
f818d11
Merge branch 'versioning' of github.com:tedsuo/rfcs into versioning
tedsuo Dec 8, 2020
d1624dc
Adding more info about cross-cutting concerns
tedsuo Dec 8, 2020
0ee804e
wtf git
tedsuo Dec 8, 2020
a3384e0
clarify deprecation version bump
tedsuo Dec 9, 2020
9669b23
clarify dependency conflicts
tedsuo Dec 9, 2020
48d7f4f
clarify stability
tedsuo Dec 9, 2020
0d03cf8
clarify contrib
tedsuo Dec 9, 2020
ccb0cc6
spelling
tedsuo Dec 9, 2020
f87ee1f
spelling
tedsuo Dec 9, 2020
2d52f5c
clarify experimental stage
tedsuo Dec 9, 2020
6f7fe0f
Update text/0143-versioning-and-stability.md
tedsuo Dec 9, 2020
55ddbfb
Update text/0143-versioning-and-stability.md
tedsuo Dec 9, 2020
9e0284b
remove double spaces
tedsuo Dec 9, 2020
53df880
clarify levels of stability
tedsuo Dec 9, 2020
a5e9192
seperate
tedsuo Dec 9, 2020
bacf18d
spelling
tedsuo Dec 9, 2020
1fbaf30
emphasize that the SDK should not be refernced
tedsuo Dec 9, 2020
c335a51
remove LTS
tedsuo Dec 9, 2020
02fa9c7
clarify what counts as a bug
tedsuo Dec 9, 2020
e90aec5
support -> retains
tedsuo Dec 9, 2020
d69fc1a
spelling
tedsuo Dec 11, 2020
4a13c0a
clarify that this proposal is about clients
tedsuo Dec 11, 2020
fe35fdb
Each component has it's own version
tedsuo Dec 11, 2020
f7a2c04
Added long term support
tedsuo Dec 11, 2020
4337fd8
Lint
tedsuo Dec 11, 2020
86cbe2b
Update text/0143-versioning-and-stability.md
tedsuo Dec 11, 2020
2f7396b
long term support infographic
tedsuo Dec 11, 2020
8e04de1
define OpenTelemetry GA
tedsuo Dec 11, 2020
6a2a78f
clarify SDK stability
tedsuo Dec 11, 2020
594e380
Clarify long term support for SDK and Contrib
tedsuo Dec 11, 2020
1da65ef
lint
tedsuo Dec 11, 2020
a71d8b5
Clarify that experimental packages should not move
tedsuo Dec 11, 2020
ca647dc
clarify which audience is allowed to interact with the SDK
tedsuo Dec 11, 2020
66e58e4
Make ABI a langauge specific concern
tedsuo Dec 11, 2020
eb39b89
clarify new major versions of existing signals
tedsuo Dec 11, 2020
00b57c7
clarify single API version
tedsuo Dec 11, 2020
a602397
clarify single SDK version number
tedsuo Dec 11, 2020
04a3f15
lint
tedsuo Dec 11, 2020
a4e027a
better pic
tedsuo Dec 12, 2020
8184d15
add ruby and js to 0.X experimental
tedsuo Dec 12, 2020
e560725
whitespace
tedsuo Dec 15, 2020
0919fa7
clarify that component versions do not need to match
tedsuo Dec 15, 2020
5b93dc7
clarify contrib contains multiple versions
tedsuo Dec 15, 2020
92335ad
complete sentence
tedsuo Dec 15, 2020
31a2cfa
clarify stability may be applied to API ecosystem before SDK ecosystem
tedsuo Dec 15, 2020
93622e5
Clarify that contrib does include core plugins
tedsuo Dec 15, 2020
9938dd5
Give examples of contructors and plugins
tedsuo Dec 15, 2020
165 changes: 165 additions & 0 deletions text/0143-versioning-and-stability.md
@@ -0,0 +1,165 @@
# Versioning and stability for OpenTelemetry clients

OpenTelemetry is a large project with strict compatibility requirements. This proposal defines the stability guarantees offered by the OpenTelemetry clients, along with a versioning and lifecycle proposal which defines how we meet those requirements.

Language implementations are expected to follow this proposal exactly, unless a language or package manager convention interferes significantly. Implementations must take this cross-language proposal, and produce a language-specific proposal which details how these requirements will be met.

Note: In this document, the term OpenTelemetry specifically refers to the OpenTelemetry clients. It does not refer to the specification or the Collector.

## Design goals

**Ensure that end users stay up to date with the latest release.**
We want all users to stay up to date with the latest version of OpenTelemetry. We do not want to create hard breaks in support, of any kind, which leave users stranded on older versions. It must always be possible to upgrade to the latest minor version of OpenTelemetry, without creating compilation or runtime errors.

**Never create a dependency conflict between packages which rely on different versions of OpenTelemetry. Avoid breaking any stable public API.**
Backwards compatibility is a strict requirement. Instrumentation APIs cannot create a version conflict, ever. Otherwise, OpenTelemetry cannot be embedded in widely shared libraries, such as web frameworks. Code written against older versions of the API must work with all newer versions of the API. Transitive dependencies of the API cannot create a version conflict. The OpenTelemetry API cannot depend on "foo" if there is any chance that any library or application may require a different, incompatible version of "foo." A library using OpenTelemetry should never become incompatible with other libraries due to a version conflict in one of OpenTelemetry's dependencies. Theoretically, APIs can be deprecated and eventually removed, but this is a process measured in years and we have no plans to do so.

**Allow for multiple levels of package stability within the same release.**
Provide maintainers a clear process for developing new, experimental APIs alongside stable APIs. Different packages within the same release may have different levels of stability. This means that an implementation wishing to release stable tracing today must ensure that experimental metrics are factored out in such a way that breaking changes to the metrics API do not destabilize the trace API packages.

## Relevant architecture

![Cross cutting concerns](img/0143_cross_cutting.png)

At the highest architectural level, OpenTelemetry is organized into signals. Each signal provides a specialized form of observability. For example, tracing, metrics, and baggage are three separate signals. Signals share a common subsystem – context propagation – but they function independently from each other.

Each signal provides a mechanism for software to describe itself. A codebase, such as an API handler or a database client, takes a dependency on various signals in order to describe itself. OpenTelemetry instrumentation code is then mixed into the other code within that codebase. This makes OpenTelemetry a **cross-cutting concern** - a piece of software which must be mixed into many other pieces of software in order to provide value. Cross-cutting concerns, by their very nature, violate a core design principle: separation of concerns. As a result, OpenTelemetry requires extra care and attention to avoid creating issues for the codebases which depend upon these cross-cutting APIs.

OpenTelemetry is designed to separate the portion of each signal which must be imported as cross-cutting concerns from the portions of OpenTelemetry which can be managed independently. OpenTelemetry is also designed to be an extensible framework. To accomplish these goals, each signal consists of four types of packages:

**API -** API packages consist of the cross-cutting public interfaces used for instrumentation. Any portion of OpenTelemetry which 3rd-party libraries and application code depend upon is considered part of the API. To manage different levels of stability, every signal has its own, independent API package. These individual APIs may also be bundled up into a shared global API, for convenience.

**SDK -** The implementation of the API. The SDK is managed by the application owner. Note that the SDK includes additional public interfaces which are not considered part of the API package, as they are not cross-cutting concerns. These public interfaces are defined as **constructors** and **plugin interfaces**. Examples of plugin interfaces include the SpanProcessor, Exporter, and Sampler interfaces. Examples of constructors include configuration objects, environment variables, and SDK builders. Application owners may interact with SDK constructors; plugin authors may interact with SDK plugin interfaces. Instrumentation authors must never directly reference any SDK package of any kind, only the API.

**Semantic Conventions -** A schema defining the attributes which describe common concepts and operations which the signal observes. Note that unlike the API or SDK, stable conventions for all signals may be placed in the same package, as they are often useful across different signals.

**Contrib –** plugins and instrumentation that make use of the API or SDK interfaces, but are not part of the core packages necessary for running OTel. The term "contrib" specifically refers to the plugins and instrumentation maintained by the OpenTelemetry organization outside of the SDK; it does not refer to third party plugins hosted elsewhere, or core plugins which are required to be part of the SDK release, such as OTLP Exporters and TraceContext Propagators. **API Contrib** refers to packages which depend solely upon the API; **SDK Contrib** refers to packages which also depend upon the SDK.
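The split between API and SDK packages can be illustrated with a minimal sketch. Everything named here (`Tracer`, `NoopTracer`, `RecordingTracer`, `set_tracer`, `get_tracer`, `handle_request`) is hypothetical and is not the real OpenTelemetry API; the no-op default is also an assumption made for the sake of the example. The point is only that instrumentation code references the API layer, while the application owner decides which SDK implementation sits behind it.

```python
from abc import ABC, abstractmethod
from typing import List


# --- "API package" (hypothetical): the only layer instrumentation may reference ---
class Tracer(ABC):
    @abstractmethod
    def start_span(self, name: str) -> str:
        """Record the start of a span and return its name."""


class NoopTracer(Tracer):
    """Assumed default: does nothing when no SDK has been installed."""

    def start_span(self, name: str) -> str:
        return ""


_tracer: Tracer = NoopTracer()


def set_tracer(tracer: Tracer) -> None:
    """Called by the application owner to install an implementation."""
    global _tracer
    _tracer = tracer


def get_tracer() -> Tracer:
    return _tracer


# --- "SDK package" (hypothetical): chosen and configured by the application owner ---
class RecordingTracer(Tracer):
    def __init__(self) -> None:
        self.spans: List[str] = []

    def start_span(self, name: str) -> str:
        self.spans.append(name)
        return name


# --- Instrumentation: depends on get_tracer() only, never on RecordingTracer ---
def handle_request(path: str) -> str:
    get_tracer().start_span(f"GET {path}")
    return "ok"
```

In this sketch a shared library containing `handle_request` keeps working whether or not the application installs `RecordingTracer`, which is the property the API/SDK separation is meant to protect.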

## Signal lifecycle

OpenTelemetry is structured around signals. Each signal represents a coherent, stand-alone set of functionality. Each signal follows a lifecycle.

![API Lifecycle](img/0143_api_lifecycle.png)

### Lifecycle stages

**Experimental –** Breaking changes and performance issues may occur. Components may not be feature-complete. The experiment may be discarded.

**Stable –** Stability guarantees apply, based on component type (API, SDK, Conventions, and Contrib). Long term dependencies may now be taken against these packages.

**Deprecated -** this signal has been replaced but still retains the same stability guarantees.

**Removed -** a deprecated signal is no longer supported, and is removed.

All signal components may become stable together, or one by one in the following order: API, Semantic Conventions, API Contrib, SDK, SDK Contrib.
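One way to read the ordering rule is that, at any moment, the stable components of a signal form a prefix of the listed order. The sketch below checks that reading; the prefix interpretation and the function name `follows_stabilization_order` are assumptions, not wording from the proposal.

```python
from typing import Iterable

# Order taken from the sentence above.
STABILIZATION_ORDER = ["API", "Semantic Conventions", "API Contrib", "SDK", "SDK Contrib"]


def follows_stabilization_order(stable_components: Iterable[str]) -> bool:
    """Return True if the stable components form a prefix of STABILIZATION_ORDER."""
    stable = set(stable_components)
    unknown = stable - set(STABILIZATION_ORDER)
    if unknown:
        raise ValueError(f"unknown components: {sorted(unknown)}")
    flags = [component in stable for component in STABILIZATION_ORDER]
    # A valid state is a run of True values followed only by False values.
    return flags == sorted(flags, reverse=True)
```

Under this reading, a signal with a stable API and SDK but experimental Semantic Conventions would be flagged as out of order.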

When transitioning from experimental to stable to deprecated, packages **should not move or otherwise break how they are imported by users**. Do NOT use an "experimental" directory or package suffix.

Optionally, package **version numbers** MAY include a suffix, such as -alpha, -beta, -rc, or -experimental, to differentiate stable and experimental packages.
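As a rough illustration of the suffix convention, the sketch below detects whether a version string carries one of the suffixes listed above. The helper name `has_experimental_suffix` and the accepted version grammar (an optional leading `v`, then `MAJOR.MINOR.PATCH`, then an optional `-suffix`) are assumptions made for this example.

```python
import re

# Assumed grammar: optional "v", MAJOR.MINOR.PATCH, optional "-suffix".
_VERSION = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)(?:-([0-9A-Za-z.-]+))?$")

# Suffixes named in the text above.
EXPERIMENTAL_SUFFIXES = ("alpha", "beta", "rc", "experimental")


def has_experimental_suffix(version: str) -> bool:
    """Return True if the version carries an experimental-style suffix."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"not a recognized version string: {version!r}")
    suffix = match.group(4)
    return suffix is not None and suffix.startswith(EXPERIMENTAL_SUFFIXES)
```

Note that the package name and import path stay the same in this scheme; only the version string changes as the package matures.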

### Stability

Once a signal component is marked as stable, the following rules apply until the end of that signal’s existence.

**API Stability -**
No backward-incompatible changes to the API are allowed unless the major version number is incremented. All existing API calls must continue to compile and function against all future minor versions of the same major version. ABI compatibility for the API may be offered on a language by language basis.
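The API stability rule can be expressed as a small compatibility check: code built against one API version should work with any later version that shares the same major number. This is a sketch only; `parse_version` and `api_upgrade_is_safe` are hypothetical helpers, and the parser handles plain `MAJOR.MINOR.PATCH` strings without suffixes.

```python
from typing import Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Parse 'MAJOR.MINOR.PATCH' (optionally prefixed with 'v') into a tuple."""
    major, minor, patch = (int(part) for part in text.lstrip("v").split("."))
    return (major, minor, patch)


def api_upgrade_is_safe(built_against: str, installed: str) -> bool:
    """True if `installed` is the same major version and not older than `built_against`."""
    old, new = parse_version(built_against), parse_version(installed)
    return new[0] == old[0] and new >= old
```

For example, a library written against 1.2.3 is expected to work with 1.9.0, while 2.0.0 carries no such guarantee.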

**SDK Stability -**
Public portions of the SDK must remain backwards compatible. There are two categories: **plugin interfaces** and **constructors**. Examples of plugins include the SpanProcessor, Exporter, and Sampler interfaces. Examples of constructors include configuration objects, environment variables, and SDK builders.

ABI compatibility for SDK plugin interfaces and constructors may be offered on a language by language basis.

**Semantic Conventions Stability -**
Semantic Conventions may not be removed once they are stable. New conventions may be added to replace usage of older conventions, but the older conventions are never removed; they will only be marked as deprecated in favor of the newer ones.

**Contrib Stability -**
Plugins and instrumentation are kept up to date, and are released simultaneously with (or shortly after) the latest release of the API. The goal is to ensure users can update to the latest version of OpenTelemetry, and not be held back by the plugins that they depend on.

Public portions of contrib packages (constructors, configuration, interfaces) must remain backwards compatible. ABI compatibility for contrib packages may be offered on a language by language basis.

Telemetry produced by contrib instrumentation must also remain stable and backwards compatible, to avoid breaking alerts and dashboards. This means that existing data may not be mutated or removed without a major version bump. Additional data may be added. This applies to spans, metrics, resources, attributes, events, and any other data types that OpenTelemetry emits.
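The telemetry rule (existing data may not be mutated or removed, additional data may be added) could be checked mechanically if an instrumentation package described its output as a schema. The sketch below assumes such a schema is a mapping from attribute name to value type; that representation, the function name, and the sample attribute names are illustrative assumptions.

```python
from typing import Dict


def telemetry_change_is_backward_compatible(
    old_schema: Dict[str, str], new_schema: Dict[str, str]
) -> bool:
    """Compare two hypothetical 'attribute name -> value type' schemas."""
    for name, value_type in old_schema.items():
        if name not in new_schema:
            return False  # existing data was removed
        if new_schema[name] != value_type:
            return False  # existing data was mutated
    return True  # anything extra in new_schema is an allowed addition
```

A change that fails this check would, under the rule above, require a major version bump because it could break existing alerts and dashboards.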

### Deprecation
Contributor:

There isn't any time-period described here between deprecation and removal, which makes deprecation somewhat moot. I can simply replace old with new, and cut a new major version without deprecating. If this is binding in some form (e.g. through a minimum "deprecation period"), we should spell that out. If not, make it clearer that this is an optional, be-a-nice-community sort of thing.

Contributor (author):

So, I don't want to arbitrarily pick a minimum deprecation period, and no one has pointed to any kind of convention or standard to follow. The current deprecation period should be considered "infinite" for the time being.

Contributor:

I'm not quite following. IIUC, an infinite deprecation period would imply we can't remove signals, even with a new major version, which doesn't align with the rest of the proposal.

Here is what I would suggest as a definition for deprecation:

"A deprecated signal is a signal which will be removed in the next major version, but is identical to a stable signal in all other respects. Marking a signal deprecated does not imply a new major version must be cut, but guarantees that it will be removed if/when a new major version is cut. All signals which are removed in a new major version must be marked deprecated in the old major version before releasing the new major version--no cherrypicking deprecations after-the-fact, and no removing signals which are not deprecated. Marking a signal deprecated is intended to smooth the transition to a new major version by encouraging users to move off of the deprecated signal prior to upgrading to the next major version. A user who has moved off of all deprecated signals will not encounter any compilation errors when upgrading to the new major version."


In theory, signals could be replaced. When this happens, they are marked as deprecated.

Code is only marked as deprecated when the replacement becomes stable. Deprecated code still abides by the same support guarantees as stable code. Deprecated APIs remain stable and backwards compatible.
Contributor:

This implies there is always a replacement for deprecated functionality. Are we saying there will never be a signal we want to remove without replacement?

Contributor:

I think the idea is to actually remove those components way down the road, just not right away (we did this in OpenTracing, removing APIs after deprecating them, and there were a few complaints).

Contributor (author):

Yeah, I can't see a reason for us removing stable features just because we decide we don't like them any more. In those cases, we just leave them alone and let them be.

Contributor:

Agreed. The only process by which we should remove signals is by introducing a new major version. We should also put a high price on new major versions, which pushes removal of signals way down the road.

I just don't want to imply that major version releases, which can remove signals, will have feature-parity with previous major versions.

Suggested change (original line, then proposed replacement):
Code is only marked as deprecated when the replacement becomes stable. Deprecated code still abides by the same support guarantees as stable code. Deprecated APIs remain stable and backwards compatible.
If the signal is being replaced, code should only be marked as deprecated after the replacement becomes stable. Deprecated code still abides by the same support guarantees as stable code. Deprecated APIs remain stable and backwards compatible.


### Removal

Packages are end-of-life’d by being removed from the release. The release then makes a major version bump.

We currently have no plans for deprecating signals or creating a major version past v1.0.

For clarity, it is still possible to create a new, backwards incompatible version of an existing type of signal without actually moving to v2.0 and breaking support. Allow me to explain.

Imagine we develop a new, better tracing API - let's call it AwesomeTrace. We will never mutate the current tracing API into AwesomeTrace. Instead, AwesomeTrace would be added as an entirely new signal which coexists and interoperates with the current tracing signal. This would make adding AwesomeTrace a minor version bump, *not* v2.0. v2.0 would mark the end of support for current tracing, not the addition of AwesomeTrace. And we don't want to ever end that support, if we can help it.

This is not actually a theoretical example. OpenTelemetry already supports two tracing APIs: OpenTelemetry and OpenTracing. We invented a new tracing API, but continue to support the old one.

## Version Numbers

OpenTelemetry follows [semver 2.0](https://semver.org/) conventions, with the following distinction.

OpenTelemetry clients have four components: API, Semantic Conventions, SDK, and Contrib.

For the purposes of versioning, all code within a component is treated as if it were part of a single package, and versioned with the same version number, except for Contrib, which may be a collection of packages versioned separately.

* All packages within the API share the same version number. API packages for all signals version together, across all signals. Signals do not have separate version numbers. There is one version number that applies to all signals that are included in the API release that is labeled with that particular version number.
* All packages within the SDK share the same version number. SDK packages for all signals version together, across all signals. There is one version number that applies to all signals that are included in the SDK release that is labeled with that particular version number.
* All Semantic Conventions are contained within a single package with a single version number.
* Each contrib package has its own version.
* The API, SDK, Semantic Conventions, and contrib components are not required to share a version number. For example, the latest version of `opentelemetry-python-api` may be at v1.2.3, while the latest version of `opentelemetry-python-sdk` may be at v2.3.1.
* Different language implementations do not need to have matching version numbers. For example, it is fine to have `opentelemetry-python-api` at v1.2.8 when `opentelemetry-java-api` is at v1.3.2.
* Language implementations do not need to match the version of the specification they implement. For example, it is fine for v1.8.2 of `opentelemetry-python-api` to implement v1.1.1 of the specification.

**Exception:** in some languages, package managers may react poorly to experimental packages having a version higher than 0.X. In these cases, a language-specific workaround is required. Go, Ruby, and JavaScript are examples.

**Major version bump**
Major version bumps only occur when there is a breaking change to a stable interface, or the removal of deprecated signals.

OpenTelemetry values long term support. The expectation is that we will version to v1.0 once the first set of packages are declared stable. OpenTelemetry will then remain at v1.0 for years. There are no plans for a v2.0 of OpenTelemetry at this time. Additional stable packages, such as metrics and logs, will be added as minor version bumps.

**Minor version bump**
Most changes to OpenTelemetry result in a minor version bump.

* New backward-compatible functionality added to any component.
* Breaking changes to internal SDK components.
* Breaking changes to experimental signals.
* New experimental packages are added.
* Experimental packages become stable.

**Patch version bump**
Patch versions make no changes which would require recompilation or potentially break application code. The following are examples of patch fixes.

* Bug fixes which don't require a minor version bump per the rules above.
* Security fixes.
* Documentation.
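The major, minor, and patch rules above can be summarized as a lookup from change type to the bump it requires, with a release taking the largest bump among its changes. This is a sketch: the change-type labels, `Bump`, `required_bump`, and `next_version` are names invented for this example, not part of any OpenTelemetry tooling.

```python
from enum import Enum
from typing import Iterable, Tuple


class Bump(Enum):
    PATCH = 1
    MINOR = 2
    MAJOR = 3


# Hypothetical labels for the change types listed in this section.
CHANGE_TO_BUMP = {
    "breaking_change_to_stable_interface": Bump.MAJOR,
    "removal_of_deprecated_signal": Bump.MAJOR,
    "new_backward_compatible_functionality": Bump.MINOR,
    "breaking_change_to_internal_sdk_component": Bump.MINOR,
    "breaking_change_to_experimental_signal": Bump.MINOR,
    "new_experimental_package": Bump.MINOR,
    "experimental_package_becomes_stable": Bump.MINOR,
    "bug_fix": Bump.PATCH,
    "security_fix": Bump.PATCH,
    "documentation": Bump.PATCH,
}


def required_bump(changes: Iterable[str]) -> Bump:
    """Return the largest bump required by any change in the release."""
    bumps = [CHANGE_TO_BUMP[change] for change in changes]
    if not bumps:
        raise ValueError("a release must contain at least one change")
    return max(bumps, key=lambda bump: bump.value)


def next_version(current: Tuple[int, int, int], bump: Bump) -> Tuple[int, int, int]:
    """Apply a bump to a (major, minor, patch) tuple, resetting lower fields."""
    major, minor, patch = current
    if bump is Bump.MAJOR:
        return (major + 1, 0, 0)
    if bump is Bump.MINOR:
        return (major, minor + 1, 0)
    return (major, minor, patch + 1)
```

The table makes one consequence of the rules easy to see: an experimental package becoming stable is a minor bump, so adding stable metrics or logs does not by itself move the release to v2.0.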

Currently, OpenTelemetry does NOT have plans to backport bug and security fixes to prior minor versions. Security and bug fixes are only applied to the latest minor version. We are committed to making it feasible for end users to stay up to date with the latest version of OpenTelemetry.

## Long Term Support

![long term support](img/0143_long_term.png)

### API support

Major versions of the API will be supported for a minimum of **three years** after the release of the next major API version. Support covers the following areas.

API stability, as defined above, will be maintained.

A version of the SDK which supports the last major version of the API will continue to be maintained during this period. Bug and security fixes will be backported. Additional feature development is not guaranteed.

Contrib packages available when the API is versioned will continue to be maintained for the duration of this period. Bug and security fixes will be backported. Additional feature development is not guaranteed.

### SDK Support

SDK stability, as defined above, will be maintained for a minimum of **one year** after the release of the next major SDK version.

### Contrib Support

Contrib stability, as defined above, will be maintained for a minimum of **one year** after the release of the next major version of a contrib package.
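The support windows in this section are all measured from the release date of the *next* major version of the component. The sketch below computes the earliest date on which support for the previous major version may end. The function name, the component keys, and the dates used in the usage note are hypothetical; rounding a February 29 release forward to March 1 is a choice made here so the minimum period is never shortened.

```python
from datetime import date

# Minimum support periods, in years, taken from this section.
MINIMUM_SUPPORT_YEARS = {"api": 3, "sdk": 1, "contrib": 1}


def minimum_support_end(component: str, next_major_released: date) -> date:
    """Earliest end of support for the previous major version of `component`."""
    target_year = next_major_released.year + MINIMUM_SUPPORT_YEARS[component]
    try:
        return next_major_released.replace(year=target_year)
    except ValueError:
        # February 29 has no counterpart in a non-leap target year.
        return date(target_year, 3, 1)
```

For instance, if a hypothetical next major API version were released on 2025-06-01, the previous major API version would be supported until at least 2028-06-01, while an SDK released on the same day would carry its predecessor until at least 2026-06-01.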

## OpenTelemetry GA

The term “OpenTelemetry GA” refers to the point at which a stable version of both tracing and metrics has been released in at least three languages.
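The GA definition can be read as a simple predicate over the set of stable signals per language. The sketch below assumes that a language counts only when it has stable releases of both tracing and metrics; that reading, the function name, and the sample data in the usage note are assumptions.

```python
from typing import Dict, Iterable


def is_opentelemetry_ga(
    stable_signals_by_language: Dict[str, Iterable[str]],
    required_signals: Iterable[str] = ("tracing", "metrics"),
    minimum_languages: int = 3,
) -> bool:
    """True once enough languages have stable releases of every required signal."""
    required = set(required_signals)
    qualifying = [
        language
        for language, signals in stable_signals_by_language.items()
        if required <= set(signals)
    ]
    return len(qualifying) >= minimum_languages
```

With hypothetical data such as `{"java": ["tracing", "metrics"], "go": ["tracing"], "python": ["tracing", "metrics"]}`, the predicate is false because only two languages qualify.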
Binary file added text/img/0143_api_lifecycle.png
Binary file added text/img/0143_cross_cutting.png
Binary file added text/img/0143_long_term.png