Implement MicroProfile Fault Tolerance 4.1
This includes support for OpenTelemetry Metrics. For testing, this
commit uses SmallRye OpenTelemetry.
Ladicek committed Oct 4, 2024
1 parent b9afadb commit 1c2f023
Showing 29 changed files with 945 additions and 174 deletions.
4 changes: 2 additions & 2 deletions doc/antora.yml
@@ -10,7 +10,7 @@ asciidoc:
smallrye-fault-tolerance-version: '6.4.1'

microprofile-fault-tolerance: MicroProfile Fault Tolerance
microprofile-fault-tolerance-version: '4.0.2'
microprofile-fault-tolerance-url: https://download.eclipse.org/microprofile/microprofile-fault-tolerance-4.0/microprofile-fault-tolerance-spec-4.0.html
microprofile-fault-tolerance-version: '4.1'
microprofile-fault-tolerance-url: https://download.eclipse.org/microprofile/microprofile-fault-tolerance-4.1/microprofile-fault-tolerance-spec-4.1.html

vertx4-version: '4.5.8'
63 changes: 32 additions & 31 deletions doc/modules/ROOT/pages/integration/metrics.adoc
@@ -1,64 +1,65 @@
= Metrics

{smallrye-fault-tolerance} provides support for MicroProfile Metrics and Micrometer.
{smallrye-fault-tolerance} provides support for MicroProfile Metrics, OpenTelemetry and Micrometer.
Alternatively, metrics may be completely disabled at the integration level.

As usual, this integration is based on CDI.
{smallrye-fault-tolerance} includes an internal interface `MetricsProvider` and 3 different implementations.
Exactly 1 bean of type `MetricsProvider` must exist.
An instance of that bean is used to interact with the metrics system.
{smallrye-fault-tolerance} includes an internal interface `MetricsProvider` and these implementations:

There are 2 ways to select which metrics provider bean exists:
* `io.smallrye.faulttolerance.metrics.MicroProfileMetricsProvider`
* `io.smallrye.faulttolerance.metrics.OpenTelemetryProvider`
* `io.smallrye.faulttolerance.metrics.MicrometerProvider`
* `io.smallrye.faulttolerance.metrics.NoopProvider`
There are 2 possible ways how to integrate metrics:

* exactly 1 class from the list above is a bean;
* more than 1 class from the list above is a bean, in which case, `io.smallrye.faulttolerance.metrics.CompoundMetricsProvider` must also be a bean.
NOTE: Only the _names_ of the classes listed above are treated as public.
That is, the classes should be treated as opaque, no guarantees about their internals are made.
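The bean selection rule stated above (exactly one provider bean, or several provider beans plus `CompoundMetricsProvider`) can be sketched in plain Java. This is only an illustrative model that works on class names: the class `MetricsProviderSelectionSketch` and its method `requiredBeans` are hypothetical and are not part of {smallrye-fault-tolerance}, and rejecting an empty selection is an assumption drawn from the two cases listed above.

```java
import java.util.Set;
import java.util.TreeSet;

// Illustrative model only: it works on class names and does not touch CDI.
// The class and method names here are hypothetical, not library API.
final class MetricsProviderSelectionSketch {
    static final String PACKAGE = "io.smallrye.faulttolerance.metrics.";
    static final String COMPOUND = PACKAGE + "CompoundMetricsProvider";

    // The provider class names from the list above.
    static final Set<String> PROVIDERS = Set.of(
            PACKAGE + "MicroProfileMetricsProvider",
            PACKAGE + "OpenTelemetryProvider",
            PACKAGE + "MicrometerProvider",
            PACKAGE + "NoopProvider");

    // Returns the class names that must be beans for the given provider selection.
    static Set<String> requiredBeans(Set<String> selected) {
        if (selected.isEmpty()) {
            // Assumption: the text only allows "exactly 1" or "more than 1".
            throw new IllegalArgumentException("at least one metrics provider must be a bean");
        }
        for (String name : selected) {
            if (!PROVIDERS.contains(name)) {
                throw new IllegalArgumentException("unknown metrics provider: " + name);
            }
        }
        Set<String> beans = new TreeSet<>(selected);
        if (selected.size() > 1) {
            // More than one provider: the compound provider must also be a bean.
            beans.add(COMPOUND);
        }
        return beans;
    }

    public static void main(String[] args) {
        System.out.println(requiredBeans(Set.of(
                PACKAGE + "OpenTelemetryProvider",
                PACKAGE + "MicrometerProvider")));
    }
}
```

An integrator that alters the set of discovered types would apply the same rule to real bean classes instead of strings.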

== Default Integration

- using a constructor of the Portable Extension,
- altering the set of discovered types.
In case the integrator uses the CDI Portable Extension `FaultToleranceExtension` and lets the container create an instance, metrics presence is discovered automatically.
All present metrics systems are used.

== Using a `FaultToleranceExtension` Constructor

In case the integrator uses the CDI Portable Extension `FaultToleranceExtension` and creates its instance manually, they can use a constructor.

In addition to a zero-parameter constructor, there's a constructor that takes a `MetricsIntegration` parameter.
In addition to a zero-parameter constructor, which is used in the default integration as described above, there are constructors that take a parameter of `MetricsIntegration` or `Set<MetricsIntegration>`.

`MetricsIntegration` is an enum with these values:

* `MICROPROFILE_METRICS`: use MicroProfile Metrics integration
* `OPENTELEMETRY`: use OpenTelemetry (MicroProfile Telemetry) integration
* `MICROMETER`: use Micrometer integration
* `NOOP`: no metrics

As mentioned above, this is only useful if the integrator creates an instance of the extension themselves.
If the integrator relies on the CDI container to discover and instantiate the extension, the zero-parameter constructor is used, which defaults to `MICROPROFILE_METRICS`.
Such integrator can use the 2nd approach of altering the set of discovered types.

== Altering the Set of Discovered Types

The integrator may select the metrics provider by making sure that the correct implementation is discovered during CDI type discovery.
The existing metrics providers are:

* `io.smallrye.faulttolerance.metrics.MicroProfileMetricsProvider`
* `io.smallrye.faulttolerance.metrics.MicrometerProvider`
* `io.smallrye.faulttolerance.metrics.NoopProvider`

NOTE: Only the _names_ of the classes listed above are treated as public.
That is, the classes should be treated as opaque, no guarantees about their internals are made.

Exactly one of these classes must be discovered during CDI type discovery.

NOTE: Integrators that rely on the CDI container to instantiate `FaultToleranceExtension` must be aware that in this case, the extension adds `MicroProfileMetricsProvider` to the set of discovered types.
If they want to use a different metrics provider, they need to veto the `MicroProfileMetricsProvider` type.

== Metrics Providers

Metrics providers have additional requirements, as described below.

=== MicroProfile Metrics

If MicroProfile Metrics are used, the integrator must ensure that the following artifacts are present:
If MicroProfile Metrics should be used, the integrator must ensure that the following artifacts are present:

* `org.eclipse.microprofile.metrics:microprofile-metrics-api`;
* some implementation of MicroProfile Metrics.

=== OpenTelemetry

If OpenTelemetry should be used, the integrator must ensure that the following artifact is present:

* `io.opentelemetry:opentelemetry-api`.

Further, a bean of type `io.opentelemetry.api.metrics.Meter` must exist.
This bean is used to emit the actual metrics.

=== Micrometer

If Micrometer is used, the integrator must ensure that the following artifact is present:
If Micrometer should be used, the integrator must ensure that the following artifact is present:

* `io.micrometer:micrometer-core`.

7 changes: 4 additions & 3 deletions doc/modules/ROOT/pages/integration/programmatic-api.adoc
@@ -40,7 +40,8 @@ After `StandaloneFaultTolerance.shutdown()`, it is not possible to reinitialize
=== Metrics

In the standalone implementation, MicroProfile Metrics make no sense, as that is exclusively based on CDI.
It is however possible to integrate with Micrometer.
It is however possible to integrate with OpenTelemetry or Micrometer.

The `Configuration.metricsAdapter()` method must be implemented and return an instance of `io.smallrye.faulttolerance.standalone.MicrometerAdapter`.
The constructor of `MicrometerAdapter` accepts the Micrometer registry (`MeterRegistry`) to which metrics shall be emitted.
The `Configuration.metricsAdapter()` method must be implemented and return an instance of `io.smallrye.faulttolerance.standalone.OpenTelemetryAdapter` or `io.smallrye.faulttolerance.standalone.MicrometerAdapter`.
The constructor of `OpenTelemetryAdapter` accepts the `Meter` to which metrics shall be emitted.
The constructor of `MicrometerAdapter` accepts the `MeterRegistry` to which metrics shall be emitted.
35 changes: 28 additions & 7 deletions doc/modules/ROOT/pages/reference/bulkhead.adoc
@@ -67,7 +67,10 @@ Bulkhead exposes the following metrics:
[cols="1,5"]
|===
| Name | `ft.bulkhead.calls.total`
| Type | `Counter`
| Type
a| * MP Metrics: `Counter`
* OpenTelemetry: `LongCounter`
* Micrometer: `Counter`
| Unit | None
| Description | The number of times the bulkhead logic was run. This is usually once per method call, but may be zero times if the circuit breaker or rate limit prevented execution or more than once if the method call was retried.
| Tags
@@ -78,7 +81,10 @@ a| * `method` - the fully qualified method name
[cols="1,5"]
|===
| Name | `ft.bulkhead.executionsRunning`
| Type | `Gauge<Long>`
| Type
a| * MP Metrics: `Gauge<Long>`
* OpenTelemetry: `LongUpDownCounter`
* Micrometer: `Gauge`
| Unit | None
| Description | Number of currently running executions.
| Tags
@@ -88,7 +94,10 @@ a| * `method` - the fully qualified method name
[cols="1,5"]
|===
| Name | `ft.bulkhead.executionsWaiting`
| Type | `Gauge<Long>`
| Type
a| * MP Metrics: `Gauge<Long>`
* OpenTelemetry: `LongUpDownCounter`
* Micrometer: `Gauge`
| Unit | None
| Description | Number of executions currently waiting in the queue.
| Tags
@@ -99,8 +108,14 @@ a| * `method` - the fully qualified method name
[cols="1,5"]
|===
| Name | `ft.bulkhead.runningDuration`
| Type | `Histogram`
| Unit | Nanoseconds
| Type
a| * MP Metrics: `Histogram`
* OpenTelemetry: `DoubleHistogram` with explicit bucket boundaries `[0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10]`
* Micrometer: `Timer`
| Unit
a| * MP Metrics: nanoseconds
* OpenTelemetry: seconds
* Micrometer: nanoseconds
| Description | Histogram of the time that method executions spent running.
| Tags
a| * `method` - the fully qualified method name
@@ -109,8 +124,14 @@ a| * `method` - the fully qualified method name
[cols="1,5"]
|===
| Name | `ft.bulkhead.waitingDuration`
| Type | `Histogram`
| Unit | Nanoseconds
| Type
a| * MP Metrics: `Histogram`
* OpenTelemetry: `DoubleHistogram` with explicit bucket boundaries `[0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10]`
* Micrometer: `Timer`
| Unit
a| * MP Metrics: nanoseconds
* OpenTelemetry: seconds
* Micrometer: nanoseconds
| Description | Histogram of the time that method executions spent waiting in the queue.
| Tags
a| * `method` - the fully qualified method name
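The duration rows in the bulkhead tables above report nanoseconds for MicroProfile Metrics and Micrometer, but seconds for OpenTelemetry, together with explicit bucket boundaries. A minimal sketch of what that means for a single measurement follows. The class `DurationBucketSketch` and its methods are hypothetical and are not library code. The bucket lookup assumes the usual OpenTelemetry convention that each boundary is an inclusive upper bound.

```java
// Illustrative only: converts a nanosecond measurement to seconds and locates
// its bucket among the explicit boundaries listed in the tables above.
final class DurationBucketSketch {
    // Bucket boundaries in seconds, copied from the OpenTelemetry rows above.
    static final double[] BOUNDARIES = {
            0.005, 0.01, 0.025, 0.05, 0.075, 0.1, 0.25, 0.5, 0.75, 1, 2.5, 5, 7.5, 10
    };

    // Nanoseconds (MP Metrics, Micrometer) to seconds (OpenTelemetry).
    static double nanosToSeconds(long nanos) {
        return nanos / 1_000_000_000.0;
    }

    // Index of the bucket that receives the value. Assumes inclusive upper bounds.
    // The last index, BOUNDARIES.length, is the overflow bucket above 10 seconds.
    static int bucketIndex(double seconds) {
        for (int i = 0; i < BOUNDARIES.length; i++) {
            if (seconds <= BOUNDARIES[i]) {
                return i;
            }
        }
        return BOUNDARIES.length;
    }

    public static void main(String[] args) {
        long measuredNanos = 250_000_000L; // a 250 ms execution
        double seconds = nanosToSeconds(measuredNanos);
        System.out.println(seconds + " s falls into bucket " + bucketIndex(seconds));
    }
}
```

The same measurement is therefore recorded as `250000000` in a nanosecond-based metric and as `0.25` in the OpenTelemetry histogram.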
25 changes: 20 additions & 5 deletions doc/modules/ROOT/pages/reference/circuit-breaker.adoc
@@ -131,7 +131,10 @@ Circuit breaker exposes the following metrics:
[cols="1,5"]
|===
| Name | `ft.circuitbreaker.calls.total`
| Type | `Counter`
| Type
a| * MP Metrics: `Counter`
* OpenTelemetry: `LongCounter`
* Micrometer: `Counter`
| Unit | None
| Description | The number of times the circuit breaker logic was run. This is usually once per method call, but may be more than once if the method call is retried.
| Tags
@@ -145,8 +148,14 @@ a| * `method` - the fully qualified method name
[cols="1,5"]
|===
| Name | `ft.circuitbreaker.state.total`
| Type | `Gauge<Long>`
| Unit | Nanoseconds
| Type
a| * MP Metrics: `Gauge<Long>`
* OpenTelemetry: `LongCounter`
* Micrometer: `TimeGauge`
| Unit
a| * MP Metrics: nanoseconds
* OpenTelemetry: nanoseconds
* Micrometer: nanoseconds
| Description | Amount of time the circuit breaker has spent in each state
| Tags
a| * `method` - the fully qualified method name
@@ -157,7 +166,10 @@ a| * `method` - the fully qualified method name
[cols="1,5"]
|===
| Name | `ft.circuitbreaker.opened.total`
| Type | `Counter`
| Type
a| * MP Metrics: `Counter`
* OpenTelemetry: `LongCounter`
* Micrometer: `Counter`
| Unit | None
| Description | Number of times the circuit breaker has moved from closed state to open state
| Tags
@@ -169,7 +181,10 @@ a| * `method` - the fully qualified method name
| Name | `ft.circuitbreaker.state.current`
2+a|
include::partial$srye-feature.adoc[]
| Type | `Gauge<Long>` (`0` or `1`)
| Type
a| * MP Metrics: `Gauge<Long>`
* OpenTelemetry: `LongUpDownCounter`
* Micrometer: `Gauge`
| Unit | None
| Description | Whether the circuit breaker is currently in given state (`1`) or not (`0`)
| Tags
103 changes: 11 additions & 92 deletions doc/modules/ROOT/pages/reference/metrics.adoc
@@ -1,6 +1,6 @@
= Metrics

{smallrye-fault-tolerance} exposes metrics to MicroProfile Metrics, as {microprofile-fault-tolerance-url}#_integration_with_microprofile_metrics[specified] by {microprofile-fault-tolerance}.
{smallrye-fault-tolerance} exposes metrics, as {microprofile-fault-tolerance-url}#_integration_with_microprofile_metrics_and_microprofile_telemetry[specified] by {microprofile-fault-tolerance}.
[[general]]
== General Metrics
@@ -10,7 +10,10 @@ For all methods guarded with some fault tolerance strategy, the following metric
[cols="1,5"]
|===
| Name | `ft.invocations.total`
| Type | `Counter`
| Type
a| * MP Metrics: `Counter`
* OpenTelemetry: `LongCounter`
* Micrometer: `Counter`
| Unit | None
| Description | The number of times the method was called.
| Tags
@@ -42,7 +45,10 @@ The behavior of the timer thread can be observed through the following metrics:
[cols="1,5"]
|===
| Name | `ft.timer.scheduled`
| Type | `Gauge<Integer>`
| Type
a| * MP Metrics: `Gauge<Integer>`
* OpenTelemetry: `LongUpDownCounter`
* Micrometer: `Gauge`
| Unit | None
| Description | The number of tasks that are currently scheduled (for future execution) on the timer.
| Tags
@@ -51,95 +57,8 @@ a| * `id` - the ID of the timer, to distinguish multiple timers in a multi-appli

== Micrometer Support

In addition to the MicroProfile Metrics support, {smallrye-fault-tolerance} also provides support for https://micrometer.io/[Micrometer].
The set of metrics emitted to Micrometer is the same as the set of metrics emitted to MicroProfile Metrics, using the same metric names and tags.
Metric types are mapped as closely as possible:

|===
| Name | MicroProfile Metrics | Micrometer | Note

| `ft.invocations.total`
| counter
| counter
|

| `ft.retry.calls.total`
| counter
| counter
|

| `ft.retry.retries.total`
| counter
| counter
|

| `ft.timeout.calls.total`
| counter
| counter
|

| `ft.timeout.executionDuration`
| histogram
| timer
|

| `ft.circuitbreaker.calls.total`
| counter
| counter
|

| `ft.circuitbreaker.state.total`
| gauge
| time gauge
|

| `ft.circuitbreaker.state.current`
| gauge
| gauge
| *

| `ft.circuitbreaker.opened.total`
| counter
| counter
|

| `ft.bulkhead.calls.total`
| counter
| counter
|

| `ft.bulkhead.executionsRunning`
| gauge
| gauge
|

| `ft.bulkhead.executionsWaiting`
| gauge
| gauge
|

| `ft.bulkhead.runningDuration`
| histogram
| timer
|

| `ft.bulkhead.waitingDuration`
| histogram
| timer
|

| `ft.ratelimit.calls.total`
| counter
| counter
| *

| `ft.timer.scheduled`
| gauge
| gauge
| *
|===

{empty}* This is a {smallrye-fault-tolerance} feature, not specified by {microprofile-fault-tolerance}.
In addition to the MicroProfile Metrics and OpenTelemetry support (as specified by {microprofile-fault-tolerance}), {smallrye-fault-tolerance} also provides support for https://micrometer.io/[Micrometer].
The set of metrics emitted to Micrometer is the same, using the same metric names and tags.

Note that distribution summaries in Micrometer, including timers, do not emit quantiles by default.
Micrometer recommends that libraries should not configure them out of the box, so if you need them, you should use a `MeterFilter`.
2 changes: 1 addition & 1 deletion doc/modules/ROOT/pages/reference/programmatic-api.adoc
@@ -350,7 +350,7 @@ private static final FaultTolerance<String> guarded = FaultTolerance.<String>cre
<1> A description of `hello` is set, it will be used as a value of the `method` tag in all metrics.

It is possible to create multiple `FaultTolerance` objects with the same description.
In this case, it won't be possbile to distinguish the different `FaultTolerance` objects in metrics; their values will be aggregated.
In this case, it won't be possible to distinguish the different `FaultTolerance` objects in metrics; their values will be aggregated.

If no description is provided, a random UUID is used.
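The effect of sharing a description can be sketched with a small stand-in model. This is not the library's implementation: `DescriptionTagSketch`, `Guard` and their methods are hypothetical names used only to show how counters keyed by the `method` tag aggregate when two guards share a description, and how a random UUID keeps them apart when no description is given.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Illustrative model only: a counter per `method` tag value.
final class DescriptionTagSketch {
    private final Map<String, LongAdder> invocationsByMethodTag = new ConcurrentHashMap<>();

    // Hypothetical stand-in for a guarded action created through the programmatic API.
    final class Guard {
        private final String methodTag;

        Guard(String description) {
            // No description: fall back to a random UUID, as described above.
            this.methodTag = description != null ? description : UUID.randomUUID().toString();
        }

        String methodTag() {
            return methodTag;
        }

        // Records one invocation under this guard's `method` tag.
        void call() {
            invocationsByMethodTag.computeIfAbsent(methodTag, key -> new LongAdder()).increment();
        }
    }

    Guard guard(String description) {
        return new Guard(description);
    }

    // Number of invocations recorded under the given tag value, or 0 if none.
    long invocations(String methodTag) {
        LongAdder counter = invocationsByMethodTag.get(methodTag);
        return counter == null ? 0 : counter.sum();
    }

    public static void main(String[] args) {
        DescriptionTagSketch metrics = new DescriptionTagSketch();
        Guard first = metrics.guard("hello");
        Guard second = metrics.guard("hello");
        first.call();
        second.call();
        // Both guards contribute to the same counter, so they cannot be told apart.
        System.out.println("hello -> " + metrics.invocations("hello"));
    }
}
```

Giving each `FaultTolerance` object a distinct description avoids this aggregation.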

5 changes: 4 additions & 1 deletion doc/modules/ROOT/pages/reference/rate-limit.adoc
@@ -133,7 +133,10 @@ Rate limit exposes the following metrics:
[cols="1,5"]
|===
| Name | `ft.ratelimit.calls.total`
| Type | `Counter`
| Type
a| * MP Metrics: `Counter`
* OpenTelemetry: `LongCounter`
* Micrometer: `Counter`
| Unit | None
| Description | The number of times the rate limit logic was run. This is usually once per method call, but may be zero times if the circuit breaker prevented execution or more than once if the method call was retried.
| Tags