Merge branch 'main' into mx-psi/graduation-process
mx-psi authored Dec 17, 2024
2 parents cf39cdc + 58a5ffc commit 9a58deb
Showing 5 changed files with 16 additions and 7 deletions.
3 changes: 2 additions & 1 deletion docs/component-status.md
@@ -36,7 +36,7 @@ There is a finite state machine underlying the status reporting API that governs

![State Diagram](img/component-status-state-diagram.png)

-The finite state machine ensures that components progress through the lifecycle properly and it manages transitions through runtime states so that components do not need to track their state internally. Only changes in status result in new events being generated; repeat reports of the same status are ignored. PermanentError and FatalError are permanent runtime states. A component in these states cannot make any further state transitions.
+The finite state machine ensures that components progress through the lifecycle properly and it manages transitions through runtime states so that components do not need to track their state internally. Only changes in status result in new events being generated; repeat reports of the same status are ignored. PermanentError is a permanent runtime state. A component in a PermanentError state cannot transition to OK or RecoverableError, but it can transition to Stopping. FatalError is a final state. A component in a FatalError state cannot make any further state transitions.

![Status Event Generation](img/component-status-event-generation.png)

@@ -61,6 +61,7 @@ Under most circumstances, a component does not need to report explicit status du
**Runtime**

![Runtime State Diagram](img/component-status-runtime-states.png)

During runtime a component should not have to keep track of its state. A component should report status as operations succeed or fail and the finite state machine will handle the rest. Changes in status will result in new status events being emitted. Repeat reports of the same status will no-op. Similarly, attempts to make an invalid state transition, such as PermanentError to OK, will have no effect.

We intend to define guidelines to help component authors distinguish between recoverable and permanent errors on a per-component type basis and we'll update this document as we make decisions. See [this issue](https://github.com/open-telemetry/opentelemetry-collector/issues/9957) for current thoughts and discussions.
Binary file modified docs/img/component-status-state-diagram.png
4 changes: 4 additions & 0 deletions service/internal/graph/lifecycle_test.go
@@ -351,6 +351,8 @@ func TestStatusReportedOnStartupShutdown(t *testing.T) {
 				instanceIDs[eStErr]: {
 					componentstatus.NewEvent(componentstatus.StatusStarting),
 					componentstatus.NewPermanentErrorEvent(assert.AnError),
+					componentstatus.NewEvent(componentstatus.StatusStopping),
+					componentstatus.NewEvent(componentstatus.StatusStopped),
 				},
 			},
 			startupErr: assert.AnError,
@@ -362,6 +364,8 @@ func TestStatusReportedOnStartupShutdown(t *testing.T) {
 				instanceIDs[rStErr]: {
 					componentstatus.NewEvent(componentstatus.StatusStarting),
 					componentstatus.NewPermanentErrorEvent(assert.AnError),
+					componentstatus.NewEvent(componentstatus.StatusStopping),
+					componentstatus.NewEvent(componentstatus.StatusStopped),
 				},
 				instanceIDs[eNoErr]: {
 					componentstatus.NewEvent(componentstatus.StatusStarting),
6 changes: 4 additions & 2 deletions service/internal/status/status.go
@@ -70,8 +70,10 @@ func newFSM(onTransition onTransitionFunc) *fsm {
 				componentstatus.StatusFatalError: {},
 				componentstatus.StatusStopping: {},
 			},
-			componentstatus.StatusPermanentError: {},
-			componentstatus.StatusFatalError: {},
+			componentstatus.StatusPermanentError: {
+				componentstatus.StatusStopping: {},
+			},
+			componentstatus.StatusFatalError: {},
 			componentstatus.StatusStopping: {
 				componentstatus.StatusRecoverableError: {},
 				componentstatus.StatusPermanentError: {},
10 changes: 6 additions & 4 deletions service/internal/status/status_test.go
@@ -75,17 +75,19 @@ func TestStatusFSM(t *testing.T) {
 			expectedErrorCount: 2,
 		},
 		{
-			name: "PermanentError is terminal",
+			name: "PermanentError is stoppable",
 			reportedStatuses: []componentstatus.Status{
 				componentstatus.StatusStarting,
 				componentstatus.StatusOK,
 				componentstatus.StatusPermanentError,
 				componentstatus.StatusOK,
+				componentstatus.StatusStopping,
 			},
 			expectedStatuses: []componentstatus.Status{
 				componentstatus.StatusStarting,
 				componentstatus.StatusOK,
 				componentstatus.StatusPermanentError,
+				componentstatus.StatusStopping,
 			},
 			expectedErrorCount: 1,
 		},
@@ -154,7 +156,7 @@ func TestValidSeqsToStopped(t *testing.T) {
 	}

 	for _, ev := range events {
-		name := fmt.Sprintf("transition from: %s to: %s invalid", ev.Status(), componentstatus.StatusStopped)
+		name := fmt.Sprintf("transition from: %s to: %s", ev.Status(), componentstatus.StatusStopped)
 		t.Run(name, func(t *testing.T) {
 			fsm := newFSM(func(*componentstatus.Event) {})
 			if ev.Status() != componentstatus.StatusStarting {
@@ -165,9 +167,9 @@ func TestValidSeqsToStopped(t *testing.T) {
 			err := fsm.transition(componentstatus.NewEvent(componentstatus.StatusStopped))
 			require.ErrorIs(t, err, errInvalidStateTransition)

-			// stopping -> stopped is allowed for non-fatal, non-permanent errors
+			// stopping -> stopped is allowed for non-fatal errors
 			err = fsm.transition(componentstatus.NewEvent(componentstatus.StatusStopping))
-			if ev.Status() == componentstatus.StatusPermanentError || ev.Status() == componentstatus.StatusFatalError {
+			if ev.Status() == componentstatus.StatusFatalError {
 				require.ErrorIs(t, err, errInvalidStateTransition)
 			} else {
 				require.NoError(t, err)
