Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[jaeger-v2] Define an internal interface of storage v2 spanstore #5399

Merged
merged 18 commits into from
May 1, 2024
Merged
Show file tree
Hide file tree
Changes from 16 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 14 additions & 0 deletions storage_v2/empty_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
// Copyright (c) 2024 The Jaeger Authors.
// SPDX-License-Identifier: Apache-2.0

package storage_v2

import (
"testing"

"github.com/jaegertracing/jaeger/pkg/testutils"
)

func TestMain(m *testing.M) {
testutils.VerifyGoLeaks(m)
}
21 changes: 21 additions & 0 deletions storage_v2/factory.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,21 @@
// Copyright (c) 2024 The Jaeger Authors.
// SPDX-License-Identifier: Apache-2.0

package storage_v2

import (
"context"
)

// Factory is a general factory interface to be reused across different storage factories.
// It lives within the OTEL collector extension component's lifecycle.
// The Initialize and Close functions supposed to be called from the
// OTEL component's Start and Shutdown functions.
type Factory interface {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
type Factory interface {
type FactoryBase interface {

Even though it is namespaced separately from the actual factories, this interface itself is not a "Factory"

// Initialize performs internal initialization of the factory,
// such as opening connections to the backend store.
Initialize(ctx context.Context) error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've read through the discussions and the latest approach suggested that the Start function shouldn't throw error anymore and uses the TelemetrySettings.ReportStatus to report StatusFatalError to halt the startup asynchronously, cmiiw.

I'm thinking that any changes to the component lifecycles should be handled by the Jaeger storage extension, we can capture the errors from the storage factory in the extension and report them instead, so the storage factory interface should not be affected by it.

Copy link
Member

@yurishkuro yurishkuro May 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, this will be similar to what jaeger-v1 does today. However, jaeger-v1 has a failure mode today where if the db is not ready for any reason when the binary starts (even though the Jaeger configuration pointing to the db is correct), Jaeger fails to start and the binary exists. That's not how a component of a distributed system technically should be operating, i.e. a db temporarily not available at start-up vs. becoming temporarily not available during normal operation should not be treated as different conditions. We don't exit the binary if we lose a connection later, so why do we kill it if we cannot establish the connection during Init()?

If we were to fix this, we need to establish a different expectation from storage implementations, and this is where async runtime status reporting instead of returning an error from Init is coming from. This open-telemetry/opentelemetry-collector#9324 (comment) has a good explanation.


// Close closes the resources held by the factory
Close(ctx context.Context) error
}
14 changes: 14 additions & 0 deletions storage_v2/spanstore/empty_test.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
// Copyright (c) 2024 The Jaeger Authors.
// SPDX-License-Identifier: Apache-2.0

package spanstore

import (
"testing"

"github.com/jaegertracing/jaeger/pkg/testutils"
)

func TestMain(m *testing.M) {
testutils.VerifyGoLeaks(m)
}
20 changes: 20 additions & 0 deletions storage_v2/spanstore/factory.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,20 @@
// Copyright (c) 2024 The Jaeger Authors.
// SPDX-License-Identifier: Apache-2.0

package spanstore

import (
"github.com/jaegertracing/jaeger/storage_v2"
)

// Factory defines an interface for a factory that can create implementations of
// different span storage components.
type Factory interface {
storage_v2.Factory

// CreateTraceReader creates a spanstore.Reader.
CreateTraceReader() (Reader, error)

// CreateTraceWriter creates a spanstore.Writer.
CreateTraceWriter() (Writer, error)
}
71 changes: 71 additions & 0 deletions storage_v2/spanstore/reader.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
// Copyright (c) 2024 The Jaeger Authors.
// SPDX-License-Identifier: Apache-2.0

package spanstore

import (
"context"
"time"

"go.opentelemetry.io/collector/pdata/pcommon"
"go.opentelemetry.io/collector/pdata/ptrace"

spanstore_v1 "github.com/jaegertracing/jaeger/storage/spanstore"
)

// ErrTraceNotFound is returned by Reader's GetTrace if no data is found for given trace ID.
var ErrTraceNotFound = spanstore_v1.ErrTraceNotFound

// Reader finds and loads traces and other data from storage.
type Reader interface {
// GetTrace retrieves the trace with a given id.
//
// If no spans are stored for this trace, it returns ErrTraceNotFound.
yurishkuro marked this conversation as resolved.
Show resolved Hide resolved
GetTrace(ctx context.Context, traceID pcommon.TraceID) (*ptrace.Traces, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
GetTrace(ctx context.Context, traceID pcommon.TraceID) (*ptrace.Traces, error)
GetTrace(ctx context.Context, traceID pcommon.TraceID) (ptrace.Traces, error)


// GetServices returns all service names known to the backend from spans
// within its retention period.
GetServices(ctx context.Context) ([]string, error)

// GetOperations returns all operation names for a given service
// known to the backend from spans within its retention period.
GetOperations(ctx context.Context, query OperationQueryParameters) ([]Operation, error)

// FindTraces returns all traces matching query parameters. There's currently
// an implementation-dependent ambiguity whether all query filters (such as
// multiple tags) must apply to the same span within a trace, or can be satisfied
// by different spans.
//
// If no matching traces are found, the function returns (nil, nil).
FindTraces(ctx context.Context, query *TraceQueryParameters) ([]*ptrace.Traces, error)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
FindTraces(ctx context.Context, query *TraceQueryParameters) ([]*ptrace.Traces, error)
FindTraces(ctx context.Context, query *TraceQueryParameters) ([]ptrace.Traces, error)

if you return 30 traces, []*... requires 30 heap allocations for no practical gains.


// FindTraceIDs does the same search as FindTraces, but returns only the list
// of matching trace IDs.
//
// If no matching traces are found, the function returns (nil, nil).
FindTraceIDs(ctx context.Context, query *TraceQueryParameters) ([]pcommon.TraceID, error)
}

// TraceQueryParameters contains parameters of a trace query.
type TraceQueryParameters struct {
ServiceName string
OperationName string
Tags map[string]string
StartTimeMin time.Time
StartTimeMax time.Time
DurationMin time.Duration
DurationMax time.Duration
NumTraces int
}

// OperationQueryParameters contains parameters of query operations, empty spanKind means get operations for all kinds of span.
type OperationQueryParameters struct {
ServiceName string
SpanKind string
}

// Operation contains operation name and span kind
type Operation struct {
Name string
SpanKind string
}
19 changes: 19 additions & 0 deletions storage_v2/spanstore/writer.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
// Copyright (c) 2024 The Jaeger Authors.
// SPDX-License-Identifier: Apache-2.0

package spanstore

import (
"context"

"go.opentelemetry.io/collector/pdata/ptrace"
)

// Writer writes spans to storage.
type Writer interface {
// WriteTrace writes a batch of spans to storage. Idempotent.
// Implementations are not required to support atomic transactions,
// so if any of the spans fail to be written an error is returned.
// Compatible with OTLP Exporter API.
WriteTraces(ctx context.Context, td *ptrace.Traces) error
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTEL Exporter interface passes ptrace by Value, let's do the same:

func (exp *storageExporter) pushTraces(ctx context.Context, td ptrace.Traces) error {

}
Loading