Adjust settings for small-size instance
jmacd committed Sep 27, 2024
1 parent 70b26af commit 1eb704d
Showing 5 changed files with 79 additions and 19 deletions.
27 changes: 27 additions & 0 deletions .chloggen/otelarrow-defaults.yaml
@@ -0,0 +1,27 @@
# Use this changelog template to create an entry for release notes.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: enhancement

# The name of the component, or a single word describing the area of concern, (e.g. filelogreceiver)
component: otelarrowexporter

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: Adjust defaults for small instance size.

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [35477]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:

# If your change doesn't affect end users or the exported elements of any package,
# you should instead start your pull request title with [chore] or use the "Skip Changelog" label.
# Optional: The change log or logs in which this entry should be included.
# e.g. '[user]' or '[user, api]'
# Include 'user' if the change is relevant to end users.
# Include 'api' if there is a change to a library API.
# Default: '[user]'
change_logs: [user]
29 changes: 21 additions & 8 deletions exporter/otelarrowexporter/README.md
@@ -101,10 +101,23 @@ to standard OTLP.
- `disabled` (default: false): disables use of Arrow, causing the exporter to use standard OTLP
- `disable_downgrade` (default: false): prevents this exporter from using standard OTLP.

The following settings determine the resources that the exporter will use:
The following setting determines how long a stream will stay open.
Stream lifetime is limited to 30 seconds because compression benefit
is limited at that point and shorter streams make load balancing
easier.

- `num_streams` (default: number of CPUs): the number of concurrent Arrow streams
- `max_stream_lifetime` (default: unlimited): duration after which streams are recycled.
- `max_stream_lifetime` (default: 30s): duration after which streams
are recycled.

The following setting determines memory and CPU resources that the
exporter will use:

- `num_streams` (default: 1): the number of concurrent Arrow streams

The `num_streams` default limits the exporter to one stream, to limit
resources used by this component. Larger instances may wish to export
multiple streams in parallel, in which case `num_streams` can be
raised up to the number of available CPUs.
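
As a rough illustration of the guidance above, the sketch below picks a stream count that defaults to one and never exceeds the CPU count. `resolveNumStreams` is a hypothetical helper written for this example; the README describes the CPU count as a suggested upper bound, and nothing here implies the exporter enforces such a cap itself.

```go
package main

import (
	"fmt"
	"runtime"
)

// defaultNumStreams mirrors the documented default of one stream.
const defaultNumStreams = 1

// resolveNumStreams is a hypothetical helper, not part of the exporter.
// It returns the default when the setting is unset (<= 0) and otherwise
// caps the requested value at the number of available CPUs.
func resolveNumStreams(requested, numCPU int) int {
	if numCPU < 1 {
		numCPU = 1
	}
	if requested <= 0 {
		return defaultNumStreams
	}
	if requested > numCPU {
		return numCPU
	}
	return requested
}

func main() {
	cpus := runtime.NumCPU()
	fmt.Println("unset:", resolveNumStreams(0, cpus))
	fmt.Println("requested 4:", resolveNumStreams(4, cpus))
}
```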

When `num_streams` is greater than one, a configurable policy
determines how load is assigned across streams. The supported
@@ -229,12 +242,12 @@ The exporter supports configuring compression at the [Arrow
columnar-protocol
level](https://arrow.apache.org/docs/format/Columnar.html#format-ipc).

- `payload_compression`: compression applied at the Arrow IPC level, "none" by default, "zstd" supported.
- `payload_compression` (default "zstd"): compression applied at the Arrow IPC level.

Compression settings at the Arrow IPC level cannot be further
configured. We do not recommend configuring both payload and
gRPC-level compression at once, however these settings are
independent.
Compression at the Arrow level is enabled by default because it boosts
compression slightly and helps Arrow payloads meet gRPC maximum
request size limits. Compression settings at the Arrow IPC level
cannot be further configured.

For example, two exporters may be configured with multiple zstd
configurations, provided they use different levels:
11 changes: 4 additions & 7 deletions exporter/otelarrowexporter/factory.go
@@ -5,8 +5,6 @@ package otelarrowexporter // import "github.com/open-telemetry/opentelemetry-col

import (
"context"
"runtime"
"time"

arrowpb "github.com/open-telemetry/otel-arrow/api/experimental/arrow/v1"
"go.opentelemetry.io/collector/component"
@@ -59,15 +57,14 @@ func createDefaultConfig() component.Config {
BalancerName: "round_robin",
},
Arrow: ArrowConfig{
NumStreams: runtime.NumCPU(),
MaxStreamLifetime: time.Hour,
NumStreams: arrow.DefaultNumStreams,
MaxStreamLifetime: arrow.DefaultMaxStreamLifetime,

Zstd: zstd.DefaultEncoderConfig(),
Prioritizer: arrow.DefaultPrioritizer,

// PayloadCompression is off by default because gRPC
// compression is on by default, above.
PayloadCompression: "",
// Note the default payload compression is "zstd".
PayloadCompression: arrow.DefaultPayloadCompression,
},
}
}
7 changes: 3 additions & 4 deletions exporter/otelarrowexporter/factory_test.go
@@ -6,7 +6,6 @@ package otelarrowexporter
import (
"context"
"path/filepath"
"runtime"
"testing"
"time"

@@ -38,9 +37,9 @@ func TestCreateDefaultConfig(t *testing.T) {
assert.Equal(t, configcompression.TypeZstd, ocfg.Compression)
assert.Equal(t, ArrowConfig{
Disabled: false,
NumStreams: runtime.NumCPU(),
MaxStreamLifetime: time.Hour,
PayloadCompression: "",
NumStreams: 1,
MaxStreamLifetime: 30 * time.Second,
PayloadCompression: "zstd",
Zstd: zstd.DefaultEncoderConfig(),
Prioritizer: arrow.DefaultPrioritizer,
}, ocfg.Arrow)
24 changes: 24 additions & 0 deletions exporter/otelarrowexporter/internal/arrow/exporter.go
@@ -14,6 +14,7 @@ import (
arrowpb "github.com/open-telemetry/otel-arrow/api/experimental/arrow/v1"
arrowRecord "github.com/open-telemetry/otel-arrow/pkg/otel/arrow_record"
"go.opentelemetry.io/collector/component"
"go.opentelemetry.io/collector/config/configcompression"
"go.opentelemetry.io/collector/pdata/plog"
"go.opentelemetry.io/collector/pdata/pmetric"
"go.opentelemetry.io/collector/pdata/ptrace"
@@ -27,6 +28,29 @@ import (
"github.com/open-telemetry/opentelemetry-collector-contrib/internal/otelarrow/netstats"
)

// Default settings should use relatively few resources, so that
// users are required to explicitly configure large instances.
var (
// DefaultNumStreams is 1 which limits network and memory
// resources used by this component.
DefaultNumStreams = 1

// DefaultMaxStreamLifetime is 30 seconds, because the
// marginal compression benefit of a longer OTel-Arrow stream
// is limited after 100s of batches.
DefaultMaxStreamLifetime = 30 * time.Second

// DefaultPayloadCompression is "zstd" so that Arrow IPC
// payloads use Arrow-configured Zstd over the payload
// independently of whatever compression gRPC may have
// configured. This is on by default, achieving "double
// compression" because:
// (a) relatively cheap in CPU terms
// (b) minor compression benefit
// (c) helps stay under gRPC request size limits
DefaultPayloadCompression configcompression.Type = "zstd"
)

// Exporter is 1:1 with exporter, isolates arrow-specific
// functionality.
type Exporter struct {