Add decoupleafterbatch converter to ensure decouple processor follows batch processor #1255

Merged
merged 22 commits into from
Apr 22, 2024
Commits
3d7dde0
Always put decouple processor first in pipeline
nslaughter Apr 10, 2024
728e681
Add converter to derive processors from a base
nslaughter Apr 10, 2024
f4b76d3
Remove scratchpad code
nslaughter Apr 10, 2024
ffab596
implement rules and test
nslaughter Apr 12, 2024
25e2092
update tests
nslaughter Apr 14, 2024
d543b2c
improve tests for reviewers
nslaughter Apr 14, 2024
e2dd81c
Merge branch 'open-telemetry:main' into enhancement/decouple-first
nslaughter Apr 16, 2024
4c51018
fix toggle for append predicate
nslaughter Apr 16, 2024
38e8f13
Fix typo in function comment
nslaughter Apr 16, 2024
e35b344
Document converter and auto-configuration
nslaughter Apr 16, 2024
b536046
Document converter and auto-configuration
nslaughter Apr 16, 2024
a10b9e1
rm errant test
nslaughter Apr 16, 2024
65ce10f
Add tests to clarify decouple->batch ill-formed chain
nslaughter Apr 17, 2024
ec45851
Fix typo in test case description
nslaughter Apr 17, 2024
7b236f3
Improve name of predicate/helper
nslaughter Apr 17, 2024
1b21b0b
Update collector/processor/decoupleprocessor/README.md
nslaughter Apr 17, 2024
a818179
gofmt -s -w .
nslaughter Apr 17, 2024
9b72b9e
restructure tests to extend coverage
nslaughter Apr 17, 2024
d88a4e2
go mod tidy
nslaughter Apr 17, 2024
d8d5062
Update collector/internal/confmap/converter/decoupleafterbatchconvert…
nslaughter Apr 17, 2024
5038809
Add auto-config explaination to Collector
nslaughter Apr 22, 2024
9db6767
Merge branch 'main' into enhancement/decouple-first
nslaughter Apr 22, 2024
66 changes: 12 additions & 54 deletions collector/README.md
@@ -90,68 +90,26 @@ from an S3 object using a CloudFormation template:

Loading configuration from S3 will require that the IAM role attached to your function includes read access to the relevant bucket.

## Auto-Configuration

Using the batch processor without the decouple processor can lead to performance issues and to telemetry being lost when the Lambda environment is frozen. The OpenTelemetry Lambda Layer therefore automatically appends the decouple processor to the end of a pipeline's processor chain when the batch processor is used and no decouple processor follows it.

# Improving Lambda response times
At the end of a lambda function's execution, the OpenTelemetry client libraries will flush any pending spans/metrics/logs
to the collector before returning control to the Lambda environment. The collector's pipelines are synchronous and this
means that the response of the lambda function is delayed until the data has been exported.
This delay can potentially last hundreds of milliseconds.

To overcome this problem the [decouple](./processor/decoupleprocessor/README.md) processor can be used to separate the
two ends of the collector's pipeline and allow the lambda function to complete while ensuring that any data is exported
before the Lambda environment is frozen.
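
The idea of separating the two ends of the pipeline can be illustrated with a toy Go sketch. This is not the decouple processor's implementation: the `decoupler` type, its `Consume` and `Drain` methods, and the string payloads are hypothetical names chosen for illustration. It assumes a buffered channel is a reasonable stand-in for the hand-off, so the caller returns once an item is queued while a background goroutine performs a simulated export.

```go
package main

import (
	"fmt"
	"sync"
)

// decoupler is a hypothetical stand-in for the decoupling idea: the receiving
// side only queues data, and a background goroutine performs the export.
type decoupler struct {
	queue    chan string
	wg       sync.WaitGroup
	exported []string
}

func newDecoupler(size int) *decoupler {
	d := &decoupler{queue: make(chan string, size)}
	d.wg.Add(1)
	go func() {
		defer d.wg.Done()
		for item := range d.queue {
			// Stands in for a slow network export.
			d.exported = append(d.exported, item)
		}
	}()
	return d
}

// Consume returns as soon as the item is queued (it blocks only if the
// buffer is full), so the caller does not wait for the export.
func (d *decoupler) Consume(item string) {
	d.queue <- item
}

// Drain closes the queue and waits until every queued item has been
// exported, mirroring the need to finish before the environment is frozen.
func (d *decoupler) Drain() []string {
	close(d.queue)
	d.wg.Wait()
	return d.exported
}

func main() {
	d := newDecoupler(8)
	d.Consume("span-1")
	d.Consume("span-2")
	fmt.Println(d.Drain())
}
```

In this sketch the function's response would be sent after `Consume` returns, and `Drain` plays the role of the final flush that must complete before the freeze.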

Below is a sample configuration that uses the decouple processor:
```yaml
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  logging:
    loglevel: debug
  otlp:
    endpoint: { backend endpoint }

processors:
  decouple:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [decouple]
      exporters: [logging, otlp]
```
See the section regarding auto-configuration above. You don't need to manually add the decouple processor to your configuration.

## Reducing Lambda runtime
If your lambda function is invoked frequently it is also possible to pair the decouple processor with the batch
processor to reduce total lambda execution time at the expense of delaying the export of OpenTelemetry data.
When used with the batch processor the decouple processor must be the last processor in the pipeline to ensure that data
is successfully exported before the lambda environment is frozen.

An example use of the batch and decouple processors:
```yaml
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  logging:
    loglevel: debug
  otlp:
    endpoint: { backend endpoint }

processors:
  decouple:
  batch:
    timeout: 5m

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch, decouple]
      exporters: [logging, otlp]
```
As stated in the auto-configuration section, the OpenTelemetry Lambda Layer automatically adds the decouple processor to the end of the processor chain if the batch processor is used and the decouple processor is not. The result is the same whether or not you configure it manually.
1 change: 1 addition & 0 deletions collector/go.mod
@@ -20,6 +20,7 @@ replace cloud.google.com/go => cloud.google.com/go v0.107.0

require (
github.com/golang-collections/go-datastructures v0.0.0-20150211160725-59788d5eb259
github.com/google/go-cmp v0.6.0
github.com/open-telemetry/opentelemetry-collector-contrib/confmap/provider/s3provider v0.92.0
github.com/open-telemetry/opentelemetry-lambda/collector/lambdacomponents v0.91.0
github.com/open-telemetry/opentelemetry-lambda/collector/lambdalifecycle v0.0.0-00010101000000-000000000000
3 changes: 2 additions & 1 deletion collector/internal/collector/collector.go
@@ -32,6 +32,7 @@ import (
"go.uber.org/zap/zapcore"

"github.com/open-telemetry/opentelemetry-lambda/collector/internal/confmap/converter/disablequeuedretryconverter"
"github.com/open-telemetry/opentelemetry-lambda/collector/internal/confmap/converter/decoupleafterbatchconverter"
)

// Collector runs a single otelcol as a go routine within the
@@ -68,7 +69,7 @@ func NewCollector(logger *zap.Logger, factories otelcol.Factories, version strin
ResolverSettings: confmap.ResolverSettings{
URIs: []string{getConfig(l)},
Providers: mapProvider,
Converters: []confmap.Converter{expandconverter.New(), disablequeuedretryconverter.New()},
Converters: []confmap.Converter{expandconverter.New(), disablequeuedretryconverter.New(), decoupleafterbatchconverter.New()},
},
}
cfgProvider, err := otelcol.NewConfigProvider(cfgSet)
@@ -0,0 +1,11 @@
# DecoupleAfterBatch Converter

The `DecoupleAfterBatch` converter automatically modifies the collector's configuration for the Lambda distribution. Its purpose is to ensure that a decouple processor is always present after a batch processor in a pipeline, in order to prevent potential data loss due to the Lambda environment being frozen.

## Behavior

The converter scans the collector's configuration and makes the following adjustments:

1. If a pipeline contains a batch processor with no decouple processor defined after it, the converter will automatically add a decouple processor to the end of the pipeline.

2. If a pipeline contains a batch processor with a decouple processor already defined after it or there is no batch processor defined, the converter will not make any changes to the pipeline configuration.
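
The two rules above can be sketched on a plain list of processor names. This is a minimal illustration rather than the converter's code: `ensureDecoupleAfterBatch` is a hypothetical helper, and it assumes named instances such as `batch/traces` should be matched by their base type.

```go
package main

import (
	"fmt"
	"strings"
)

// ensureDecoupleAfterBatch is a hypothetical helper, not part of the converter.
// It returns the processor list with "decouple" appended when the last batch
// processor is not followed by a decouple processor.
func ensureDecoupleAfterBatch(processors []string) []string {
	needsDecouple := false
	for _, name := range processors {
		// Named instances such as "batch/traces" share the base type "batch".
		switch strings.SplitN(name, "/", 2)[0] {
		case "batch":
			needsDecouple = true
		case "decouple":
			needsDecouple = false
		}
	}
	if !needsDecouple {
		return processors
	}
	// Copy so the caller's slice is never modified in place.
	out := make([]string, 0, len(processors)+1)
	out = append(out, processors...)
	return append(out, "decouple")
}

func main() {
	fmt.Println(ensureDecoupleAfterBatch([]string{"batch"}))             // rule 1: decouple is appended
	fmt.Println(ensureDecoupleAfterBatch([]string{"batch", "decouple"})) // rule 2: unchanged
	fmt.Println(ensureDecoupleAfterBatch([]string{"processor1"}))        // rule 2: unchanged
}
```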
@@ -0,0 +1,118 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

// Package decoupleafterbatchconverter implements a confmap.Converter that mutates
// Collector configurations so that a decouple processor follows the batch processor.
// It appends the decouple processor to the end of any processor chain in which the
// last batch processor is not followed by a decouple processor.
package decoupleafterbatchconverter

import (
	"context"
	"fmt"
	"strings"

	"go.opentelemetry.io/collector/confmap"
)

const (
	serviceKey        = "service"
	pipelinesKey      = "pipelines"
	processorsKey     = "processors"
	batchProcessor    = "batch"
	decoupleProcessor = "decouple"
)

type converter struct{}

// New returns a confmap.Converter that ensures a decouple processor follows the
// last batch processor in each pipeline.
func New() confmap.Converter {
	return &converter{}
}

func (c converter) Convert(_ context.Context, conf *confmap.Conf) error {
	serviceVal := conf.Get(serviceKey)
	service, ok := serviceVal.(map[string]interface{})
	if !ok {
		return nil
	}

	pipelinesVal, ok := service[pipelinesKey]
	if !ok {
		return nil
	}

	pipelines, ok := pipelinesVal.(map[string]interface{})
	if !ok {
		return nil
	}

	// accumulates updates over the pipelines and applies them
	// once all pipeline configs are processed
	updates := make(map[string]interface{})
	for telemetryType, pipelineVal := range pipelines {
		pipeline, ok := pipelineVal.(map[string]interface{})
		if !ok {
			continue
		}

		processorsVal, ok := pipeline[processorsKey]
		if !ok {
			continue
		}

		processors, ok := processorsVal.([]interface{})
		if !ok {
			continue
		}

		// accumulate config updates; keep iterating so that every pipeline
		// is checked, not only the first one that needs a decouple processor
		if shouldAppendDecouple(processors) {
			processors = append(processors, decoupleProcessor)
			updates[fmt.Sprintf("%s::%s::%s::%s", serviceKey, pipelinesKey, telemetryType, processorsKey)] = processors
		}
	}

	// apply all updates
	if len(updates) > 0 {
		if err := conf.Merge(confmap.NewFromStringMap(updates)); err != nil {
			return err
		}
	}

	return nil
}

// shouldAppendDecouple reports whether the last batch processor in the chain is
// not followed by a decouple processor. Convert uses the result to decide whether
// to append the decouple processor.
func shouldAppendDecouple(processors []interface{}) bool {
	var needsDecouple bool
	for _, processorVal := range processors {
		processor, ok := processorVal.(string)
		if !ok {
			continue
		}
		// named instances such as "batch/traces" are matched by their base type
		processorBaseName := strings.Split(processor, "/")[0]
		if processorBaseName == batchProcessor {
			needsDecouple = true
		} else if processorBaseName == decoupleProcessor {
			needsDecouple = false
		}
	}
	return needsDecouple
}
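
The update key built with `fmt.Sprintf` in `Convert` relies on `::` being confmap's key delimiter. A stdlib-only sketch of what such a path addresses is below. It only mimics the idea and does not call confmap: `expandKey` is a hypothetical helper, and the assumption is that a delimited key is treated as a path into nested maps.

```go
package main

import (
	"fmt"
	"strings"
)

// expandKey is a hypothetical helper that turns a "::"-delimited path into the
// nested map it addresses. It mimics the idea only; it is not confmap's code.
func expandKey(path string, value interface{}) map[string]interface{} {
	parts := strings.Split(path, "::")
	root := map[string]interface{}{}
	current := root
	// Create one nested map per intermediate path segment.
	for _, part := range parts[:len(parts)-1] {
		next := map[string]interface{}{}
		current[part] = next
		current = next
	}
	// The final segment holds the value itself.
	current[parts[len(parts)-1]] = value
	return root
}

func main() {
	key := fmt.Sprintf("%s::%s::%s::%s", "service", "pipelines", "traces", "processors")
	nested := expandKey(key, []interface{}{"batch", "decouple"})
	fmt.Println(nested)
}
```

Under that assumption, merging the update replaces only the `processors` list of the addressed pipeline and leaves its receivers and exporters untouched.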
@@ -0,0 +1,153 @@
// Copyright The OpenTelemetry Authors
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package decoupleafterbatchconverter

import (
	"context"
	"testing"

	"go.opentelemetry.io/collector/confmap"

	"github.com/google/go-cmp/cmp"
)

func TestConvert(t *testing.T) {
	// Since this really tests differences in input, it's easier to read cases
	// without the repeated definition of other fields in the config.
	baseConf := func(input []interface{}) *confmap.Conf {
		return confmap.NewFromStringMap(map[string]interface{}{
			"service": map[string]interface{}{
				"pipelines": map[string]interface{}{
					"traces": map[string]interface{}{
						"processors": input,
					},
				},
			},
		})
	}

	testCases := []struct {
		name     string
		input    *confmap.Conf
		expected *confmap.Conf
		err      error
	}{
		// This test is first because it illustrates the difference between two rules:
		// appending the decouple processor whenever batch is present, versus the approach
		// of this code, which appends it only when the last instance of batch is not
		// followed by a decouple processor.
		{
			name:     "batch then decouple in middle of chain",
			input:    baseConf([]interface{}{"processor1", "batch", "decouple", "processor2"}),
			expected: baseConf([]interface{}{"processor1", "batch", "decouple", "processor2"}),
		},
		{
			name:     "no service",
			input:    confmap.New(),
			expected: confmap.New(),
		},
		{
			name: "no pipelines",
			input: confmap.NewFromStringMap(
				map[string]interface{}{
					"service": map[string]interface{}{
						"extensions": map[string]interface{}{},
					},
				},
			),
			expected: confmap.NewFromStringMap(
				map[string]interface{}{
					"service": map[string]interface{}{
						"extensions": map[string]interface{}{},
					},
				},
			),
		},
		{
			name: "no processors in chain",
			input: confmap.NewFromStringMap(
				map[string]interface{}{
					"service": map[string]interface{}{
						"extensions": map[string]interface{}{},
						"pipelines": map[string]interface{}{
							"traces": map[string]interface{}{},
						},
					},
				},
			),
			expected: confmap.NewFromStringMap(map[string]interface{}{
				"service": map[string]interface{}{
					"extensions": map[string]interface{}{},
					"pipelines": map[string]interface{}{
						"traces": map[string]interface{}{},
					},
				},
			},
			),
		},
		{
			name:     "batch processor in singleton chain",
			input:    baseConf([]interface{}{"batch"}),
			expected: baseConf([]interface{}{"batch", "decouple"}),
		},
		{
			name:     "batch processor present twice",
			input:    baseConf([]interface{}{"batch", "processor1", "batch"}),
			expected: baseConf([]interface{}{"batch", "processor1", "batch", "decouple"}),
		},

		{
			name:     "batch processor not present",
			input:    baseConf([]interface{}{"processor1", "processor2"}),
			expected: baseConf([]interface{}{"processor1", "processor2"}),
		},
		{
			name:     "batch sandwiched between input no decouple",
			input:    baseConf([]interface{}{"processor1", "batch", "processor2"}),
			expected: baseConf([]interface{}{"processor1", "batch", "processor2", "decouple"}),
		},

		{
			name:     "batch and decouple input already present in correct position",
			input:    baseConf([]interface{}{"processor1", "batch", "processor2", "decouple"}),
			expected: baseConf([]interface{}{"processor1", "batch", "processor2", "decouple"}),
		},
		{
			name:     "decouple and batch",
			input:    baseConf([]interface{}{"decouple", "batch"}),
			expected: baseConf([]interface{}{"decouple", "batch", "decouple"}),
		},
		{
			name:     "decouple then batch mixed with others in the pipeline",
			input:    baseConf([]interface{}{"processor1", "decouple", "processor2", "batch", "processor3"}),
			expected: baseConf([]interface{}{"processor1", "decouple", "processor2", "batch", "processor3", "decouple"}),
		},
	}

	for _, tc := range testCases {
		t.Run(tc.name, func(t *testing.T) {
			conf := tc.input
			expected := tc.expected

			c := New()
			err := c.Convert(context.Background(), conf)
			if err != tc.err {
				t.Errorf("unexpected error converting: %v", err)
			}
			if diff := cmp.Diff(expected.ToStringMap(), conf.ToStringMap()); diff != "" {
				t.Errorf("Convert() mismatch: (-want +got):\n%s", diff)
			}
		})
	}
}