Optimisations - Round 3 #108

lquerel · 2023-01-26T22:42:02Z

This PR is a major update of the underlying infrastructure used to encode and decode OTLP to and from OTLP Arrow messages.
In the previous version, the AdaptiveSchema was used to dynamically adapt the schema for fields that can be either a dictionary or a primitive/binary type (e.g. Int32, String, Binary).

This PR introduces a more general system called RecordBuilderExt ("ext" for extension of the standard Arrow RecordBuilder). The RecordBuilderExt interprets special annotations attached to the Arrow Schema of OTEL entities. Two types of annotation are currently supported:

optional: qualifies a field as optional.
dictionary: qualifies a field as a dynamic dictionary that detects index overflow and reacts accordingly, either by selecting a better type for the index or by falling back to the value type of the dictionary.
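
To make the dictionary annotation more concrete, here is a minimal sketch of the decision it describes: pick the smallest index type that can address the observed number of distinct values, and give up on the dictionary once a cap is exceeded. It is not the RecordBuilderExt code. The names `indexKind`, `chooseIndex`, and the cap value are hypothetical and chosen only for illustration.

```go
package main

import "fmt"

// indexKind names the representation chosen for a dictionary-annotated field.
// Hypothetical type for this sketch; it does not mirror the PR's actual types.
type indexKind string

const (
	indexUint8  indexKind = "uint8"
	indexUint16 indexKind = "uint16"
	indexUint32 indexKind = "uint32"
	noDict      indexKind = "no dictionary (plain value type)"
)

// chooseIndex returns the smallest index type able to address `cardinality`
// distinct values. Above maxCardinality (an assumed, configurable cap) the
// field falls back to the dictionary's value type.
func chooseIndex(cardinality, maxCardinality int) indexKind {
	switch {
	case cardinality > maxCardinality:
		return noDict
	case cardinality <= 1<<8: // a uint8 index addresses 256 entries
		return indexUint8
	case cardinality <= 1<<16: // a uint16 index addresses 65536 entries
		return indexUint16
	default:
		return indexUint32
	}
}

func main() {
	const limit = 1_000_000 // hypothetical cap, not taken from the PR
	for _, n := range []int{10, 300, 70_000, 5_000_000} {
		fmt.Printf("%d distinct values -> %s\n", n, chooseIndex(n, limit))
	}
}
```

In a real builder the decision would presumably be re-evaluated as values are appended, so that an overflow detected mid-stream triggers a schema update rather than an error.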

This modification aims to improve the performance (CPU, memory allocation, compression) of small batches (<100) and, to a lesser extent, the performance of bigger batches.

Note: On the decoder side, we treat all fields as optional, as in protobuf. Logically some fields are mandatory, but at the protobuf level nothing prevents them from being absent. A default value is assigned to any missing field (as in pdata).
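
The decoder-side rule in the note above can be sketched as follows, assuming a simplified column model rather than the real Arrow arrays. `column` and `valueOrDefault` are hypothetical names; the point is only that a missing column or a null slot yields the zero value instead of an error.

```go
package main

import "fmt"

// column is a hypothetical stand-in for a decoded Arrow column: the values
// plus a validity flag per row. A nil *column models a field that is absent
// from the incoming schema.
type column[T any] struct {
	values []T
	valid  []bool
}

// valueOrDefault returns the value at row, or the zero value of T when the
// column is missing, the row is out of range, or the slot is null.
func valueOrDefault[T any](c *column[T], row int) T {
	var zero T
	if c == nil || row < 0 || row >= len(c.values) || row >= len(c.valid) || !c.valid[row] {
		return zero
	}
	return c.values[row]
}

func main() {
	names := &column[string]{
		values: []string{"GET /users", ""},
		valid:  []bool{true, false},
	}
	var missing *column[int64] // field not present in the schema

	fmt.Printf("%q\n", valueOrDefault(names, 0))
	fmt.Printf("%q\n", valueOrDefault(names, 1))
	fmt.Println(valueOrDefault(missing, 0))
}
```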

A new set of unit tests has been added to track dictionary overflow for metrics, logs, and traces.

Updated benchmark results have been included in the README.md file.

…06-optimization-round-3

pkg/otel/arrow_record/arrow_record.go

pkg/otel/arrow_record/producer.go

pkg/otel/common/otlp2/doc.go

pkg/otel/common/schema/builder/int.go

Co-authored-by: Joshua MacDonald <[email protected]>

…06-optimization-round-3

# Conflicts: # go.mod # go.sum

jmacd

I understand now how schema building and transformation works at the high level. Looks good. I noticed a few places where panic makes sense now. I filed #121 to make sure the collector won't die in case of panics.

pkg/arrow/from_struct.go

pkg/otel/common/schema/transform_node.go

Improve Record ShowStats output

f9b5cc9

lquerel linked an issue Jan 26, 2023 that may be closed by this pull request

Optimization - Round 3 #106

Closed

lquerel self-assigned this Jan 26, 2023

lquerel added the performance Performance improvement, benchmarks label Jan 26, 2023

lquerel added this to the Beta V2 milestone Jan 26, 2023

lquerel and others added 21 commits January 26, 2023 14:42

Merge branch 'main' into 106-optimization-round-3

d515efe

Fix profiler issue

8e83018

Merge remote-tracking branch 'origin/106-optimization-round-3' into 1…

3947f2c

…06-optimization-round-3

Extend AdaptiveSchema to support optional fields.

3a3cafc

Add support for boolean field.

e39ae99

Add support for u64, i64, binary fields.

506fecb

Add support for u32, and i32 fields.

891bf5a

Add support for timestamp field.

4a5dc4d

Add support for fixed size 8/16 binary fields.

22c59c7

Add support for map, boolean fields.

0b368c2

Add unit tests for AdaptiveSchema.

cacde45

Add logic to manage dictionary in the transformation tree (step 1).

432ffe7

Add logic to manage dictionary in the transformation tree (step 2).

1878c50

Add logic to manage dictionary in the transformation tree (step 3).

2aa327e

Add logic to manage dictionary in the transformation tree (last update).

aef87a7

Convert resource, scope, attributes, and any_value to RecordBuilderExt.

22f210f

Convert logs to RecordBuilderExt.

064cd73

All tests pass with logs based on RecordBuilderExt.

722eb5c

Refactor pkg/arrow2

aded12c

Migrate/Test traces to RecordBuilderExt

490fa5f

Migrate metrics to RecordBuilderExt (part 1/2)

e862a9a

jmacd self-requested a review February 17, 2023 21:52

jmacd reviewed Feb 17, 2023

lquerel and others added 2 commits February 17, 2023 15:11

Migrate metrics to RecordBuilderExt (part 2)

117cf49

Update pkg/otel/arrow_record/arrow_record.go

66db59d

Co-authored-by: Joshua MacDonald <[email protected]>

lquerel added 3 commits February 17, 2023 23:12

Update to take into account jcmad feedback

4226a55

Merge remote-tracking branch 'origin/106-optimization-round-3' into 1…

ba44371

…06-optimization-round-3

Add unaryrpc mode in logs and traces benchmark

a8a81c6

lquerel mentioned this pull request Mar 2, 2023

[Donation Proposal]: OTEL Arrow Adapter open-telemetry/community#1332

Closed

lquerel added 18 commits March 2, 2023 16:50

Add dataset stats for logs and traces

be82f93

Update trace benchmark to include all the different batch sizes

7e01577

Add a test to convert OTLP Attributes into/from OTLP Arrow Attributes.

580427c

Improve test to convert OTLP Attributes into/from OTLP Arrow Attributes.

7ece96e

Add test OTLP <--> OTLP Arrow conversion for gauges.

c52a6b0

Add test OTLP <--> OTLP Arrow conversion for sums.

b139267

Add test OTLP <--> OTLP Arrow conversion for summaries.

6a25dae

Add test OTLP <--> OTLP Arrow conversion for histograms.

cb4fd49

Add test OTLP <--> OTLP Arrow conversion for exponential histograms.

4b8cc86

Fix metrics tests

6be78b2

Replace metrics/arrow with metrics/arrow2

b15c8b4

Replace pkg/arrow with pkg/arrow2

26b6fed

Fix conversion issue for fields with zero values.

98ed14a

Fix all unit tests.

e2f79d3

Update benchmark results and improve tests.

43052a3

Merge remote-tracking branch 'origin/main' into 106-optimization-round-3

d2acaec

# Conflicts: # go.mod # go.sum

Merge with main

21eac24

Remove AdaptiveSchema and related tests.

1be2030

lquerel marked this pull request as ready for review March 17, 2023 18:18

Fix mem benchmark tool

3f177bb

jmacd mentioned this pull request Mar 17, 2023

Recover from panics in exporter and receiver code path #121

Closed

jmacd approved these changes Mar 17, 2023

pkg/arrow/from_struct.go (outdated)

pkg/otel/common/schema/transform_node.go

pkg/otel/common/schema/transform_node.go

lquerel added 2 commits March 17, 2023 13:37

Merge with main

56f1d2c

Merge with main

e7e9daa

lquerel merged commit 7c38029 into main Mar 18, 2023

jmacd deleted the 106-optimization-round-3 branch April 7, 2023 17:06
