Add performance testing blog post #3470

Merged
merged 17 commits into from
Nov 22, 2023
83 changes: 83 additions & 0 deletions content/en/blog/2023/perf-testing/index.md
@@ -0,0 +1,83 @@
---
title: OTel component performance benchmarks
linkTitle: Performance benchmarks
date: 2023-11-27
author: '[Martin Kuba](https://github.com/martinkuba) (Lightstep)'
cSpell:ignore: Kuba
---

As more and more users are looking to use OpenTelemetry instrumentation in their
production deployments, one important consideration is the impact that
OpenTelemetry will have on their application performance. In this blog post I
will discuss a few recent improvements in tooling around performance
benchmarking.

### Measuring performance overhead

Instrumentation is not free. It intercepts an application's operations and
collects an often large amount of data, which takes additional CPU and memory.
This can have a direct effect on throughput and response time, which can affect
the end-user experience with the application. It can also have an impact on
operational cost, such as increasing the number of instances a service runs on.

Providing general guidance about performance overhead is inherently difficult.
There are many factors that affect performance: the application throughput,
hardware the application runs on, what exactly is instrumented, how the
OpenTelemetry SDK is configured, sampling, etc. Ultimately, the best way to
measure performance is in the context of the specific application by running a
load test.
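As a rough illustration of that idea, the following Python sketch compares the median latency of a workload with and without a wrapper that records timing data on each call. Everything in it is an assumption made for illustration: `handle_request` and `instrumented` are hypothetical stand-ins, not the OpenTelemetry SDK, and a real load test would drive the actual application under realistic traffic.

```python
import statistics
import time


def handle_request(n=200):
    """Hypothetical stand-in for the application's work."""
    return sum(i * i for i in range(n))


def instrumented(func):
    """Hypothetical stand-in for instrumentation (not the OpenTelemetry SDK).

    Records a start and end timestamp for every call, which is the kind of
    extra work that adds CPU and memory overhead.
    """
    spans = []

    def wrapper(*args, **kwargs):
        start = time.perf_counter_ns()
        try:
            return func(*args, **kwargs)
        finally:
            spans.append(
                {"name": func.__name__, "start": start, "end": time.perf_counter_ns()}
            )

    wrapper.spans = spans
    return wrapper


def median_latency_ns(func, iterations=2000):
    """Return the median per-call latency of func in nanoseconds."""
    samples = []
    for _ in range(iterations):
        t0 = time.perf_counter_ns()
        func()
        samples.append(time.perf_counter_ns() - t0)
    return statistics.median(samples)


def measure_overhead(iterations=2000):
    """Measure the same workload with and without the stand-in wrapper."""
    baseline = median_latency_ns(handle_request, iterations)
    traced = instrumented(handle_request)
    with_instrumentation = median_latency_ns(traced, iterations)
    return {
        "baseline_ns": baseline,
        "instrumented_ns": with_instrumentation,
        "overhead_ns": with_instrumentation - baseline,
        "spans": len(traced.spans),
    }


if __name__ == "__main__":
    print(measure_overhead())
```

The numbers this prints depend heavily on the machine and the workload, which is the reason a measurement taken against the specific application is more useful than general guidance.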

With that said, a number of OpenTelemetry components include performance tests
that help catch regressions and can be used to provide some idea of their
performance characteristics.

### OpenTelemetry Collector

The [OpenTelemetry Collector](/docs/collector/) runs
[end-to-end load tests](https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/workflows/load-tests.yml)
on every merge to the main branch. There have been two recent updates to the CI
workflow:

1. Tests run on community-owned bare metal machines, which has made test results
   more consistent.
2. Test results are published automatically: for a subset of the load test
   results, see [Collector Benchmarks](/docs/collector/benchmarks/). The
   [complete test results](https://open-telemetry.github.io/opentelemetry-collector-contrib/benchmarks/loadtests/)
   are available as well.

### Language SDKs

A number of OpenTelemetry SDKs already include micro-benchmark tests,
for example:

- [SpanBenchmark.java](https://github.com/open-telemetry/opentelemetry-java/blob/main/sdk/trace/src/jmh/java/io/opentelemetry/sdk/trace/SpanBenchmark.java)
- [test_benchmark_trace.py](https://github.com/open-telemetry/opentelemetry-python/blob/main/opentelemetry-sdk/tests/performance/benchmarks/trace/test_benchmark_trace.py)
- [benchmark_test.go](https://github.com/open-telemetry/opentelemetry-go/blob/main/sdk/trace/benchmark_test.go)
- [benchmark/span.js](https://github.com/open-telemetry/opentelemetry-js/blob/main/packages/opentelemetry-sdk-trace-base/test/performance/benchmark/span.js)

These tests were run only on demand in the past. With the recent tooling
improvements, Java and JavaScript tests are now run automatically on every merge
to the main branch, and the results are published for anyone to easily access.
The tests are also run on community-owned bare metal machines, so that the
results are as consistent as possible.
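To give a sense of what a micro-benchmark with a regression check might look like, here is a minimal Python sketch using the standard library `timeit` module. It is not how the OpenTelemetry SDK benchmarks or CI workflows are implemented: `create_span` is a hypothetical stand-in for the operation under test, and the 10% tolerance is an arbitrary assumption.

```python
import timeit


def create_span(name, attributes):
    """Hypothetical operation under test; not an OpenTelemetry API."""
    return {"name": name, "attributes": dict(attributes)}


def benchmark_ns_per_op(func, number=10_000, repeat=5):
    """Time func and return nanoseconds per operation.

    Each round calls func `number` times. The fastest of `repeat` rounds is
    used because it is the round least disturbed by other activity on the
    machine.
    """
    rounds = timeit.repeat(func, number=number, repeat=repeat)
    return min(rounds) / number * 1e9


def is_regression(current_ns, baseline_ns, tolerance=0.10):
    """Return True when current is slower than baseline by more than tolerance."""
    return current_ns > baseline_ns * (1 + tolerance)


if __name__ == "__main__":
    result = benchmark_ns_per_op(lambda: create_span("op", {"key": "value"}))
    print(f"create_span: {result:.1f} ns/op")
```

A comparison like `is_regression` is only meaningful when the current and baseline numbers come from comparable hardware, which is why running the tests on dedicated bare metal machines helps.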

{{% figure
src="java-benchmark-results.png"
caption="Sample [benchmark results for Java](https://open-telemetry.github.io/opentelemetry-java/benchmarks/)"
%}}

{{% figure
src="js-benchmark-results.png"
caption="Sample [benchmark results for JavaScript](https://open-telemetry.github.io/opentelemetry-js/benchmarks/)"
%}}

There is work in progress to make the same updates for Python and Go.

### Conclusion

Performance optimization is often considered only as an afterthought, but it
does not have to be. We are making improvements to automated tooling and
documentation to provide project maintainers and the community with reliable
performance testing during development. Ultimately, our focus as a community is
to give end users confidence when using our components, especially around the
impact of OpenTelemetry's instrumentation on their applications’ performance.
16 changes: 16 additions & 0 deletions static/refcache.json
@@ -2439,6 +2439,10 @@
"StatusCode": 200,
"LastSeen": "2023-07-06T11:55:26.882609-07:00"
},
"https://github.com/martinkuba": {
"StatusCode": 200,
"LastSeen": "2023-11-04T11:32:20.86746-04:00"
},
"https://github.com/metrico/otel-collector": {
"StatusCode": 200,
"LastSeen": "2023-10-17T15:13:11.067528+02:00"
@@ -2615,6 +2619,10 @@
"StatusCode": 200,
"LastSeen": "2023-07-07T13:45:50.007391-07:00"
},
"https://github.com/open-telemetry/opentelemetry-collector-contrib/actions/workflows/load-tests.yml": {
"StatusCode": 200,
"LastSeen": "2023-11-04T11:32:21.428206-04:00"
},
"https://github.com/open-telemetry/opentelemetry-collector-contrib/issues/10116": {
"StatusCode": 200,
"LastSeen": "2023-06-30T08:43:50.226669-04:00"
@@ -4663,10 +4671,18 @@
"StatusCode": 206,
"LastSeen": "2023-07-06T12:14:49.802412-07:00"
},
"https://open-telemetry.github.io/opentelemetry-java/benchmarks/": {
"StatusCode": 206,
"LastSeen": "2023-11-04T11:32:21.536067-04:00"
},
"https://open-telemetry.github.io/opentelemetry-js": {
"StatusCode": 206,
"LastSeen": "2023-06-29T18:46:19.489479-04:00"
},
"https://open-telemetry.github.io/opentelemetry-js/benchmarks/": {
"StatusCode": 206,
"LastSeen": "2023-11-04T11:32:21.613865-04:00"
},
"https://open-telemetry.github.io/opentelemetry-js/benchmarks/data.js": {
"StatusCode": 206,
"LastSeen": "2023-10-03T11:24:52.148514-07:00"