From ce38468cfe90bb0da890dadecf281481cde68bc6 Mon Sep 17 00:00:00 2001 From: Gregor Zeitlinger Date: Thu, 18 Jul 2024 19:36:02 +0200 Subject: [PATCH 1/2] fix spring starter dependencies (#4849) Co-authored-by: opentelemetrybot <107717825+opentelemetrybot@users.noreply.github.com> --- content/en/docs/zero-code/java/_index.md | 4 ++-- .../zero-code/java/spring-boot-starter/getting-started.md | 8 ++++---- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/content/en/docs/zero-code/java/_index.md b/content/en/docs/zero-code/java/_index.md index 2005b37188c9..f6723a6ef0ca 100644 --- a/content/en/docs/zero-code/java/_index.md +++ b/content/en/docs/zero-code/java/_index.md @@ -6,8 +6,8 @@ aliases: - /docs/languages/java/automatic_instrumentation cascade: vers: - instrumentation: 2.5.0 - otel: 1.39.0 + instrumentation: 2.6.0 + otel: 1.40.0 --- Zero-code instrumentation with Java uses a Java agent JAR or Spring Boot diff --git a/content/en/docs/zero-code/java/spring-boot-starter/getting-started.md b/content/en/docs/zero-code/java/spring-boot-starter/getting-started.md index c4706aea998d..d1cad1c43a5c 100644 --- a/content/en/docs/zero-code/java/spring-boot-starter/getting-started.md +++ b/content/en/docs/zero-code/java/spring-boot-starter/getting-started.md @@ -25,9 +25,9 @@ A Bill of Material ([BOM](https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#bill-of-materials-bom-poms)) ensures that versions of dependencies (including transitive ones) are aligned. -Importing the `opentelemetry-bom` and `opentelemetry-instrumentation-bom-alpha` -BOMs when using the OpenTelemetry starter is important to ensure version -alignment across all OpenTelemetry dependencies. +Importing the `opentelemetry-bom` and `opentelemetry-instrumentation-bom` BOMs +when using the OpenTelemetry starter is important to ensure version alignment +across all OpenTelemetry dependencies. {{% alert title="Note" color="info" %}} @@ -91,7 +91,7 @@ plugins { dependencyManagement { imports { mavenBom("io.opentelemetry:opentelemetry-bom:{{% param vers.otel %}}") - mavenBom("io.opentelemetry.instrumentation:opentelemetry-instrumentation-bom-alpha:{{% param vers.instrumentation %}}") + mavenBom("io.opentelemetry.instrumentation:opentelemetry-instrumentation-bom:{{% param vers.instrumentation %}}") } } ``` From b8e63ff5bb9742aa09d3fc67d601c1e98c6ad333 Mon Sep 17 00:00:00 2001 From: Fabrizio Ferri-Benedetti Date: Fri, 19 Jul 2024 08:04:08 +0200 Subject: [PATCH 2/2] Add Performance doc to Java (#4828) Co-authored-by: Tiffany Hrabusa <30397949+tiffany76@users.noreply.github.com> Co-authored-by: opentelemetrybot <107717825+opentelemetrybot@users.noreply.github.com> Co-authored-by: Phillip Carter Co-authored-by: Jay DeLuca Co-authored-by: Trask Stalnaker --- content/en/docs/languages/java/performance.md | 194 ++++++++++++++++++ 1 file changed, 194 insertions(+) create mode 100644 content/en/docs/languages/java/performance.md diff --git a/content/en/docs/languages/java/performance.md b/content/en/docs/languages/java/performance.md new file mode 100644 index 000000000000..d47dae31e231 --- /dev/null +++ b/content/en/docs/languages/java/performance.md @@ -0,0 +1,194 @@ +--- +title: Performance +description: Performance reference for the OpenTelemetry Java agent +weight: 75 +--- + +The OpenTelemetry Java agent instruments your application by running inside the +same Java Virtual Machine (JVM). Like any other software agent, the Java agent +requires system resources like CPU, memory, and network bandwidth. The use of +resources by the agent is called agent overhead or performance overhead. The +OpenTelemetry Java agent has minimal impact on system performance when +instrumenting JVM applications, although the final agent overhead depends on +multiple factors. + +Some factors that might increase agent overhead are environmental, such as the +physical machine architecture, CPU frequency, amount and speed of memory, system +temperature, and resource contention. Other factors include virtualization and +containerization, the operating system and its libraries, the JVM version and +vendor, JVM settings, the algorithmic design of the software being monitored, +and software dependencies. + +Due to the complexity of modern software and the broad diversity in deployment +scenarios, it is impossible to come up with a single agent overhead estimate. To +find the overhead of any instrumentation agent in a given deployment, you have +to conduct experiments and collect measurements directly. Therefore, treat all +statements about performance as general information and guidelines that are +subject to evaluation in a specific system. + +The following sections describe the minimum requirements of the OpenTelemetry +Java agent, as well as potential constraints impacting performance, and +guidelines to optimize and troubleshoot the performance of the agent. + +## Guidelines to reduce agent overhead + +The following best practices and techniques might help reduce overhead caused by +the Java agent. + +### Configure trace sampling + +The volume of spans processed by the instrumentation might impact agent +overhead. You can configure trace sampling to adjust the span volume and reduce +resource usage. See [Sampling](/docs/languages/java/sampling). + +### Turn off specific instrumentations + +You can further reduce agent overhead by turning off instrumentations that +aren't needed or are producing too many spans. To turn off an instrumentation, +use `-Dotel.instrumentation..enabled=false` or the +`OTEL_INSTRUMENTATION__ENABLED` environment variable, where `` is +the name of the instrumentation. + +For example, the following option turns off the JDBC instrumentation: +`-Dotel.instrumentation.jdbc.enabled=false` + +### Allocate more memory for the application + +Increasing the maximum heap size of the JVM using the `-Xmx` option might +help in alleviating agent overhead issues, as instrumentations can generate a +large number of short-lived objects in memory. + +### Reduce manual instrumentation to what you need + +Too much manual instrumentation might introduce inefficiencies that increase +agent overhead. For example, using `@WithSpan` on every method results in a high +span volume, which in turn increases noise in the data and consumes more system +resources. + +### Provision adequate resources + +Make sure to provision enough resources for your instrumentation and for the +Collector. The amount of resources such as memory or disk depend on your +application architecture and needs. For example, a common setup is to run the +instrumented application on the same host as the OpenTelemetry Collector. In +that case, consider rightsizing the resources for the Collector and optimize its +settings. See [Scaling](/docs/collector/scaling/). + +## Constraints impacting the performance of the Java agent + +In general, the more telemetry you collect from your application, the greater +the the impact on agent overhead. For example, tracing methods that aren't +relevant to your application can still produce considerable agent overhead +because tracing such methods is computationally more expensive than running the +method itself. Similarly, high cardinality tags in metrics might increase memory +usage. Debug logging, if turned on, also increases write operations to disk and +memory usage. + +Some instrumentations, for example JDBC or Redis, produce high span volumes that +increase agent overhead. For more information on how to turn off unnecessary +instrumentations, see +[Turn off specific instrumentations](#turn-off-specific-instrumentations). + +> [!NOTE] Experimental features of the Java agent might increase agent overhead +> due to the experimental focus on functionality over performance. Stable +> features are safer in terms of agent overhead. + +## Troubleshooting agent overhead issues + +When troubleshooting agent overhead issues, do the following: + +- Check minimum requirements. See + [Prerequisites](/docs/languages/java/getting-started/#prerequisites). +- Use the latest compatible version of the Java agent. +- Use the latest compatible version of your JVM. + +Consider taking the following actions to decrease agent overhead: + +- If your application is approaching memory limits, consider giving it more + memory. +- If your application is using all the CPU, you might want to scale it + horizontally. +- Try turning off or tuning metrics. +- Tune trace sampling settings to reduce span volume. +- Turn off specific instrumentations. +- Review manual instrumentation for unnecessary span generation. + +## Guidelines for measuring agent overhead + +Measuring agent overhead in your own environment and deployments provides +accurate data about the impact of instrumentation on the performance of your +application or service. The following guidelines describe the general steps for +collecting and comparing reliable agent overhead measurements. + +### Decide what you want to measure + +Different users of your application or service might notice different aspects of +agent overhead. For example, while end users might notice degradation in service +latency, power users with heavy workloads pay more attention to CPU overhead. On +the other hand, users who deploy frequently, for example due to elastic +workloads, care more about startup time. + +Reduce your measurements to factors that are sure to impact user experience, so +your datasets don't contain irrelevant information. Some examples of +measurements include the following: + +- User average, user peak, and machine average CPU usage +- Total memory allocated and maximum heap used +- Garbage collection pause time +- Startup time in milliseconds +- Average and percentile 95 (p95) service latency +- Network read and write average throughput + +### Prepare a suitable test environment + +By measuring agent overhead in a controlled test environment you can better +identify the factors affecting performance. When preparing a test environment, +complete the following: + +1. Make sure that the configuration of the test environment resembles + production. +2. Isolate the application under test from other services that might interfere. +3. Turn off or remove all unnecessary system services on the application host. +4. Ensure that the application has enough system resources to handle the test + workload. + +### Create a battery of realistic tests + +Design the tests that you run against the test environment to resemble typical +workloads as much as possible. For example, if some REST API endpoints of your +service are susceptible to high request volumes, create a test that simulates +heavy network traffic. + +For Java applications, use a warm-up phase prior to starting measurements. The +JVM is a highly dynamic machine that performs a large number of optimizations +through just-in-time compilation (JIT). The warm-up phase helps the application +to finish most of its class loading and gives the JIT compiler time to run the +majority of optimizations. + +Make sure to run a large number of requests and to repeat the test pass many +times. This repetition helps to ensure a representative data sample. Include +error scenarios in your test data. Simulate an error rate similar to that of a +normal workload, typically between 2% and 10%. + +{{% alert title="Note" color="info" %}} + +Tests might increase costs when targeting observability backends and other +commercial services. Plan your tests accordingly or consider using alternative +solutions, such as self-hosted or locally run backends. + +{{% /alert %}} + +### Collect comparable measurements + +To identify which factors might be affecting performance and causing agent +overhead, collect measurements in the same environment after modifying a single +factor or condition. + +### Analyze the agent overhead data + +After collecting data from multiple passes, you can plot results in a chart or +compare averages using statistical tests to check for significant differences. + +Consider that different stacks, applications, and environments might result in +different operational characteristics and different agent overhead measurement +results.