diff --git a/docs/apm/apm-alerts.asciidoc b/docs/apm/apm-alerts.asciidoc index bc5e1ccc1dd55..7bdfe80b42177 100644 --- a/docs/apm/apm-alerts.asciidoc +++ b/docs/apm/apm-alerts.asciidoc @@ -18,12 +18,22 @@ image::apm/images/apm-alert.png[Create an alert in the APM app] For a walkthrough of the alert flyout panel, including detailed information on each configurable property, see Kibana's <>. -The APM app supports two different types of threshold alerts: transaction duration, and error rate. -Below, we'll create one of each. +The APM app supports four different types of alerts: + +* Transaction duration anomaly: +alerts when the service's transaction duration reaches a certain anomaly score +* Transaction duration threshold: +alerts when the service's transaction duration exceeds a given time limit over a given time frame +* Transaction error rate threshold: +alerts when the service's transaction error rate is above the selected rate over a given time frame +* Error count threshold: +alerts when the service exceeds a selected number of errors over a given time frame + +Below, we'll walk through the creation of two of these alerts. [float] [[apm-create-transaction-alert]] -=== Create a transaction duration alert +=== Example: create a transaction duration alert Transaction duration alerts trigger when the duration of a specific transaction type in a service exceeds a defined threshold. This guide will create an alert for the `opbeans-java` service based on the following criteria: @@ -57,9 +67,9 @@ Enter a name for the connector, and paste the webhook URL. See Slack's webhook documentation if you need to create one. -Add a message body in markdown format. +A default message is provided as a starting point for your alert. You can use the https://mustache.github.io/[Mustache] template syntax, i.e., `{{variable}}` -to pass alert values at the time a condition is detected to an action. +to pass additional alert values at the time a condition is detected to an action. 
A list of available variables can be accessed by selecting the **add variable** button image:apm/images/add-variable.png[add variable button]. @@ -67,7 +77,7 @@ Select **Save**. The alert has been created and is now active! [float] [[apm-create-error-alert]] -=== Create an error rate alert +=== Example: create an error rate alert Error rate alerts trigger when the number of errors in a service exceeds a defined threshold. This guide creates an alert for the `opbeans-python` service based on the following criteria: @@ -94,9 +104,9 @@ Based on the alert criteria, define the following alert details: Select the **Email** action type and click **Create a connector**. Fill out the required details: sender, host, port, etc., and click **save**. -Add a message body in markdown format. +A default message is provided as a starting point for your alert. You can use the https://mustache.github.io/[Mustache] template syntax, i.e., `{{variable}}` -to pass alert values at the time a condition is detected to an action. +to pass additional alert values at the time a condition is detected to an action. A list of available variables can be accessed by selecting the **add variable** button image:apm/images/add-variable.png[add variable button]. diff --git a/docs/apm/filters.asciidoc b/docs/apm/filters.asciidoc index d53adb439f0c8..c405ea10ade3d 100644 --- a/docs/apm/filters.asciidoc +++ b/docs/apm/filters.asciidoc @@ -69,7 +69,7 @@ the host filter will still be applied. These filters are very useful for quickly and easily removing noise from your data. With just a click, you can filter your transactions by the transaction result, -host, container ID, and more. +host, container ID, Kubernetes pod, and more. 
[role="screenshot"] image::apm/images/local-filter.png[Local filters available in the APM app in Kibana] \ No newline at end of file diff --git a/docs/apm/images/apm-alert.png b/docs/apm/images/apm-alert.png index 350704d8969ae..c68b36f522bfc 100644 Binary files a/docs/apm/images/apm-alert.png and b/docs/apm/images/apm-alert.png differ diff --git a/docs/apm/images/apm-distributed-tracing.png b/docs/apm/images/apm-distributed-tracing.png index e9c6713361c73..0dbffa591d43a 100644 Binary files a/docs/apm/images/apm-distributed-tracing.png and b/docs/apm/images/apm-distributed-tracing.png differ diff --git a/docs/apm/images/apm-error-group.png b/docs/apm/images/apm-error-group.png index ecdf9c20cf4aa..359bdc6b704e9 100644 Binary files a/docs/apm/images/apm-error-group.png and b/docs/apm/images/apm-error-group.png differ diff --git a/docs/apm/images/apm-errors-overview.png b/docs/apm/images/apm-errors-overview.png index 90f16b81e9f50..969a1f19f9f43 100644 Binary files a/docs/apm/images/apm-errors-overview.png and b/docs/apm/images/apm-errors-overview.png differ diff --git a/docs/apm/images/apm-geo-ui.png b/docs/apm/images/apm-geo-ui.png index a767ed7e08e0c..3757127bad9c0 100644 Binary files a/docs/apm/images/apm-geo-ui.png and b/docs/apm/images/apm-geo-ui.png differ diff --git a/docs/apm/images/apm-metrics.png b/docs/apm/images/apm-metrics.png index 60383ef428f2a..ffe5ffc7e1d83 100644 Binary files a/docs/apm/images/apm-metrics.png and b/docs/apm/images/apm-metrics.png differ diff --git a/docs/apm/images/apm-query-bar.png b/docs/apm/images/apm-query-bar.png index 313ee7d4b8fc8..90955fb61016d 100644 Binary files a/docs/apm/images/apm-query-bar.png and b/docs/apm/images/apm-query-bar.png differ diff --git a/docs/apm/images/apm-service-map-anomaly.png b/docs/apm/images/apm-service-map-anomaly.png index b661e8f09d1a1..cd59f86690666 100644 Binary files a/docs/apm/images/apm-service-map-anomaly.png and b/docs/apm/images/apm-service-map-anomaly.png differ diff --git 
a/docs/apm/images/apm-services-overview.png b/docs/apm/images/apm-services-overview.png index 48236522ddfbb..85d14cc7dfc6e 100644 Binary files a/docs/apm/images/apm-services-overview.png and b/docs/apm/images/apm-services-overview.png differ diff --git a/docs/apm/images/apm-settings.png b/docs/apm/images/apm-settings.png index 4eaef9ec15ac5..14cf32877b720 100644 Binary files a/docs/apm/images/apm-settings.png and b/docs/apm/images/apm-settings.png differ diff --git a/docs/apm/images/apm-traces.png b/docs/apm/images/apm-traces.png index 6219be5b6d6e4..bf1f7e783bb11 100644 Binary files a/docs/apm/images/apm-traces.png and b/docs/apm/images/apm-traces.png differ diff --git a/docs/apm/images/apm-transaction-response-dist.png b/docs/apm/images/apm-transaction-response-dist.png index ecf5a4af2c25d..1d268bbaac465 100644 Binary files a/docs/apm/images/apm-transaction-response-dist.png and b/docs/apm/images/apm-transaction-response-dist.png differ diff --git a/docs/apm/images/apm-transaction-sample.png b/docs/apm/images/apm-transaction-sample.png index 73668b094f9cf..bfdb6a5abe65b 100644 Binary files a/docs/apm/images/apm-transaction-sample.png and b/docs/apm/images/apm-transaction-sample.png differ diff --git a/docs/apm/images/apm-transactions-overview.png b/docs/apm/images/apm-transactions-overview.png index b3b6ca22c4f63..53d7637b18647 100644 Binary files a/docs/apm/images/apm-transactions-overview.png and b/docs/apm/images/apm-transactions-overview.png differ diff --git a/docs/apm/images/example-metadata.png b/docs/apm/images/example-metadata.png index 0e35f90691723..2a5bda7f088f6 100644 Binary files a/docs/apm/images/example-metadata.png and b/docs/apm/images/example-metadata.png differ diff --git a/docs/apm/images/jvm-metrics-overview.png b/docs/apm/images/jvm-metrics-overview.png index 9c8ba4a12a262..586836c6cfe3e 100644 Binary files a/docs/apm/images/jvm-metrics-overview.png and b/docs/apm/images/jvm-metrics-overview.png differ diff --git 
a/docs/apm/images/jvm-metrics.png b/docs/apm/images/jvm-metrics.png index 1720e1370ff90..52a1ca5bea8d8 100644 Binary files a/docs/apm/images/jvm-metrics.png and b/docs/apm/images/jvm-metrics.png differ diff --git a/docs/apm/images/local-filter.png b/docs/apm/images/local-filter.png index faac5c143a7d8..8657e39f430aa 100644 Binary files a/docs/apm/images/local-filter.png and b/docs/apm/images/local-filter.png differ diff --git a/docs/apm/images/service-maps-java.png b/docs/apm/images/service-maps-java.png index e1a42f4c76e12..b3726bdc00ab6 100644 Binary files a/docs/apm/images/service-maps-java.png and b/docs/apm/images/service-maps-java.png differ diff --git a/docs/apm/images/service-maps.png b/docs/apm/images/service-maps.png index 078fabcfa2879..878a31adc69ca 100644 Binary files a/docs/apm/images/service-maps.png and b/docs/apm/images/service-maps.png differ diff --git a/docs/apm/images/service-quick-health.png b/docs/apm/images/service-quick-health.png new file mode 100644 index 0000000000000..aab1332513079 Binary files /dev/null and b/docs/apm/images/service-quick-health.png differ diff --git a/docs/apm/images/specific-transaction.png b/docs/apm/images/specific-transaction.png index 9911dbd879f41..52073bf76520a 100644 Binary files a/docs/apm/images/specific-transaction.png and b/docs/apm/images/specific-transaction.png differ diff --git a/docs/apm/machine-learning.asciidoc b/docs/apm/machine-learning.asciidoc index db2a1ef6e2da0..b31d717a6932e 100644 --- a/docs/apm/machine-learning.asciidoc +++ b/docs/apm/machine-learning.asciidoc @@ -14,7 +14,12 @@ Machine learning jobs are created per environment, and are based on a service's Because jobs are created at the environment level, you can add new services to your existing environments without the need for additional machine learning jobs. 
-After a machine learning job is created, results are shown in two places: +Results from machine learning jobs are shown in multiple places throughout the APM app: + +* The **Services overview** provides a quick-glance view of the general health of all of your services. ++ +[role="screenshot"] +image::apm/images/service-quick-health.png[Example view of anomaly scores on response times in the APM app] * The transaction duration chart will show the expected bounds and add an annotation when the anomaly score is 75 or above. + diff --git a/docs/apm/service-maps.asciidoc b/docs/apm/service-maps.asciidoc index d629a95073a74..d44c4ff6caa5c 100644 --- a/docs/apm/service-maps.asciidoc +++ b/docs/apm/service-maps.asciidoc @@ -33,7 +33,7 @@ distributed tracing will not work, and the connection will not be drawn on the m Select the **Service Map** tab to get started. By default, all instrumented services and connections are shown. Whether you're onboarding a new engineer, or just trying to grasp the big picture, -click around, zoom in and out, and begin to visualize how your services are connected. +drag things around, zoom in and out, and begin to visualize how your services are connected. If there's a specific service that interests you, select that service to highlight its connections. Clicking **Focus map** will refocus the map on that specific service and lock the connection highlighting. diff --git a/docs/apm/services.asciidoc b/docs/apm/services.asciidoc index 395e23c379306..2bf2e35c21cd8 100644 --- a/docs/apm/services.asciidoc +++ b/docs/apm/services.asciidoc @@ -2,8 +2,13 @@ [[services]] === Services overview -The *Services* overview gives you quick insights into the health and general performance of all of your instrumented services. -Services are sorted by the `service.name` configured in each of the {apm-agents-ref}[APM agents] you’ve installed. 
+The *Services* overview page provides a quick, high-level overview of the health and general +performance of all instrumented services. + +To help surface potential issues, services are sorted by their health status: +**critical** > **warning** > **healthy** > **unknown**. +Health status is powered by machine learning and requires anomaly detection to be enabled. +Learn more in <>. [role="screenshot"] -image::apm/images/apm-services-overview.png[Example view of services table the APM app in Kibana] \ No newline at end of file +image::apm/images/apm-services-overview.png[Example view of services table in the APM app in Kibana] diff --git a/docs/apm/spans.asciidoc b/docs/apm/spans.asciidoc index c35fb115d2db4..7f29b1f003f1c 100644 --- a/docs/apm/spans.asciidoc +++ b/docs/apm/spans.asciidoc @@ -3,7 +3,7 @@ === Trace sample timeline The trace sample timeline visualization is a bird's-eye view of what your application was doing while it was trying to respond to a request. -This makes it useful for visualizing where the selected transaction spent most of its time. +This makes it useful for visualizing where a selected transaction spent most of its time. [role="screenshot"] image::apm/images/apm-transaction-sample.png[Example of distributed trace colors in the APM app in Kibana] @@ -43,9 +43,12 @@ this makes finding possible bottlenecks throughout your application much easier image::apm/images/apm-distributed-tracing.png[Example view of the distributed tracing in APM app in Kibana] Don't forget; by definition, a distributed trace includes more than one transaction. -When viewing these distributed traces in the timeline waterfall, you'll see this image:apm/images/transaction-icon.png[APM icon] icon, +When viewing distributed traces in the timeline waterfall, +you'll see this icon: image:apm/images/transaction-icon.png[APM icon], which indicates the next transaction in the trace. -These transactions can be expanded and viewed in detail by clicking on them. 
+For easier problem isolation, transactions can be collapsed in the waterfall by clicking +the icon to the left of the transactions. +Transactions can also be expanded and viewed in detail by clicking on them. After exploring these traces, you can return to the full trace by clicking *View full trace*. diff --git a/docs/apm/traces.asciidoc b/docs/apm/traces.asciidoc index 52b4b618de466..3bafebd733159 100644 --- a/docs/apm/traces.asciidoc +++ b/docs/apm/traces.asciidoc @@ -7,7 +7,8 @@ and which services were part of it. In addition to the Traces overview, you can view your application traces in the <>. The *Traces* overview displays the entry transaction for all traces in your application. -If you're using <>, this view is key to finding the critical paths within your application. +If you're using <>, +this view is key to finding the critical paths within your application. Transactions with the same name are grouped together and only shown once in this table. By default, transactions are sorted by _Impact_. diff --git a/docs/apm/transactions.asciidoc b/docs/apm/transactions.asciidoc index 84ab6b2a58579..fef98a86de1d0 100644 --- a/docs/apm/transactions.asciidoc +++ b/docs/apm/transactions.asciidoc @@ -10,17 +10,8 @@ Selecting a <> brings you to the *transactions* overview. [role="screenshot"] image::apm/images/apm-transactions-overview.png[Example view of transactions table in the APM app in Kibana] -The *time spent by span type*, *transaction duration*, and *requests per minute* chart display information on all transactions associated with the selected service: - -*Time spent by span type*:: -Visualize where your application is spending most of its time. -For example, is your app spending time in external calls, database processing, or application code execution? -+ -The time a transaction took to complete is also recorded and displayed on the chart under the "app" label. 
-"app" indicates that something was happening within the application, but we're not sure exactly what. -This could be a sign that the agent does not have auto-instrumentation for whatever was happening during that time. -+ -It's important to note that if you have asynchronous spans, the sum of all span times may exceed the duration of the transaction. +The *transaction duration*, *transactions per minute*, *transaction error rate*, and *time spent by span type* +charts display information on all transactions associated with the selected service: *Transaction duration*:: Response times for this service, broken down into average, 95th, and 99th percentile. @@ -28,11 +19,26 @@ If there's a weird spike that you'd like to investigate, you can simply zoom in on the graph - this will adjust the specific time range, and all of the data on the page will update accordingly. -*Requests per minute*:: +*Transactions per minute*:: Visualize response codes: `2xx`, `3xx`, `4xx`, etc., and is useful for determining if you're serving more of one code than you typically do. Like in the Transaction duration graph, you can zoom in on anomalies to further investigate them. +*Transaction error rate*:: +Visualize the total number of transactions with errors divided by the total number of transactions. +Any unexpected increases, decreases, or irregular patterns can be investigated further +with the <>. + +*Time spent by span type*:: +Visualize where your application is spending most of its time. +For example, is your app spending time in external calls, database processing, or application code execution? ++ +The time a transaction took to complete is also recorded and displayed on the chart under the "app" label. +"app" indicates that something was happening within the application, but we're not sure exactly what. +This could be a sign that the agent does not have auto-instrumentation for whatever was happening during that time. 
++ +It's important to note that if you have asynchronous spans, the sum of all span times may exceed the duration of the transaction. + [[transactions-table]] ==== Transactions table @@ -61,42 +67,45 @@ refer to the documentation for each {apm-agents-ref}[APM Agent] you've implement ==== RUM Transaction overview The transaction overview page is customized for the JavaScript RUM Agent. -This page highlights things like *page load times*, *transactions per minute*, and even the *average page load duration distribution by country*. +Specifically, the page highlights *page load times* for your service: [role="screenshot"] image::apm/images/apm-geo-ui.png[average page load duration distribution] -This data is available due to the geo-ip and user agent pipelines being enabled by default, -which allows for the capture of geo-location and user agent data. -These visualizations make it easy for you to visualize performance information about your -end-users' experience based on their location. +Additional RUM goodies, like core vitals, and visitor breakdown by browser, location, and device, +are available in the Observability User Experience tab. +// To do +// Add link to the Observability UE docs when complete [[transaction-details]] ==== Transaction details Selecting a transaction group will bring you to the *transaction* details. -Transaction details include a high-level overview of the time spent by span type, -transaction group duration, requests per minute, and transaction group duration distribution. -It's important to note that all of these graphs show data from every transaction within the selected transaction group. +This page is visually similar to the transaction overview, but it shows data from all transactions within +the selected transaction group. [role="screenshot"] image::apm/images/apm-transaction-response-dist.png[Example view of response time distribution] Up to ten sampled transactions are also displayed. 
-These sampled transactions are based on your selection in the *Transactions duration distribution*. -You can update the sampled transactions by selecting a new _bucket_ in the transactions duration distribution graph. -The number of requests per bucket is displayed when hovering over the graph, and the selected bucket is highlighted to stand out. +These sampled transactions are based on the _bucket_ selection in the *Transactions duration distribution* chart. +You can update the sampled transactions by selecting a new _bucket_. +The number of requests per bucket is displayed when hovering over the graph, +and the selected bucket is highlighted to stand out. + +The screenshot below shows a typical distribution, and indicates most of our requests were served quickly--awesome! +It's the requests on the right, the ones taking longer than average, that we probably want to focus on. [role="screenshot"] image::apm/images/apm-transaction-duration-dist.png[Example view of transactions duration distribution graph] -This graph shows a typical distribution, and indicates most of our requests were served quickly--awesome! -It's the requests on the right, the ones taking longer than average, that we probably want to focus on. - -When you select one of these buckets, +When you select a bucket, you're presented with up to ten trace samples. -Each sample has a trace timeline waterfall that shows what a typical request in that bucket was doing. -By investigating this timeline waterfall, we can hopefully determine _why_ this request was slow and then implement a fix. +Each sample has a trace timeline waterfall that shows how a typical request in that bucket executed. +This waterfall is useful for understanding the parent/child hierarchy of transactions and spans, +and ultimately determining _why_ a request was slow. +For large waterfalls, expand problematic transactions and collapse well-performing ones +for easier problem isolation and troubleshooting. 
[role="screenshot"] image::apm/images/apm-transaction-sample.png[Example view of transactions sample] diff --git a/docs/apm/troubleshooting.asciidoc b/docs/apm/troubleshooting.asciidoc index 7ed2f57caeadd..e7e5419da78cb 100644 --- a/docs/apm/troubleshooting.asciidoc +++ b/docs/apm/troubleshooting.asciidoc @@ -14,6 +14,7 @@ Also, check out the https://discuss.elastic.co/c/apm[APM discussion forum]. * <> * <> * <> +* <> [float] [[no-apm-data-found]] @@ -180,3 +181,19 @@ setup.template.append_fields: type: object dynamic: true ---- + +[float] +[[service-map-rum-connections]] +=== Service maps: no connection between client and server + +If the service map is not showing an expected connection between the client and server, +it's likely because you haven't configured +{apm-agent-rum}/configuration.html#distributed-tracing-origins[`distributedTracingOrigins`]. + + +This setting is necessary, for example, for cross-origin requests. +If you have a basic web application that provides data via an API on `localhost:4000`, +and serves HTML from `localhost:4001`, you'd need to set `distributedTracingOrigins: ['https://localhost:4000']` +to ensure the origin is monitored as a part of distributed tracing. +In other words, `distributedTracingOrigins` is consulted prior to the agent adding the +distributed tracing `traceparent` header to each request.
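
The cross-origin rule described for `distributedTracingOrigins` in the troubleshooting section above can be sketched as follows. This is only an illustration under stated assumptions: `shouldPropagateTrace` is a hypothetical helper, not part of the RUM agent's API, and the agent's real matching logic may differ. It shows why a page served from `localhost:4001` would not get a `traceparent` header on requests to an API on `localhost:4000` unless that API's origin is listed.

```typescript
// Hypothetical helper for illustration only; NOT the RUM agent's implementation.
// It decides whether a request would be eligible for the `traceparent` header.
function shouldPropagateTrace(
  requestUrl: string,
  pageOrigin: string,
  distributedTracingOrigins: string[],
): boolean {
  // Resolve relative URLs against the page's own origin.
  const requestOrigin = new URL(requestUrl, pageOrigin).origin;

  // Same-origin requests are always eligible.
  if (requestOrigin === pageOrigin) {
    return true;
  }

  // Cross-origin requests are eligible only when explicitly listed.
  return distributedTracingOrigins.some((origin) => origin === requestOrigin);
}

// Assumed setup from the example: HTML on port 4001, API on port 4000.
const page = "http://localhost:4001";
const api = "http://localhost:4000/api/data";

console.log(shouldPropagateTrace(api, page, []));
console.log(shouldPropagateTrace(api, page, ["http://localhost:4000"]));
```

In this sketch an origin includes the scheme, so a configured value of `https://localhost:4000` would not match an API that is actually served over `http`. The configured entry should use the scheme, host, and port the API is really served on.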