Skip to content

Commit

Permalink
[DOC] Updated documentation for newly added monitoring alerts (#91272)
Browse files Browse the repository at this point in the history
* Documentation for recently added alerts

* [DOCS] Fixes broken link

* Addressed review feedback

Co-authored-by: lcawl <[email protected]>
Co-authored-by: Kibana Machine <[email protected]>
  • Loading branch information
3 people authored Mar 29, 2021
1 parent 0defebd commit caec3d4
Show file tree
Hide file tree
Showing 9 changed files with 77 additions and 79 deletions.
5 changes: 0 additions & 5 deletions docs/settings/monitoring-settings.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -37,11 +37,6 @@ For more information, see
monitoring back-end does not run and {kib} stats are not sent to the monitoring
cluster.

a|`monitoring.cluster_alerts.`
`email_notifications.email_address` {ess-icon}
| Specifies the email address where you want to receive cluster alerts.
See <<cluster-alert-email-notifications, email notifications>> for details.

| `monitoring.ui.elasticsearch.hosts`
| Specifies the location of the {es} cluster where your monitoring data is stored.
By default, this is the same as <<elasticsearch-hosts, `elasticsearch.hosts`>>. This setting enables
Expand Down
64 changes: 0 additions & 64 deletions docs/user/monitoring/cluster-alerts.asciidoc

This file was deleted.

1 change: 0 additions & 1 deletion docs/user/monitoring/index.asciidoc
Original file line number Diff line number Diff line change
@@ -1,6 +1,5 @@
include::xpack-monitoring.asciidoc[]
include::beats-details.asciidoc[leveloffset=+1]
include::cluster-alerts.asciidoc[leveloffset=+1]
include::elasticsearch-details.asciidoc[leveloffset=+1]
include::kibana-alerts.asciidoc[leveloffset=+1]
include::kibana-details.asciidoc[leveloffset=+1]
Expand Down
73 changes: 69 additions & 4 deletions docs/user/monitoring/kibana-alerts.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -29,7 +29,7 @@ To review and modify all the available alerts, use
This alert is triggered when a node runs a consistently high CPU load. By
default, the trigger condition is set at 85% or more averaged over the last 5
minutes. The alert is grouped across all the nodes of the cluster by running
checks on a schedule time of 1 minute with a re-notify internal of 1 day.
checks on a schedule time of 1 minute with a re-notify interval of 1 day.

[discrete]
[[kibana-alerts-disk-usage-threshold]]
Expand All @@ -38,7 +38,7 @@ checks on a schedule time of 1 minute with a re-notify internal of 1 day.
This alert is triggered when a node is nearly at disk capacity. By
default, the trigger condition is set at 80% or more averaged over the last 5
minutes. The alert is grouped across all the nodes of the cluster by running
checks on a schedule time of 1 minute with a re-notify internal of 1 day.
checks on a schedule time of 1 minute with a re-notify interval of 1 day.

[discrete]
[[kibana-alerts-jvm-memory-threshold]]
Expand All @@ -47,7 +47,7 @@ checks on a schedule time of 1 minute with a re-notify internal of 1 day.
This alert is triggered when a node runs a consistently high JVM memory usage. By
default, the trigger condition is set at 85% or more averaged over the last 5
minutes. The alert is grouped across all the nodes of the cluster by running
checks on a schedule time of 1 minute with a re-notify internal of 1 day.
checks on a schedule time of 1 minute with a re-notify interval of 1 day.

[discrete]
[[kibana-alerts-missing-monitoring-data]]
Expand All @@ -56,7 +56,72 @@ checks on a schedule time of 1 minute with a re-notify internal of 1 day.
This alert is triggered when any stack products nodes or instances stop sending
monitoring data. By default, the trigger condition is set to missing for 15 minutes
looking back 1 day. The alert is grouped across all the nodes of the cluster by running
checks on a schedule time of 1 minute with a re-notify internal of 6 hours.
checks on a schedule time of 1 minute with a re-notify interval of 6 hours.

[discrete]
[[kibana-alerts-thread-pool-rejections]]
== Thread pool rejections (search/write)

This alert is triggered when a node experiences thread pool rejections. By
default, the trigger condition is set at 300 or more over the last 5
minutes. The alert is grouped across all the nodes of the cluster by running
checks on a schedule time of 1 minute with a re-notify interval of 1 day.
Thresholds can be set independently for `search` and `write` type rejections.

[discrete]
[[kibana-alerts-ccr-read-exceptions]]
== CCR read exceptions

This alert is triggered if a read exception has been detected on any of the
replicated clusters. The trigger condition is met if 1 or more read exceptions
are detected in the last hour. The alert is grouped across all replicated clusters
by running checks on a schedule time of 1 minute with a re-notify interval of 6 hours.

[discrete]
[[kibana-alerts-large-shard-size]]
== Large shard size

This alert is triggered if a large (primary) shard size is found on any of the
specified index patterns. The trigger condition is met if an index's shard size is
55gb or higher in the last 5 minutes. The alert is grouped across all indices that match
the default patter of `*` by running checks on a schedule time of 1 minute with a re-notify
interval of 12 hours.

[discrete]
[[kibana-alerts-cluster-alerts]]
== Cluster alerts

These alerts summarize the current status of your {stack}. You can drill down into the metrics
to view more information about your cluster and specific nodes, instances, and indices.

An alert will be triggered if any of the following conditions are met within the last minute:

* {es} cluster health status is yellow (missing at least one replica)
or red (missing at least one primary).
* {es} version mismatch. You have {es} nodes with
different versions in the same cluster.
* {kib} version mismatch. You have {kib} instances with different
versions running against the same {es} cluster.
* Logstash version mismatch. You have Logstash nodes with different
versions reporting stats to the same monitoring cluster.
* {es} nodes changed. You have {es} nodes that were recently added or removed.
* {es} license expiration. The cluster's license is about to expire.
+
--
If you do not preserve the data directory when upgrading a {kib} or
Logstash node, the instance is assigned a new persistent UUID and shows up
as a new instance
--
* Subscription license expiration. When the expiration date
approaches, you will get notifications with a severity level relative to how
soon the expiration date is:
** 60 days: Informational alert
** 30 days: Low-level alert
** 15 days: Medium-level alert
** 7 days: Severe-level alert
+
The 60-day and 30-day thresholds are skipped for Trial licenses, which are only
valid for 30 days.

NOTE: Some action types are subscription features, while others are free.
For a comparison of the Elastic subscription levels, see the alerting section of
Expand Down
5 changes: 4 additions & 1 deletion src/core/public/doc_links/doc_links_service.ts
Original file line number Diff line number Diff line change
Expand Up @@ -218,12 +218,15 @@ export class DocLinksService {
guide: `${ELASTIC_WEBSITE_URL}guide/en/kibana/${DOC_LINK_VERSION}/maps.html`,
},
monitoring: {
alertsCluster: `${ELASTIC_WEBSITE_URL}guide/en/kibana/${DOC_LINK_VERSION}/cluster-alerts.html`,
alertsKibana: `${ELASTIC_WEBSITE_URL}guide/en/kibana/${DOC_LINK_VERSION}/kibana-alerts.html`,
alertsKibanaCpuThreshold: `${ELASTIC_WEBSITE_URL}guide/en/kibana/${DOC_LINK_VERSION}/kibana-alerts.html#kibana-alerts-cpu-threshold`,
alertsKibanaDiskThreshold: `${ELASTIC_WEBSITE_URL}guide/en/kibana/${DOC_LINK_VERSION}/kibana-alerts.html#kibana-alerts-disk-usage-threshold`,
alertsKibanaJvmThreshold: `${ELASTIC_WEBSITE_URL}guide/en/kibana/${DOC_LINK_VERSION}/kibana-alerts.html#kibana-alerts-jvm-memory-threshold`,
alertsKibanaMissingData: `${ELASTIC_WEBSITE_URL}guide/en/kibana/${DOC_LINK_VERSION}/kibana-alerts.html#kibana-alerts-missing-monitoring-data`,
alertsKibanaThreadpoolRejections: `${ELASTIC_WEBSITE_URL}guide/en/kibana/${DOC_LINK_VERSION}/kibana-alerts.html#kibana-alerts-thread-pool-rejections`,
alertsKibanaCCRReadExceptions: `${ELASTIC_WEBSITE_URL}guide/en/kibana/${DOC_LINK_VERSION}/kibana-alerts.html#kibana-alerts-ccr-read-exceptions`,
alertsKibanaLargeShardSize: `${ELASTIC_WEBSITE_URL}guide/en/kibana/${DOC_LINK_VERSION}/kibana-alerts.html#kibana-alerts-large-shard-size`,
alertsKibanaClusterAlerts: `${ELASTIC_WEBSITE_URL}guide/en/kibana/${DOC_LINK_VERSION}/kibana-alerts.html#kibana-alerts-cluster-alerts`,
metricbeatBlog: `${ELASTIC_WEBSITE_URL}blog/external-collection-for-elastic-stack-monitoring-is-now-available-via-metricbeat`,
monitorElasticsearch: `${ELASTICSEARCH_DOCS}configuring-metricbeat.html`,
monitorKibana: `${ELASTIC_WEBSITE_URL}guide/en/kibana/${DOC_LINK_VERSION}/monitoring-metricbeat.html`,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@ export function createCCRReadExceptionsAlertType(): AlertTypeModel<ValidateOptio
description: ALERT_DETAILS[ALERT_CCR_READ_EXCEPTIONS].description,
iconClass: 'bell',
documentationUrl(docLinks) {
return `${docLinks.links.monitoring.alertsKibana}`;
return `${docLinks.links.monitoring.alertsKibanaCCRReadExceptions}`;
},
alertParamsExpression: (props: Props) => (
<Expression {...props} paramDetails={ALERT_DETAILS[ALERT_CCR_READ_EXCEPTIONS].paramDetails} />
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,7 @@ export function createLargeShardSizeAlertType(): AlertTypeModel<ValidateOptions>
description: ALERT_DETAILS[ALERT_LARGE_SHARD_SIZE].description,
iconClass: 'bell',
documentationUrl(docLinks) {
return `${docLinks.links.monitoring.alertsKibana}`;
return `${docLinks.links.monitoring.alertsKibanaLargeShardSize}`;
},
alertParamsExpression: (props: Props) => (
<Expression {...props} paramDetails={ALERT_DETAILS[ALERT_LARGE_SHARD_SIZE].paramDetails} />
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ export function createLegacyAlertTypes(): AlertTypeModel[] {
description: LEGACY_ALERT_DETAILS[legacyAlert].description,
iconClass: 'bell',
documentationUrl(docLinks) {
return `${docLinks.links.monitoring.alertsCluster}`;
return `${docLinks.links.monitoring.alertsKibanaClusterAlerts}`;
},
alertParamsExpression: () => (
<Fragment>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -33,7 +33,7 @@ export function createThreadPoolRejectionsAlertType(
description: threadPoolAlertDetails.description,
iconClass: 'bell',
documentationUrl(docLinks) {
return `${docLinks.links.monitoring.alertsKibana}`;
return `${docLinks.links.monitoring.alertsKibanaThreadpoolRejections}`;
},
alertParamsExpression: (props: Props) => (
<>
Expand Down

0 comments on commit caec3d4

Please sign in to comment.