From febaf896d91cff229f1d9ac7c5f60eeab944ad27 Mon Sep 17 00:00:00 2001
From: Staci Cooper
Date: Fri, 22 Sep 2023 10:58:59 -0700
Subject: [PATCH] Threshold alarms are low severity if not anomalous

---
 ...thumbnails_avg_response_time_above_threshold.md | 14 ++++++++------
 ...thumbnails_p99_response_time_above_threshold.md | 14 ++++++++------
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/documentation/meta/monitoring/runbooks/api_thumbnails_avg_response_time_above_threshold.md b/documentation/meta/monitoring/runbooks/api_thumbnails_avg_response_time_above_threshold.md
index 825cef056aa..b48e6d9fafe 100644
--- a/documentation/meta/monitoring/runbooks/api_thumbnails_avg_response_time_above_threshold.md
+++ b/documentation/meta/monitoring/runbooks/api_thumbnails_avg_response_time_above_threshold.md
@@ -9,12 +9,14 @@ Alarm link:
 
 ## Severity Guide
 
-Confirm that there is not a total outage of the service. If not, the severity is
-likely low. Check for a recent deployment that may have introduced the problem,
-and rollback to the previous version. If not, check the request count and
-general network activity. If abnormally high, refer to the [traffic analysis run
-book][traffic_runbook] to identify and block any malicious traffic.
-
+If the avg response time is not [anomalously high][anomaly_alarm], the severity
+is likely low. Check for a recent deployment that may have introduced the
+problem, and rollback to the previous version. If not, check the request count
+and general network activity. If abnormally high, refer to the [traffic analysis
+run book][traffic_runbook] to identify and block any malicious traffic.
+
+[anomaly_alarm]:
+  https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#alarmsV2:alarm/API+Thumbnails+Production+Average+Response+Time+anomalously+high
 [traffic_runbook]:
   /meta/monitoring/traffic/runbooks/identifying-and-blocking-traffic-anomalies.md
diff --git a/documentation/meta/monitoring/runbooks/api_thumbnails_p99_response_time_above_threshold.md b/documentation/meta/monitoring/runbooks/api_thumbnails_p99_response_time_above_threshold.md
index 870f9598285..557140e524d 100644
--- a/documentation/meta/monitoring/runbooks/api_thumbnails_p99_response_time_above_threshold.md
+++ b/documentation/meta/monitoring/runbooks/api_thumbnails_p99_response_time_above_threshold.md
@@ -9,12 +9,14 @@ Alarm link:
 
 ## Severity Guide
 
-Confirm that there is not a total outage of the service. If not, the severity is
-likely low. Check for a recent deployment that may have introduced the problem,
-and rollback to the previous version. If not, check the request count and
-general network activity. If abnormally high, refer to the [traffic analysis run
-book][traffic_runbook] to identify and block any malicious traffic.
-
+If the P99 response time is not [anomalously high][anomaly_alarm], the severity
+is likely low. Check for a recent deployment that may have introduced the
+problem, and rollback to the previous version. If not, check the request count
+and general network activity. If abnormally high, refer to the [traffic analysis
+run book][traffic_runbook] to identify and block any malicious traffic.
+
+[anomaly_alarm]:
+  https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#alarmsV2:alarm/API+Thumbnails+Production+P99+Response+Time+anomalously+high
 [traffic_runbook]:
   /meta/monitoring/traffic/runbooks/identifying-and-blocking-traffic-anomalies.md