Fix Prometheus metrics #773

Open
wants to merge 2 commits into
base: main

@rofafor rofafor commented Nov 26, 2024

Issue

Ratelimit Prometheus metrics are kind of a mess with the latest Envoy Gateway. The included patch tries to solve these issues:

  1. Avoid repeating the (optional) value if it is identical to the key. This simplifies Prometheus labels by removing unnecessary duplication.
  2. The statsd mapping configuration defaults to glob rules, in which dots are reserved characters. Dots inside stat names therefore have to be replaced; my approach was to hotfix only the IPv4 addresses used in CIDR matches.
  3. Simplify the default statsd mapping configuration to use more generic (but slower) regex filters as fallbacks.
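
As a rough illustration of item 2, the sketch below rewrites only dotted IPv4 quads, so that a CIDR such as `0.0.0.0/0` no longer adds extra dot-separated components to a stat name. The helper name `sanitizeIPv4Dots` and the regular expression are assumptions made for this example and are not taken from the patch; the underscore replacement is chosen to match the patched `/metrics` output shown in this description.

```go
package main

import (
	"fmt"
	"regexp"
)

// ipv4Quad matches four dot-separated groups of 1-3 digits.
// It does not check that each octet is <= 255.
var ipv4Quad = regexp.MustCompile(`(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})`)

// sanitizeIPv4Dots is a hypothetical helper: it replaces the dots inside
// IPv4-looking quads with underscores and leaves all other dots untouched.
func sanitizeIPv4Dots(s string) string {
	// ${1} rather than $1, because "$1_" would be read as a group named "1_".
	return ipv4Quad.ReplaceAllString(s, "${1}_${2}_${3}_${4}")
}

func main() {
	fmt.Println(sanitizeIPv4Dots("masked_remote_address_0.0.0.0/0"))
}
```

Only the address is touched, so the dots that separate the domain and descriptor keys keep their meaning for the glob-based mappings.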

Ratelimit configuration

- name: LOG_LEVEL
  value: debug
- name: USE_STATSD
  value: "false"
- name: USE_PROMETHEUS
  value: "true"

Envoy configuration

"traffic": {
  "rateLimit": {
    "global": {
      "rules": [
        {
          "headerMatches": [],
          "cidrMatch": {
            "cidr": "0.0.0.0/0",
            "ip": "0.0.0.0",
            "maskLen": 0,
            "isIPv6": false,
            "distinct": true
          },
          "limit": {
            "requests": 1000,
            "unit": "Hour"
          }
        }
      ]
    }
  },
  "timeout": {
    "http": {
      "requestTimeout": "15s"
    }
  }
},

Envoy Ratelimit logs

envoy-ratelimit time="2024-11-26T14:03:45Z" level=debug msg="loading domain: envoy-gateway-system/internal/https"
envoy-ratelimit time="2024-11-26T14:03:45Z" level=debug msg="loading descriptor: key=envoy-gateway-system/internal/https.httproute/envoydemo/envoydemo/rule/0/match/0/envoydemo_domain_com_httproute/envoydemo/envoydemo/rule/0/match/0/envoydemo_domain_com"
envoy-ratelimit time="2024-11-26T14:03:45Z" level=debug msg="loading descriptor: key=envoy-gateway-system/internal/https.httproute/envoydemo/envoydemo/rule/0/match/0/envoydemo_domain_com_httproute/envoydemo/envoydemo/rule/0/match/0/envoydemo_domain_com.masked_remote_address_0.0.0.0/0"
envoy-ratelimit time="2024-11-26T14:03:45Z" level=debug msg="Creating stats for key: 'envoy-gateway-system/internal/https.httproute/envoydemo/envoydemo/rule/0/match/0/envoydemo_domain_com_httproute/envoydemo/envoydemo/rule/0/match/0/envoydemo_domain_com.masked_remote_address_0.0.0.0/0.remote_address'"
envoy-ratelimit time="2024-11-26T14:03:45Z" level=debug msg="loading descriptor: key=envoy-gateway-system/internal/https.httproute/envoydemo/envoydemo/rule/0/match/0/envoydemo_domain_com_httproute/envoydemo/envoydemo/rule/0/match/0/envoydemo_domain_com.masked_remote_address_0.0.0.0/0.remote_address ratelimit={requests_per_unit=1000, unit=HOUR, unlimited=false, shadow_mode=false}"
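
The repeated `httproute/.../envoydemo_domain_com` segment in the keys above suggests that a descriptor's key and value are joined with an underscore even when both are identical. A minimal sketch of item 1 under that assumption follows; `descriptorSegment` is a hypothetical name for illustration, not a function from the ratelimit code base.

```go
package main

import "fmt"

// descriptorSegment is a hypothetical helper that builds one segment of a
// stat key. The value is appended only when it is non-empty and differs
// from the key, which avoids the "X_X" duplication seen in the logs.
func descriptorSegment(key, value string) string {
	if value == "" || value == key {
		return key
	}
	return key + "_" + value
}

func main() {
	route := "httproute/envoydemo/envoydemo/rule/0/match/0/envoydemo_domain_com"
	fmt.Println(descriptorSegment(route, route))
	fmt.Println(descriptorSegment("masked_remote_address", "0.0.0.0/0"))
}
```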

Ratelimit /metrics in current master

# HELP ratelimit_service_rate_limit_envoy_gateway_system_internal_https_httproute_envoydemo_envoydemo_rule_0_match_0_envoydemo_domain_com_httproute_envoydemo_envoydemo_rule_0_match_0_envoydemo_domain_com_masked_remote_address_0_0_0_0_0_remote_address_total_hits Metric autogenerated by statsd_exporter.
# TYPE ratelimit_service_rate_limit_envoy_gateway_system_internal_https_httproute_envoydemo_envoydemo_rule_0_match_0_envoydemo_domain_com_httproute_envoydemo_envoydemo_rule_0_match_0_envoydemo_domain_com_masked_remote_address_0_0_0_0_0_remote_address_total_hits counter
ratelimit_service_rate_limit_envoy_gateway_system_internal_https_httproute_envoydemo_envoydemo_rule_0_match_0_envoydemo_domain_com_httproute_envoydemo_envoydemo_rule_0_match_0_envoydemo_domain_com_masked_remote_address_0_0_0_0_0_remote_address_total_hits 1
# HELP ratelimit_service_rate_limit_envoy_gateway_system_internal_https_httproute_envoydemo_envoydemo_rule_0_match_0_envoydemo_domain_com_httproute_envoydemo_envoydemo_rule_0_match_0_envoydemo_domain_com_masked_remote_address_0_0_0_0_0_remote_address_within_limit Metric autogenerated by statsd_exporter.
# TYPE ratelimit_service_rate_limit_envoy_gateway_system_internal_https_httproute_envoydemo_envoydemo_rule_0_match_0_envoydemo_domain_com_httproute_envoydemo_envoydemo_rule_0_match_0_envoydemo_domain_com_masked_remote_address_0_0_0_0_0_remote_address_within_limit counter
ratelimit_service_rate_limit_envoy_gateway_system_internal_https_httproute_envoydemo_envoydemo_rule_0_match_0_envoydemo_domain_com_httproute_envoydemo_envoydemo_rule_0_match_0_envoydemo_domain_com_masked_remote_address_0_0_0_0_0_remote_address_within_limit 1

Ratelimit /metrics with this patch

# HELP ratelimit_service_rate_limit_total_hits Metric autogenerated by statsd_exporter.
# TYPE ratelimit_service_rate_limit_total_hits counter
ratelimit_service_rate_limit_total_hits{domain="envoy-gateway-system/internal/https",key1="httproute/envoydemo/envoydemo/rule/0/match/0/envoydemo_domain_com",key2="masked_remote_address_0_0_0_0/0"} 3
# HELP ratelimit_service_rate_limit_within_limit Metric autogenerated by statsd_exporter.
# TYPE ratelimit_service_rate_limit_within_limit counter
ratelimit_service_rate_limit_within_limit{domain="envoy-gateway-system/internal/https",key1="httproute/envoydemo/envoydemo/rule/0/match/0/envoydemo_domain_com",key2="masked_remote_address_0_0_0_0/0"} 3
# HELP ratelimit_service_response_time_seconds Metric autogenerated by statsd_exporter.
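
To show why stray dots break the label extraction, here is a small sketch of a regex fallback in the spirit of item 3. The pattern, the `parseStat` helper, and the dotted stat-name layout (`ratelimit.service.rate_limit.<domain>.<key1>.<key2>.<counter>`) are assumptions inferred from the metric names above, not the mapping shipped with this patch.

```go
package main

import (
	"fmt"
	"regexp"
)

// statPattern is a hypothetical fallback mapping: three dot-free components
// (domain, key1, key2) followed by a known counter name.
var statPattern = regexp.MustCompile(
	`^ratelimit\.service\.rate_limit\.([^.]+)\.([^.]+)\.([^.]+)\.(total_hits|within_limit)$`)

// parseStat extracts the labels and counter name from a dotted stat name.
// ok is false when the name does not have exactly the expected components.
func parseStat(name string) (labels map[string]string, counter string, ok bool) {
	m := statPattern.FindStringSubmatch(name)
	if m == nil {
		return nil, "", false
	}
	return map[string]string{"domain": m[1], "key1": m[2], "key2": m[3]}, m[4], true
}

func main() {
	prefix := "ratelimit.service.rate_limit.envoy-gateway-system/internal/https." +
		"httproute/envoydemo/envoydemo/rule/0/match/0/envoydemo_domain_com."

	// Sanitized address: the name has the expected number of components.
	labels, counter, ok := parseStat(prefix + "masked_remote_address_0_0_0_0/0.total_hits")
	fmt.Println(labels["key2"], counter, ok)

	// Raw address: the extra dots add components, so the pattern does not match.
	_, _, ok = parseStat(prefix + "masked_remote_address_0.0.0.0/0.total_hits")
	fmt.Println(ok)
}
```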

arkodg (Contributor) commented Nov 26, 2024

cc @zirain

arkodg (Contributor) commented Nov 26, 2024

dup of #716 ?

rofafor (Author) commented Nov 26, 2024

> dup of #716 ?

Yes, I missed that one. The commit "chore: replace dots in ipv4 addresses with slashes" is pretty much the same as the mentioned PR, just with a slightly more generic regexp and without tests.

zirain (Contributor) commented Nov 27, 2024

I'd like to close my PR as this one is better.

rofafor (Author) commented Dec 5, 2024

Squashed the fixup commit for merging

arkodg (Contributor) commented Dec 9, 2024

hey @rofafor can we split up the fix into 2 PRs, and use this PR for fixing the SanitizeStatName ?
I'm trying to figure out why EG generates an identical key, value pair downstream

rofafor (Author) commented Dec 9, 2024

> hey @rofafor can we split up the fix into 2 PRs, and use this PR for fixing the SanitizeStatName ? I'm trying to figure out why EG generates an identical key, value pair downstream

Done by dropping the first commit. Do you want me to create another PR for the dropped identical-key-value commit, or will you track down the actual issue and provide a proper fix?

arkodg (Contributor) commented Dec 9, 2024

> > hey @rofafor can we split up the fix into 2 PRs, and use this PR for fixing the SanitizeStatName ? I'm trying to figure out why EG generates an identical key, value pair downstream
>
> Done by dropping the first commit. Do you want me to create another PR for the dropped identical-key-value commit, or will you track down the actual issue and provide a proper fix?

It would be great if you could create a GH issue for it at https://github.com/envoyproxy/gateway/issues

arkodg (Contributor) left a comment

LGTM thanks !

arkodg (Contributor) commented Dec 9, 2024

cc @collin-lee @psbrar99
