[DocDB] Pre-Aggregate Metrics for Faster Prometheus Metric Scraping #24565
Labels
area/docdb
YugabyteDB core features
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
Comments
yusong-yan
added
area/docdb
YugabyteDB core features
status/awaiting-triage
Issue awaiting triage
labels
Oct 22, 2024
yugabyte-ci
added
kind/enhancement
This is an enhancement of an existing feature
priority/medium
Medium priority issue
labels
Oct 22, 2024
yusong-yan
changed the title
[DocDB] Optimize Metric Scraping to Reduce Unnecessary Allocations and Redundant Operations
[DocDB] Pre-Aggregate Pre-Aggregate Metrics for Faster Prometheus Metric Scraping
Nov 21, 2024
yusong-yan
added a commit
that referenced
this issue
Dec 31, 2024
Summary:

**Background**

The current metric scraping approach aggregates metrics during the scrape, which has caused performance bottlenecks, leading to frequent Prometheus target downtimes for customers. We observed metric scrape times exceeding 15 seconds with 4,000 tables and 18,000 tablets, with most of this time consumed by aggregating tablet metrics to the table level.

This update introduces pre-aggregation for tablet metrics that are summed, allowing metric scrapes to skip aggregation of most tablet metrics, significantly reducing the scrape duration.

**Key Aspects of Pre-Aggregation:**

Pre-aggregation is supported for tablet, xCluster, and CDCSDK metrics. Below, we use tablet metrics as an example:

1. Pre-Aggregation Setup: During metric creation, pre-aggregation is enabled based on the metric's entity type and aggregation function. Only tablet and stream metrics with a sum aggregation function are eligible for pre-aggregation, while other metrics are handled at scrape time.
2. Shared Atomic Variable: Pre-aggregation creates an Atomic Integer variable shared across all instances of the same metric within a table (or stream). When a pre-aggregated metric value is updated, the shared Atomic Integer variable is also updated accordingly.
3. Metric Destruction Handling: When a pre-aggregated tablet metric object is destroyed (e.g., due to a tablet move), the shared aggregated value is decremented by the tablet metric value to maintain accuracy.

Contention concern for (2): With many concurrent read or write operations, contention may occur, as updating a tablet metric value must compete with other threads updating the same table-level value. To verify the performance impact, I ran several Sysbench read-only and write-only workloads, which showed no noticeable impact. [[ https://docs.google.com/spreadsheets/d/1O-RtRWWLkZYNTnjeNWLvrocenIpMtwCb9urjtNkSR9c/edit | Link to results ]].

**New Metric Scraping Steps:**

1. Handling Non-Pre-Aggregated Metrics:
   * Metrics that need to be aggregated at scrape time are aggregated in this step.
   * Metrics that do not require aggregation are flushed directly in this step.
2. Flushing pre-aggregated metrics and scrape-time-aggregated metrics.

After completing these two phases, a separate asynchronous thread handles cleanup. This cleanup removes unreferenced metric values (e.g., when a table is removed, and no tablets reference the shared aggregated value) and cleans up attributes associated with pre-aggregated values.

With these enhancements, scrape time has improved from 15 seconds to 2 seconds for 4,000 tables and 18,000 tablets.

**Other Changes:**

* Addressed a potential issue where aggregated metrics using the max aggregation function were assumed to always be greater than or equal to zero. However, negative values are possible and are now correctly handled.
* Redesigned D35689: The aggregated metric now holds a shared_ptr to its prototype to ensure the OwningPrototype is not deleted before the flush operation.

Jira: DB-13599

Test Plan: Jenkins MetricsTest.AggregationTest

Reviewers: esheng, mlillibridge, rthallam, amitanand

Reviewed By: amitanand

Subscribers: amitanand, hsunder, yql, kannan, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D39667
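For readers who want a concrete picture of items (2) and (3) in the commit summary, here is a minimal C++ sketch of a per-tablet counter that mirrors every update into a shared table-level atomic and removes its contribution on destruction. It is not the code from the commit: `SharedTotal`, `PreAggregatedCounter`, `IncrementBy`, and `ScrapeTableTotal` are hypothetical names chosen for illustration, and the relaxed memory ordering is an assumption about what a plain counter needs.

```cpp
#include <atomic>
#include <cstdint>
#include <memory>
#include <utility>

// Hypothetical: one table-level total, shared by every tablet's instance
// of the same metric within a table (or stream).
using SharedTotal = std::shared_ptr<std::atomic<int64_t>>;

// Hypothetical per-tablet counter for a metric whose aggregation is a sum.
class PreAggregatedCounter {
 public:
  explicit PreAggregatedCounter(SharedTotal table_total)
      : table_total_(std::move(table_total)) {}

  // Destruction handling: when the tablet metric goes away (e.g. the tablet
  // moves), subtract its value so the table-level total stays accurate.
  ~PreAggregatedCounter() {
    table_total_->fetch_sub(value_.load(std::memory_order_relaxed),
                            std::memory_order_relaxed);
  }

  PreAggregatedCounter(const PreAggregatedCounter&) = delete;
  PreAggregatedCounter& operator=(const PreAggregatedCounter&) = delete;

  // Every update touches both the tablet value and the shared total.
  // The second fetch_add is where threads updating different tablets of
  // the same table can contend.
  void IncrementBy(int64_t delta) {
    value_.fetch_add(delta, std::memory_order_relaxed);
    table_total_->fetch_add(delta, std::memory_order_relaxed);
  }

  int64_t value() const { return value_.load(std::memory_order_relaxed); }

 private:
  std::atomic<int64_t> value_{0};
  SharedTotal table_total_;
};

// At scrape time the table-level value is a single atomic load; no loop
// over the table's tablets is needed for this metric.
inline int64_t ScrapeTableTotal(const SharedTotal& total) {
  return total->load(std::memory_order_relaxed);
}
```

In this sketch, two counters built from the same `SharedTotal` and incremented by 5 and 7 yield a scraped total of 12; destroying the second counter brings the total back to 5. The trade-off matches the contention concern in the summary: each update pays for one extra atomic add so that the scrape avoids per-tablet aggregation.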
vaibhav-yb
pushed a commit
to vaibhav-yb/yugabyte-db
that referenced
this issue
Jan 2, 2025
rthallamko3
changed the title
[DocDB] Pre-Aggregate Pre-Aggregate Metrics for Faster Prometheus Metric Scraping
[DocDB] Pre-Aggregate Metrics for Faster Prometheus Metric Scraping
Jan 2, 2025
Jira Link: DB-13599
Description
The current metric scraping approach aggregates metrics during the scrape, which has caused performance bottlenecks, leading to frequent Prometheus target downtimes for customers. We observed metric scrape times exceeding 15 seconds with 4,000 tables and 18,000 tablets, with most of this time consumed by aggregating tablet metrics to the table level.
Issue Type
kind/enhancement