[DocDB] Pre-Aggregate Metrics for Faster Prometheus Metric Scraping #24565

yusong-yan · 2024-10-22T17:15:46Z

Description

The current metric scraping approach aggregates metrics during the scrape, which has caused performance bottlenecks, leading to frequent Prometheus target downtimes for customers. We observed metric scrape times exceeding 15 seconds with 4,000 tables and 18,000 tablets, with most of this time consumed by aggregating tablet metrics to the table level.

Issue Type

kind/enhancement

Warning: Please confirm that this issue does not contain any sensitive information

I confirm this issue does not contain any sensitive information.

Summary:

**Background**

The current metric scraping approach aggregates metrics during the scrape, which has caused performance bottlenecks, leading to frequent Prometheus target downtimes for customers. We observed metric scrape times exceeding 15 seconds with 4,000 tables and 18,000 tablets, with most of this time consumed by aggregating tablet metrics to the table level.

This update introduces pre-aggregation for tablet metrics that are summed, allowing metric scrapes to skip aggregation of most tablet metrics, significantly reducing the scrape duration.

**Key Aspects of Pre-Aggregation:**

Pre-aggregation is supported for tablet, xCluster, and CDCSDK metrics. Below, we use tablet metrics as an example:

1. Pre-Aggregation Setup: During metric creation, pre-aggregation is enabled based on the metric's entity type and aggregation function. Only tablet and stream metrics with a sum aggregation function are eligible for pre-aggregation; all other metrics are handled at scrape time.
2. Shared Atomic Variable: Pre-aggregation creates an atomic integer variable shared across all instances of the same metric within a table (or stream). Whenever a pre-aggregated metric value is updated, the shared atomic integer is updated by the same amount.
3. Metric Destruction Handling: When a pre-aggregated tablet metric object is destroyed (e.g., due to a tablet move), the shared aggregated value is decremented by the tablet metric value to maintain accuracy.

Contention concern for (2): With many concurrent read or write operations, contention may occur, because a thread updating a tablet metric value must compete with other threads updating the same table-level value. To verify the performance impact, I ran several Sysbench read-only and write-only workloads, which showed no noticeable impact. [[ https://docs.google.com/spreadsheets/d/1O-RtRWWLkZYNTnjeNWLvrocenIpMtwCb9urjtNkSR9c/edit | Link to results ]].

**New Metric Scraping Steps:**

1. Handling Non-Pre-Aggregated Metrics:
   * Metrics that need to be aggregated at scrape time are aggregated in this step.
   * Metrics that do not require aggregation are flushed directly in this step.
2. Flushing pre-aggregated metrics and scrape-time-aggregated metrics.

After these two phases complete, a separate asynchronous thread handles cleanup. This cleanup removes unreferenced metric values (e.g., when a table is removed and no tablets reference the shared aggregated value) and cleans up attributes associated with pre-aggregated values.

With these enhancements, scrape time has improved from 15 seconds to 2 seconds for 4,000 tables and 18,000 tablets.

**Other Changes:**

* Addressed a potential issue where aggregated metrics using the max aggregation function were assumed to always be greater than or equal to zero. Negative values are possible and are now handled correctly.
* Redesigned D35689: The aggregated metric now holds a shared_ptr to its prototype to ensure the OwningPrototype is not deleted before the flush operation.

Jira: DB-13599

Test Plan: Jenkins MetricsTest.AggregationTest

Reviewers: esheng, mlillibridge, rthallam, amitanand

Reviewed By: amitanand

Subscribers: amitanand, hsunder, yql, kannan, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D39667
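The pre-aggregation scheme described in the summary (a shared atomic per table and metric, updated on every tablet-level write and decremented when a tablet metric is destroyed) can be illustrated roughly as follows. This is a minimal sketch, not the code from D39667: `SharedSum`, `TableAggregates`, `PreAggregatedCounter`, and the metric name are hypothetical, and the registry is single-threaded for brevity.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical names throughout; this is not the YugabyteDB metrics API.

// Table-level running sum shared by every tablet-level instance of one metric.
using SharedSum = std::atomic<int64_t>;

// Registry keyed by (table id, metric name). Not thread-safe in this sketch;
// a real registry would need its own synchronization.
class TableAggregates {
 public:
  std::shared_ptr<SharedSum> GetOrCreate(const std::string& table_id,
                                         const std::string& metric_name) {
    auto& slot = sums_[{table_id, metric_name}];
    if (!slot) slot = std::make_shared<SharedSum>(0);
    return slot;
  }

  // Scrape path: one load per (table, metric) instead of a sum over tablets.
  int64_t Read(const std::string& table_id,
               const std::string& metric_name) const {
    auto it = sums_.find({table_id, metric_name});
    return it == sums_.end() ? 0 : it->second->load(std::memory_order_relaxed);
  }

 private:
  std::map<std::pair<std::string, std::string>, std::shared_ptr<SharedSum>>
      sums_;
};

// Tablet-level counter that mirrors every update into the table-level sum.
class PreAggregatedCounter {
 public:
  explicit PreAggregatedCounter(std::shared_ptr<SharedSum> table_sum)
      : table_sum_(std::move(table_sum)) {
    assert(table_sum_ != nullptr);
  }

  PreAggregatedCounter(const PreAggregatedCounter&) = delete;
  PreAggregatedCounter& operator=(const PreAggregatedCounter&) = delete;

  // On destruction (e.g. a tablet move), remove this tablet's contribution
  // so the table-level value stays accurate.
  ~PreAggregatedCounter() {
    table_sum_->fetch_sub(value_.load(std::memory_order_relaxed),
                          std::memory_order_relaxed);
  }

  // Update the tablet value and the shared table value by the same delta.
  void IncrementBy(int64_t delta) {
    value_.fetch_add(delta, std::memory_order_relaxed);
    table_sum_->fetch_add(delta, std::memory_order_relaxed);
  }

  int64_t value() const { return value_.load(std::memory_order_relaxed); }

 private:
  std::atomic<int64_t> value_{0};
  std::shared_ptr<SharedSum> table_sum_;
};
```

In this sketch the write path pays for one extra atomic add on a variable shared by all tablets of a table, which is the contention concern raised above. In exchange, the scrape path reads a single value per table and metric.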
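The max-aggregation issue listed under Other Changes is the familiar pitfall of seeding a running maximum with zero. The summary does not say how the fix was implemented, so the following is only an illustration of the failure mode and one possible remedy; both function names are hypothetical, and `std::optional` requires C++17.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <vector>

// Problematic variant: seeding with 0 silently assumes every value is >= 0,
// so an all-negative input reports 0 instead of its true maximum.
inline int64_t MaxAssumingNonNegative(const std::vector<int64_t>& values) {
  int64_t result = 0;
  for (int64_t v : values) result = std::max(result, v);
  return result;
}

// One possible remedy: use no seed at all, and report "no value" for an
// empty input rather than inventing a default.
inline std::optional<int64_t> MaxAggregate(const std::vector<int64_t>& values) {
  if (values.empty()) return std::nullopt;
  return *std::max_element(values.begin(), values.end());
}
```

For the input `{-5, -2, -9}`, the first function returns 0 while the second returns -2.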

yusong-yan added the area/docdb (YugabyteDB core features) and status/awaiting-triage (Issue awaiting triage) labels Oct 22, 2024

yusong-yan self-assigned this Oct 22, 2024

yugabyte-ci added the kind/enhancement (This is an enhancement of an existing feature) and priority/medium (Medium priority issue) labels Oct 22, 2024

yusong-yan mentioned this issue Oct 22, 2024

[DocDB] Prometheus Scrape Timeout Exceeds 15 Seconds with 18,000 Tablets #24405

Closed

1 task

yugabyte-ci removed the status/awaiting-triage (Issue awaiting triage) label Oct 23, 2024

yusong-yan changed the title ~~[DocDB] Optimize Metric Scraping to Reduce Unnecessary Allocations and Redundant Operations~~ [DocDB] Pre-Aggregate Pre-Aggregate Metrics for Faster Prometheus Metric Scraping Nov 21, 2024

yusong-yan mentioned this issue Dec 24, 2024

[DocDB] Fix unit test errors caused by inconsistent table id and table name #25423

Closed

1 task

rthallamko3 changed the title ~~[DocDB] Pre-Aggregate Pre-Aggregate Metrics for Faster Prometheus Metric Scraping~~ [DocDB] Pre-Aggregate Metrics for Faster Prometheus Metric Scraping Jan 2, 2025

yusong-yan mentioned this issue Jan 3, 2025

[DocDB] Add DCHECK to verify identical attributes for the same aggregation_id in metric pre-aggregation #25479

Open

1 task
